* Re: Ultra5 successful install - PGX64 issues
       [not found]         ` <de1ac156-926a-c187-f15e-2c3da9251c82@web.de>
@ 2018-04-15  8:34           ` Helge Deller
  2018-04-19 19:29             ` Frank Scheiner
  0 siblings, 1 reply; 15+ messages in thread
From: Helge Deller @ 2018-04-15  8:34 UTC (permalink / raw)
  To: Frank Scheiner, Dennis Clarke; +Cc: debian-hppa, linux-parisc

On 14.04.2018 20:13, Frank Scheiner wrote:
> On 04/14/2018 06:11 PM, Dennis Clarke wrote:
>> Really?  Well then .. let me see what I have that is ancient in the
>>   warehouse.
>>
>> How about PA-RISC?  I happen to have some superdomes kicking about but they require truely a ton of power to operate.
> 
> I assume hppa people in Debian (debian-hppa@l.d.o in CC) would appreciate testing on such gear.
> Not sure if those superdomes will work out of the box though. 

It would be really interesting to see whether Linux can boot on such machines.
If it doesn't, I'm pretty sure that I can finish the firmware support in Linux so that it can boot in a cell. For that I'd need access to such a machine via ssh (to an x86 machine for cross-compiling/tftpboot provisioning) and a serial port to the Superdome.

> I know from my own testing that the following "smaller" machines work with Debian GNU/Linux Sid for hppa:
> 
> * 712/80
> * c3700, c3750, J5600, rp2470
> * c8000, rp3440
> 
> Apart from the rp3440 - and maybe also the 712/80 which showed some issue with it's built-in NIC after netbooting the Linux kernel and the OS

What kind of problems?

> - all machines also work diskless, which could speed up testing for you and avoid a manual Debian installation - although this could still be interesting.

Helge

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Ultra5 successful install - PGX64 issues
  2018-04-15  8:34           ` Ultra5 successful install - PGX64 issues Helge Deller
@ 2018-04-19 19:29             ` Frank Scheiner
  2018-04-20  6:37               ` Helge Deller
  2018-04-20  9:24               ` Jeroen Roovers
  0 siblings, 2 replies; 15+ messages in thread
From: Frank Scheiner @ 2018-04-19 19:29 UTC (permalink / raw)
  To: Helge Deller; +Cc: debian-hppa, linux-parisc, Dennis Clarke

Hi,

and sorry for the delay, I was a little short of spare time this week. :-/

On 04/15/2018 10:34 AM, Helge Deller wrote:
> On 14.04.2018 20:13, Frank Scheiner wrote:
>> I know from my own testing that the following "smaller" machines work with Debian GNU/Linux Sid for hppa:
>>
>> * 712/80
>> * c3700, c3750, J5600, rp2470
>> * c8000, rp3440
>>
>> Apart from the rp3440 - and maybe also the 712/80 which showed some issue with it's built-in NIC after netbooting the Linux kernel and the OS
> 
> What kind of problems?

Unfortunately I don't seem to have made any notes about the issue with
the 712/80, so earlier this week I retried with the configuration I
assumed was causing it:

This configuration used a Debian Linux kernel 4.9.25-1
(4.9.0-3-parisc from 2017-05-02). When netbooting it, the machine seems
to lose contact with the NFS server shortly after login:

```
[...]
[  OK  ] Started Serial Getty on ttyS0.
[  OK  ] Started Getty on tty1.
[  OK  ] Reached target Login Prompts.

Debian GNU/Linux buster/sid hp-712 ttyS0

hp-712 login: root
Password:
Last login: Thu Sep 18 11:30:50 CET 1902 from 172.16.1.1 on pts/0
Linux hp-712 4.9.0-3-parisc #1 Debian 4.9.25-1 (2017-05-02) parisc

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.

[  232.973913] nfs: server 172.16.0.2 not responding, still trying
[  233.094265] nfs: server 172.16.0.2 not responding, still trying
[  233.205127] nfs: server 172.16.0.2 not responding, still trying
[  233.568429] nfs: server 172.16.0.2 not responding, still trying
[  233.692383] nfs: server 172.16.0.2 not responding, still trying
[  233.808818] nfs: server 172.16.0.2 not responding, still trying
[...]
[  235.179253] nfs: server 172.16.0.2 OK
[  235.251896] nfs: server 172.16.0.2 not responding, still trying
[...]
```

Although it seems to be able to reconnect from time to time, the machine 
is not accessible.
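
To put numbers on how long the client stays cut off, the "not responding" and "OK" messages can be paired up. The following is a minimal sketch, assuming the console log has been saved as text; the function name `outages` is made up for illustration, and because the log above is elided with `[...]` the resulting durations are only approximate.

```python
import re

# Matches lines such as "[  235.179253] nfs: server 172.16.0.2 OK".
EVENT = re.compile(r"^\[\s*(\d+\.\d+)\] nfs: server (\S+) (OK|not responding)")

def outages(lines):
    """Return a list of (start, end) timestamps, one per NFS stall."""
    spans = []
    start = None
    for line in lines:
        match = EVENT.match(line)
        if not match:
            continue  # ignore everything that is not an NFS status message
        stamp, state = float(match.group(1)), match.group(3)
        if state == "not responding" and start is None:
            start = stamp  # first complaint opens a stall
        elif state == "OK" and start is not None:
            spans.append((start, stamp))  # first "OK" closes it
            start = None
    return spans

log = """\
[  232.973913] nfs: server 172.16.0.2 not responding, still trying
[  233.094265] nfs: server 172.16.0.2 not responding, still trying
[  235.179253] nfs: server 172.16.0.2 OK
[  235.251896] nfs: server 172.16.0.2 not responding, still trying
"""
for begin, end in outages(log.splitlines()):
    print(f"stall from {begin:.3f} to {end:.3f} ({end - begin:.1f} s)")
```

A stall that is still open at the end of the log (like the last line of the sample) is deliberately not reported, since its length is unknown.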

Afterwards I found some older notes about this machine which mention no
issues during diskless operation with the very same configuration
(kernel and possibly also userland), which made me wonder whether there
might be an issue between the machine's built-in NIC and my 1000 Mbit
network switch. And indeed, when I connected a 100 Mbit network switch
between the 712/80 and the 1000 Mbit network switch, the issue seemed
to be gone and the machine stayed accessible.

But later this week I retried the 712/80 with the current Linux kernel
(4.15.x) and Debian userland and the issue hit me again, although much
later and despite the 100 Mbit network switch in between. Looking at the
switch, I could see that the collision indicator was active for the port
used by the 712/80. I then configured a single port of the 1000 Mbit
network switch to 10 Mbit full duplex and attached the 712/80 to it.
The issue again seemed to be gone, but trying to install a package or
update the package cache quickly triggered it again. Well, that's not
much of an issue, as I can do the package management for the 712/80
with another machine (e.g. c8000).

Also interesting are the kernel messages for 4.15.11. Please note the
time difference between "random: crng init done" and "Key type
asymmetric registered":

```
[    0.000000] Linux version 4.15.0-2-parisc (debian-kernel@lists.debian.org) (gcc version 7.3.0 (Debian 7.3.0-12)) #1 Debian 4.15.11-1 (2018-03-20)
[    0.000000] unwind_init: start = 0x1086e8b4, end = 0x108c5644, entries = 22233
[    0.000000] FP[0] enabled: Rev 1 Model 13
[    0.000000] The 32-bit Kernel has started...
[...]
[    9.919844] workingset: timestamp_bits=14 max_order=15 bucket_order=1
[   10.168866] zbud: loaded
[   56.112387] random: crng init done
[  433.392379] Key type asymmetric registered
[  433.445502] Asymmetric key parser 'x509' registered
[...]
[  544.565451] systemd[1]: Detected architecture parisc.

Welcome to Debian GNU/Linux buster/sid!
[...]
[  OK  ] Started Serial Getty on ttyS0.
[  OK  ] Started Getty on tty1.
[  OK  ] Reached target Login Prompts.

Debian GNU/Linux buster/sid hp-712 ttyS0

hp-712 login:

```

...On first try I assumed the machine or the kernel would hang, but no, 
it was still working all the time.
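
Long pauses like the one above (about 377 seconds between the two messages) are easy to miss in a full boot log. As a rough sketch, assuming the kernel messages have been captured to text, the largest jump between consecutive timestamps can be found like this; `largest_gap` is a hypothetical helper name, not an existing tool.

```python
import re

# Matches the "[  433.392379] message" prefix printed by the kernel.
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")

def largest_gap(lines):
    """Return (gap_seconds, earlier_message, later_message) for the
    biggest jump between consecutive timestamped lines, or None."""
    best = None
    previous = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue  # skip "[...]" elisions and untimestamped output
        stamp, message = float(match.group(1)), match.group(2)
        if previous is not None:
            gap = stamp - previous[0]
            if best is None or gap > best[0]:
                best = (gap, previous[1], message)
        previous = (stamp, message)
    return best

log = """\
[   10.168866] zbud: loaded
[   56.112387] random: crng init done
[  433.392379] Key type asymmetric registered
[  433.445502] Asymmetric key parser 'x509' registered
"""
gap, before, after = largest_gap(log.splitlines())
print(f"{gap:.1f} s between {before!r} and {after!r}")
```

Note that this only shows where the time went, not why; lines elided from the log could hide further messages inside the gap.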

Today I tested it again (with 4.15.11) and the issue this time hit me 
already during login, after I entered the username.

So I'm actually back where I started. :-(

I suspect that maybe the built-in 82596 NIC cannot cope with the amount 
of traffic that happens during diskless operation - although I then 
wonder why it doesn't have a problem during the TFTP operation to load 
the lifimage. Next thing I'll examine will be the parameters used for 
the NFS mount (especially for rsize and wsize) - if I can ever log in to
it again :-). And maybe a fan for the passive heat sink of the CPU which
gets quite hot during operation.
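
For checking which rsize and wsize values are actually in effect, the options column of `/proc/mounts` can be read on the running client. This is only a sketch: the helper name `nfs_mount_options` is made up, and the sample line below (export path, sizes, protocol) is hypothetical and not taken from the machine discussed here.

```python
def nfs_mount_options(mounts_text):
    """Map each NFS mount point to a dict of its mount options."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        options = {}
        for item in fields[3].split(","):
            key, _, value = item.partition("=")
            options[key] = value  # flags such as "rw" get an empty value
        result[fields[1]] = options
    return result

# Hypothetical sample line; the export path and sizes are invented.
sample = "172.16.0.2:/srv/nfs/hp-712 / nfs rw,vers=3,rsize=4096,wsize=4096,proto=udp 0 0\n"
for mount_point, opts in nfs_mount_options(sample).items():
    print(mount_point, "rsize =", opts.get("rsize"), "wsize =", opts.get("wsize"))
```

On a live system the input would be the contents of `/proc/mounts` instead of the sample string.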

Any suggestions on where else to look?

****

For the rp3440 I (also) have to retract my earlier statement as it looks 
like my second rp3440 actually **works** diskless. I have to retest with 
my first rp3440 (currently in storage) as it seems it behaves 
differently in this regard - or maybe I misconfigured something there in 
the past. I have to recheck.

But for my second rp3440 I still had to blacklist the `radeon` module to
achieve this, as otherwise the system (console) seems to crash shortly
before or just after the login prompt would have appeared. This is the
kernel command line I used, as configured with palo 1.99 and Linux 4.14.x:

```
Current command line:
0/vmlinux HOME=/ root=/dev/nfs ip=:::::enp32s2:dhcp modprobe.blacklist=radeon initrd=0/ramdisk TERM=vt102 console=ttyS0
  0: 0/vmlinux
  1: HOME=/
  2: root=/dev/nfs
  3: ip=:::::enp32s2:dhcp
  4: modprobe.blacklist=radeon
  5: initrd=0/ramdisk
  6: TERM=vt102
  7: console=ttyS0
```

Interestingly after upgrading all packages (obviously including palo) on 
the NFS root FS and building a new lifimage with Linux 4.15.x, 
blacklisting the radeon module seems to be no longer required. Not sure 
if this is due to palo 2.00 or Linux 4.15.x. Anyways the radeon module 
is no longer loaded automatically with this configuration.

****

So the rp3440, at least, can actually work diskless as well - good that
you asked, Helge. :-)

Cheers,
Frank

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Ultra5 successful install - PGX64 issues
  2018-04-19 19:29             ` Frank Scheiner
@ 2018-04-20  6:37               ` Helge Deller
  2018-04-21 19:12                 ` John David Anglin
  2018-04-20  9:24               ` Jeroen Roovers
  1 sibling, 1 reply; 15+ messages in thread
From: Helge Deller @ 2018-04-20  6:37 UTC (permalink / raw)
  To: Frank Scheiner; +Cc: debian-hppa, linux-parisc, Dennis Clarke

On 19.04.2018 21:29, Frank Scheiner wrote:
>>> Apart from the rp3440 - and maybe also the 712/80 which showed some issue with it's built-in NIC after netbooting the Linux kernel and the OS
>>
>> What kind of problems?
> 
> Unfortunately I seem to not have made any notes for the issue with the 712/80, so I retried with the assumed issue creating configuration earlier this week:
> 
> This configuration was using a Debian Linux kernel 4.9.25-1 (4.9.0-3-parisc from 2017-05-02). And when netbooting it, shortly after login the machine seems to loose contact to the NFS server:
> 
> ```
> [...]
> [  OK  ] Started Serial Getty on ttyS0.
> [  OK  ] Started Getty on tty1.
> [  OK  ] Reached target Login Prompts.
> 
> Debian GNU/Linux buster/sid hp-712 ttyS0
> 
> hp-712 login: root
> Password:
> Last login: Thu Sep 18 11:30:50 CET 1902 from 172.16.1.1 on pts/0
> Linux hp-712 4.9.0-3-parisc #1 Debian 4.9.25-1 (2017-05-02) parisc
> 
> The programs included with the Debian GNU/Linux system are free software;
> the exact distribution terms for each program are described in the
> individual files in /usr/share/doc/*/copyright.
> 
> Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
> permitted by applicable law.
> 
> [  232.973913] nfs: server 172.16.0.2 not responding, still trying
> [  233.094265] nfs: server 172.16.0.2 not responding, still trying
> [  233.205127] nfs: server 172.16.0.2 not responding, still trying
> [  233.568429] nfs: server 172.16.0.2 not responding, still trying
> [  233.692383] nfs: server 172.16.0.2 not responding, still trying
> [  233.808818] nfs: server 172.16.0.2 not responding, still trying
> [...]
> [  235.179253] nfs: server 172.16.0.2 OK
> [  235.251896] nfs: server 172.16.0.2 not responding, still trying
> [...]
> ```
> 
> Although it seems to be able to reconnect from time to time, the machine is not accessible.
> 
> Afterwards I found some older notes about this machine which mention
> no issues during diskless operation with the very same configuration
> (kernel and possibly also userland), which made me wonder, if there's
> maybe an issue between the machine's built-in NIC and my used 1000
> Mbit network switch. And indeed, when connecting another 100 Mbit
> network switch in between the 712/80 and the 1000 Mbit network switch
> the issue seemed to be gone and the machine stayed accessible .
> 
> But later this week I retried the 712/80 with the current Linux
> kernel (4.15.x) and Debian userland and the issue hit me again,
> although much later and despite the 100 Mbit network switch in
> between. Looking at it I could see that the collision indicator was
> active on the switch for the port used by the 712/80. I then
> configured a singular port of the 1000 Mbit network switch to 10 Mbit
> full duplex and attached the 712/80 to it. And then the issue again
> seemed to be gone. But trying to install a package or updating the
> package cache again quickly triggered it. Well that's not that of an
> issue, as I can do the package management for the 712/80 with another
> machine (e.g. c8000).
> 
> Also interesting, the kernel messages for 4.15.11, please notice the
> time difference between "random: crng init done" and "Key type
> asymmetric registered":
Seems to be a generic issue.
https://www.linuxquestions.org/questions/showthread.php?p=5803405#post5803405

My assumption is that the kernel waits until it has
enough randomness for the various encryption algorithms.

> 
> ```
> [    0.000000] Linux version 4.15.0-2-parisc (debian-kernel@lists.debian.org) (gcc version 7.3.0 (Debian 7.3.0-12)) #1 Debian 4.15.11-1 (2018-03-20)
> [    0.000000] unwind_init: start = 0x1086e8b4, end = 0x108c5644, entries = 22233
> [    0.000000] FP[0] enabled: Rev 1 Model 13
> [    0.000000] The 32-bit Kernel has started...
> [...]
> [    9.919844] workingset: timestamp_bits=14 max_order=15 bucket_order=1
> [   10.168866] zbud: loaded
> [   56.112387] random: crng init done
> [  433.392379] Key type asymmetric registered
> [  433.445502] Asymmetric key parser 'x509' registered
> [...]
> [  544.565451] systemd[1]: Detected architecture parisc.
> 
> Welcome to Debian GNU/Linux buster/sid!
> [...]
> [  OK  ] Started Serial Getty on ttyS0.
> [  OK  ] Started Getty on tty1.
> [  OK  ] Reached target Login Prompts.
> 
> Debian GNU/Linux buster/sid hp-712 ttyS0
> 
> hp-712 login:
> 
> ```
> 
> ...On first try I assumed the machine or the kernel would hang, but no, it was still working all the time.
> 
> Today I tested it again (with 4.15.11) and the issue this time hit me already during login, after I entered the username.
> 
> So I'm actually back at where I'm started. :-(
> 
> I suspect that maybe the built-in 82596 NIC cannot cope with the
> amount of traffic that happens during diskless operation - although I
> then wonder why it doesn't have a problem during the TFTP operation
> to load the lifimage.
When loading via TFTP not much traffic is generated.

> Next thing I'll examine will be the parameters used for the NFS mount
> (especially for rsize and wsize) - if I ever can login to it again
> :-). And maybe a fan for the passive heat sink of the CPU which gets
> quite hot during operation.
> 
> Any suggestions on where to look else?

Not really.


> 
> ****
> 
> For the rp3440 I (also) have to retract my earlier statement as it
> looks like my second rp3440 actually **works** diskless. I have to
> retest with my first rp3440 (currently in storage) as it seems it
> behaves differently in this regard - or maybe I misconfigured
> something there in the past. I have to recheck.
> 
> But for my second rp3440 I still had to blacklist the `radeon` module
> to achieve this, as otherwise the system (console) seems to crash
> shortly before the login prompt would have appeared or just after.
> This is my used kernel command line as configured with palo 1.99 and
> Linux 4.14.x:
> 
> ```
> Current command line:
> 0/vmlinux HOME=/ root=/dev/nfs ip=:::::enp32s2:dhcp modprobe.blacklist=radeon initrd=0/ramdisk TERM=vt102 console=ttyS0
>  0: 0/vmlinux
>  1: HOME=/
>  2: root=/dev/nfs
>  3: ip=:::::enp32s2:dhcp
>  4: modprobe.blacklist=radeon
>  5: initrd=0/ramdisk
>  6: TERM=vt102
>  7: console=ttyS0
> ```
> 
> Interestingly after upgrading all packages (obviously including palo)
> on the NFS root FS and building a new lifimage with Linux 4.15.x,
> blacklisting the radeon module seems to be no longer required. Not
> sure if this is due to palo 2.00 or Linux 4.15.x. Anyways the radeon
> module is no longer loaded automatically with this configuration.

There were two issues fixed regarding rp3440.
1. The radeon module on the management board is automatically
disabled by the Linux kernel. This fixes crashes/hangs.
2. The serial port on the management board is disabled by the
Linux kernel.
-> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bcf3f1752a622f1372d3252d0fea8855d89812e7

Older versions of palo tried to work around problem #2 by
passing the kernel parameter "console=ttyS1" to the Linux kernel when
booting.
So, since you upgraded both palo and the kernel, neither workaround is
necessary any longer, and rp-class machines should work without
any further quirks.

Helge

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Ultra5 successful install - PGX64 issues
  2018-04-19 19:29             ` Frank Scheiner
  2018-04-20  6:37               ` Helge Deller
@ 2018-04-20  9:24               ` Jeroen Roovers
  2018-04-21  0:22                 ` John David Anglin
  2018-04-22 19:17                 ` Frank Scheiner
  1 sibling, 2 replies; 15+ messages in thread
From: Jeroen Roovers @ 2018-04-20  9:24 UTC (permalink / raw)
  To: Frank Scheiner; +Cc: Helge Deller, debian-hppa, linux-parisc, Dennis Clarke

On Thu, 19 Apr 2018 21:29:45 +0200
Frank Scheiner <frank.scheiner@web.de> wrote:

> Afterwards I found some older notes about this machine which mention
> no issues during diskless operation with the very same configuration 
> (kernel and possibly also userland), which made me wonder, if there's 
> maybe an issue between the machine's built-in NIC and my used 1000
> Mbit network switch. And indeed, when connecting another 100 Mbit
> network switch in between the 712/80 and the 1000 Mbit network switch
> the issue seemed to be gone and the machine stayed accessible .
> 
> But later this week I retried the 712/80 with the current Linux
> kernel (4.15.x) and Debian userland and the issue hit me again,
> although much later and despite the 100 Mbit network switch in
> between. Looking at it I could see that the collision indicator was
> active on the switch for the port used by the 712/80. I then
> configured a singular port of the 1000 Mbit network switch to 10 Mbit
> full duplex and attached the 712/80 to it. And then the issue again
> seemed to be gone. But trying to install a package or updating the
> package cache again quickly triggered it.

You could try setting the internal NIC to half-duplex, or perhaps use a
(passive) 10BASE-T hub instead of a switch if you cannot configure that
internally, on the kernel command line, or doing it in userland is too
late.


Kind regards,
     jer

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Ultra5 successful install - PGX64 issues
  2018-04-20  9:24               ` Jeroen Roovers
@ 2018-04-21  0:22                 ` John David Anglin
  2018-04-22 19:17                   ` Frank Scheiner
  2018-04-22 19:17                 ` Frank Scheiner
  1 sibling, 1 reply; 15+ messages in thread
From: John David Anglin @ 2018-04-21  0:22 UTC (permalink / raw)
  To: Jeroen Roovers, Frank Scheiner
  Cc: Helge Deller, debian-hppa, linux-parisc, Dennis Clarke

On 2018-04-20 5:24 AM, Jeroen Roovers wrote:
>> But later this week I retried the 712/80 with the current Linux
>> kernel (4.15.x) and Debian userland and the issue hit me again,
>> although much later and despite the 100 Mbit network switch in
>> between. Looking at it I could see that the collision indicator was
>> active on the switch for the port used by the 712/80. I then
>> configured a singular port of the 1000 Mbit network switch to 10 Mbit
>> full duplex and attached the 712/80 to it. And then the issue again
>> seemed to be gone. But trying to install a package or updating the
>> package cache again quickly triggered it.
> You could try setting the internal NIC to half-duplex, or perhaps use a
> (passive) 10BASE-T hub instead of a switch if you cannot configure that
> internally, on the kernel command line, or doing it in userland is too
> late.
From the manual, it seems the 10BASE-T port is half duplex (CSMA/CD).
The MAU interface is definitely half duplex and the word duplex is not
mentioned in the manual.

The 10BASE-T port probably doesn't support auto negotiation, so you will
need to manually set the switch port to 10BASE-T half duplex if it
doesn't automatically configure to this mode when auto negotiation
fails.

Some switches support a half-duplex back pressure form of flow control.

Setting the switch port is probably easier than finding a passive
10BASE hub.
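
To compare the host side against the switch port, Linux exposes the link settings it knows about under `/sys/class/net/<interface>/`. The sketch below reads the `speed` and `duplex` attributes; whether the driver for the 712's built-in NIC reports them at all is an assumption that would need checking, and `eth0` is only a placeholder interface name.

```python
from pathlib import Path

def link_settings(interface, sysfs_root="/sys/class/net"):
    """Return (speed_mbit, duplex) as strings, or None where unavailable."""
    base = Path(sysfs_root) / interface
    values = []
    for attribute in ("speed", "duplex"):
        try:
            values.append((base / attribute).read_text().strip())
        except OSError:
            # Missing file, link down, or a driver that does not report it.
            values.append(None)
    return tuple(values)

if __name__ == "__main__":
    # "eth0" is a placeholder; substitute the interface name on your system.
    print(link_settings("eth0"))
```

A result of `(None, None)` does not tell you the mode; it only means the kernel could not report one for that interface.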

Dave

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Ultra5 successful install - PGX64 issues
  2018-04-20  6:37               ` Helge Deller
@ 2018-04-21 19:12                 ` John David Anglin
  2018-04-21 22:17                   ` Helge Deller
  0 siblings, 1 reply; 15+ messages in thread
From: John David Anglin @ 2018-04-21 19:12 UTC (permalink / raw)
  To: Helge Deller, Frank Scheiner; +Cc: debian-hppa, linux-parisc, Dennis Clarke

On 2018-04-20 2:37 AM, Helge Deller wrote:
>> Also interesting, the kernel messages for 4.15.11, please notice the
>> time difference between "random: crng init done" and "Key type
>> asymmetric registered":
> Seems to be a generic issue.
> https://www.linuxquestions.org/questions/showthread.php?p=5803405#post5803405
>
> My assumption is, that the kernel waits until it has
> enough randomness for the various encryption algorithms.
I think this is caused by cryptomgr_test.  It can be disabled with
"cryptomgr.notests" on the command line.

Dave

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Ultra5 successful install - PGX64 issues
  2018-04-21 19:12                 ` John David Anglin
@ 2018-04-21 22:17                   ` Helge Deller
  2018-04-21 22:36                     ` John David Anglin
  0 siblings, 1 reply; 15+ messages in thread
From: Helge Deller @ 2018-04-21 22:17 UTC (permalink / raw)
  To: John David Anglin, Frank Scheiner
  Cc: debian-hppa, linux-parisc, Dennis Clarke

On 21.04.2018 21:12, John David Anglin wrote:
> On 2018-04-20 2:37 AM, Helge Deller wrote:
>>> Also interesting, the kernel messages for 4.15.11, please notice the
>>> time difference between "random: crng init done" and "Key type
>>> asymmetric registered":
>> Seems to be a generic issue.
>> https://www.linuxquestions.org/questions/showthread.php?p=5803405#post5803405
>>
>> My assumption is, that the kernel waits until it has
>> enough randomness for the various encryption algorithms.

> I think this is caused by cryptomgr_test.
> It can be disabled with "cryptomgr.notests" on command line.

Did you test this?
Unless I typed it wrong, it didn't work on my B160L:
[    0.000000] Kernel command line: root=/dev/sda5 crpytomgr.notests HOME=/ console=ttyS0 TERM=vt102 palo_kernel=2/vmlinux
...
[   15.549370] workingset: timestamp_bits=14 max_order=15 bucket_order=1
[   15.688261] zbud: loaded
[   57.608154] random: crng init done
...long delay here...
[  207.522038] Key type asymmetric registered
[  207.574154] Asymmetric key parser 'x509' registered
[  207.635883] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 250)
[  207.729718] io scheduler noop registered

Helge

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Ultra5 successful install - PGX64 issues
  2018-04-21 22:17                   ` Helge Deller
@ 2018-04-21 22:36                     ` John David Anglin
  2018-04-22  9:06                       ` Helge Deller
  0 siblings, 1 reply; 15+ messages in thread
From: John David Anglin @ 2018-04-21 22:36 UTC (permalink / raw)
  To: Helge Deller, Frank Scheiner; +Cc: debian-hppa, linux-parisc, Dennis Clarke

On 2018-04-21 6:17 PM, Helge Deller wrote:
>> It can be disabled with "cryptomgr.notests" on command line.
> Did you tested this?
Not recently.  I found this when I was working on the cache.TLB patch.  
It caused a stall in one version.
> Unless I typed it wrong it didn't worked on my B160L:
> [    0.000000] Kernel command line: root=/dev/sda5 crpytomgr.notests HOME=/ console=ttyS0 TERM=vt102 palo_kernel=2/vmlinux
You typed it wrong.

Dave

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Ultra5 successful install - PGX64 issues
  2018-04-21 22:36                     ` John David Anglin
@ 2018-04-22  9:06                       ` Helge Deller
  2018-04-22 19:17                         ` Frank Scheiner
  0 siblings, 1 reply; 15+ messages in thread
From: Helge Deller @ 2018-04-22  9:06 UTC (permalink / raw)
  To: John David Anglin, Frank Scheiner
  Cc: debian-hppa, linux-parisc, Dennis Clarke

On 22.04.2018 00:36, John David Anglin wrote:
> On 2018-04-21 6:17 PM, Helge Deller wrote:
>>> It can be disabled with "cryptomgr.notests" on command line.
>> Did you tested this?
> Not recently.  I found this when I was working on the cache.TLB patch.  It caused a stall in one version.
>> Unless I typed it wrong it didn't worked on my B160L:
>> [    0.000000] Kernel command line: root=/dev/sda5 crpytomgr.notests HOME=/ console=ttyS0 TERM=vt102 palo_kernel=2/vmlinux

> You typed it wrong.

Yes, my fault.
"cryptomgr.notests" did work as expected.
Thanks!
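
As far as I know the kernel does not complain about parameters it doesn't recognize here, so a typo like the one above goes unnoticed. A small sketch of a check using Python's `difflib`, assuming a hand-maintained list of parameter names (the list and the function name `suspicious_params` are illustrative only):

```python
import difflib

# Parameter names mentioned in this thread; extend as needed.
KNOWN_PARAMS = ["cryptomgr.notests", "modprobe.blacklist", "console",
                "root", "initrd", "ip"]

def suspicious_params(cmdline, known=KNOWN_PARAMS, cutoff=0.8):
    """Return (given, suggestion) pairs for names that nearly match a known one."""
    hits = []
    for token in cmdline.split():
        name = token.split("=", 1)[0]
        if name in known:
            continue  # spelled correctly
        close = difflib.get_close_matches(name, known, n=1, cutoff=cutoff)
        if close:
            hits.append((name, close[0]))
    return hits

cmdline = ("root=/dev/sda5 crpytomgr.notests HOME=/ console=ttyS0 "
           "TERM=vt102 palo_kernel=2/vmlinux")
print(suspicious_params(cmdline))
```

The cutoff of 0.8 is a guess; a lower value catches more typos but also starts flagging unrelated parameters.
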
Helge
 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Ultra5 successful install - PGX64 issues
  2018-04-20  9:24               ` Jeroen Roovers
  2018-04-21  0:22                 ` John David Anglin
@ 2018-04-22 19:17                 ` Frank Scheiner
  1 sibling, 0 replies; 15+ messages in thread
From: Frank Scheiner @ 2018-04-22 19:17 UTC (permalink / raw)
  To: Jeroen Roovers; +Cc: Helge Deller, debian-hppa, linux-parisc, Dennis Clarke

On 04/20/2018 11:24 AM, Jeroen Roovers wrote:
> You could try setting the internal NIC to half-duplex, or perhaps use a
> (passive) 10BASE-T hub instead of a switch if you cannot configure that
> internally, on the kernel command line, or doing it in userland is too
> late.

I actually had the port configured to half-duplex at first. But I was 
distracted by the high number of collisions taking place and so changed 
it to full-duplex.

Cheers,
Frank

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Ultra5 successful install - PGX64 issues
  2018-04-21  0:22                 ` John David Anglin
@ 2018-04-22 19:17                   ` Frank Scheiner
  2018-04-22 20:10                     ` John David Anglin
  2018-04-23 13:38                     ` Frank Scheiner
  0 siblings, 2 replies; 15+ messages in thread
From: Frank Scheiner @ 2018-04-22 19:17 UTC (permalink / raw)
  To: John David Anglin, Jeroen Roovers
  Cc: Helge Deller, debian-hppa, linux-parisc, Dennis Clarke

On 04/21/2018 02:22 AM, John David Anglin wrote:
>  From the manual, it seems the 10BASE-T port is half duplex (CSMA/CD). 
> The MAU
> interface is definitely half duplex and the word duplex is not mentioned 
> in the manual.

I also didn't find any info about half-/full-duplex in the two manuals I 
have at hand for the 712/80 ("Service Handbook" and "Technical Reference 
Manual"). To be sure, which one did you consult?

> 
> The 10BASE-T port probably doesn't support auto negotiation, so you will 
> need to manually
> set the switch port to 10BASE-T half duplex if it doesn't automatically 
> configure to this mode
> when auto negotiation fails.

Did this at first but then went for full-duplex again. Today I started
with full-duplex and active cooling of the heatsink (now smoothed and
with fresh thermal grease applied) of the 712/80's processor, but that
alone didn't help. The issue hit me after entering the password during
login.

Then I reconfigured half-duplex and tried again. The machine now worked
through the whole login and I could also do an `apt update` without
issues afterwards. Then I left it alone for about twenty minutes and on
return I did an `apt list --upgradable`, which triggered the issue again.

:-/

> 
> Some switches support a half-duplex back pressure form of flow control.

I'll try that now. According to the documentation my switch can create
back-pressure as a form of flow control.

Cheers,
Frank

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Ultra5 successful install - PGX64 issues
  2018-04-22  9:06                       ` Helge Deller
@ 2018-04-22 19:17                         ` Frank Scheiner
  0 siblings, 0 replies; 15+ messages in thread
From: Frank Scheiner @ 2018-04-22 19:17 UTC (permalink / raw)
  To: Helge Deller, John David Anglin; +Cc: debian-hppa, linux-parisc, Dennis Clarke

On 04/22/2018 11:06 AM, Helge Deller wrote:
> On 22.04.2018 00:36, John David Anglin wrote:
>> On 2018-04-21 6:17 PM, Helge Deller wrote:
>>>> It can be disabled with "cryptomgr.notests" on command line.
>>> Did you tested this?
>> Not recently.  I found this when I was working on the cache.TLB patch.  It caused a stall in one version.
>>> Unless I typed it wrong it didn't worked on my B160L:
>>> [    0.000000] Kernel command line: root=/dev/sda5 crpytomgr.notests HOME=/ console=ttyS0 TERM=vt102 palo_kernel=2/vmlinux
> 
>> You typed it wrong.
> 
> Yes, my fault.
> "cryptomgr.notests" did worked as expected.

Great, that's pretty useful for slower machines like the 712/80.

Cheers,
Frank

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Ultra5 successful install - PGX64 issues
  2018-04-22 19:17                   ` Frank Scheiner
@ 2018-04-22 20:10                     ` John David Anglin
  2018-04-23 13:38                     ` Frank Scheiner
  1 sibling, 0 replies; 15+ messages in thread
From: John David Anglin @ 2018-04-22 20:10 UTC (permalink / raw)
  To: Frank Scheiner, Jeroen Roovers
  Cc: Helge Deller, debian-hppa, linux-parisc, Dennis Clarke

On 2018-04-22 3:17 PM, Frank Scheiner wrote:
> On 04/21/2018 02:22 AM, John David Anglin wrote:
>>  From the manual, it seems the 10BASE-T port is half duplex 
>> (CSMA/CD). The MAU
>> interface is definitely half duplex and the word duplex is not 
>> mentioned in the manual.
>
> I also didn't find any info about half-/full-duplex in the two manuals 
> I have at hand for the 712/80 ("Service Handbook" and "Technical 
> Reference Manual"). To be sure, which one did you consult?
I looked at the "Technical Reference".
>
>>
>> The 10BASE-T port probably doesn't support auto negotiation, so you 
>> will need to manually
>> set the switch port to 10BASE-T half duplex if it doesn't 
>> automatically configure to this mode
>> when auto negotiation fails.
>
> Did this at first but then went for full-duplex again. Today I started 
> with full-duplex and actively cooling the heatsink (now smoothed and 
> with fresh thermal grease applied) of the 712/80's processor, but that 
> didn't help alone. The issue hit me after entering the password during 
> login.
>
> Then I reconfigured half-duplex and tried again. The machine now 
> worked through the whole login and I could also do an `apt update` 
> without issues afterwards. Then I let it alone for about twenty 
> minutes and on return I did an `apt list --upgradable` which triggered 
> the issue again.
Seems like a hardware problem, probably in the 712.  The switch and the
712 need to be in the same mode.  If my supposition about the 712 only
supporting half duplex is correct, then the switch will have to be in
half duplex.  I think network boot and `apt update` would be a
sufficient test of the network configuration.  Without error messages,
this is hard.
>
> :-/
>
>>
>> Some switches support a half-duplex back pressure form of flow control.
>
> I'll try that now. According to the documentation my switch can create 
> back-pressure as form of flow control.
It's possible flow control on the server port might help given that the
712 is so slow and probably needs half duplex.  The switch might drop
packets as a result.  However, IP usually adjusts for slow segments.

Dave

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Ultra5 successful install - PGX64 issues
  2018-04-22 19:17                   ` Frank Scheiner
  2018-04-22 20:10                     ` John David Anglin
@ 2018-04-23 13:38                     ` Frank Scheiner
  1 sibling, 0 replies; 15+ messages in thread
From: Frank Scheiner @ 2018-04-23 13:38 UTC (permalink / raw)
  To: John David Anglin, Jeroen Roovers; +Cc: Helge Deller, debian-hppa, linux-parisc



On 04/22/2018 09:17 PM, Frank Scheiner wrote:
>> Some switches support a half-duplex back pressure form of flow control.
> 
> I'll try that now. According to the documentation my switch can create 
> back-pressure as form of flow control.

Yesterday, after I activated flow control on the switch, the 712/80 came 
back after a while and finished the `apt list --upgradable` command with 
output. In between, systemd's journald crashed and restarted. 
Reissuing the same `apt [...]` command worked without problems. On the 
switch's port summary I could now also see that the host acting as 
NFS server received pause frames sent by the switch, so the flow 
control is working.

I then tried to install `joe`, and when `update-alternatives` started, the 
712/80 again lost the connection to the NFS server. :-( It didn't recover 
from that, at least not during the time I waited for it, so I powered the 
712/80 down.

I thought maybe switching back to System V init might ease the load a 
little bit for the 712/80, so I upgraded the file system with a c8000 
(incl. newer patch level for the kernel) and removed systemd afterwards 
(also from initramfs).

I then ran some benchmarks without any issues in between.

Today I still have the problems described in [1] when doing `apt install [...]` 
or `apt remove [...]`, but so far the 712/80 has recovered each time after a 
while, so it looks like an improvement to me. Look at the timings for 
`apt remove [...]`:

```
root@hp-712:~# time apt remove -y joe
[ 8794.150750] nfs: server 172.16.0.2 not responding, still trying
[...]
[ 8794.962227] nfs: server 172.16.0.2 not responding, still trying
[ 8795.074226] nfs: server 172.16.0.2 OK
[...]
[ 8797.271834] nfs: server 172.16.0.2 OK
[ 8802.242312] nfs: server 172.16.0.2 not responding, still trying
[...]
[ 9235.937478] nfs: server 172.16.0.2 OK
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages will be REMOVED:
   joe
0 upgraded, 0 newly installed, 1 to remove and 0 not upgraded.
After this operation, 2,086 kB disk space will be freed.
(Reading database ... 41128 files and directories currently installed.)
Removing joe (4.6-1) ...
update-alternatives: using /usr/bin/jmacs to provide /usr/bin/editor (editor) in auto mode
update-alternatives: using /usr/bin/jpico to provide /usr/bin/editor (editor) in auto mode
update-alternatives: using /bin/nano to provide /usr/bin/editor (editor) in auto mode
Processing triggers for mime-support (3.60) ...
[ 9357.992385] nfs: server 172.16.0.2 not responding, still trying
[...]
[10055.370493] nfs: server 172.16.0.2 not responding, still trying
[10055.709731] nfs: server 172.16.0.2 OK
[...]
[10057.212469] nfs: server 172.16.0.2 OK

real    22m0.853s
user    1m3.264s
sys     0m43.875s
```

...the `apt install -y joe` done beforehand took about 41 minutes. So 
the 712/80 can recover from the described problems, but package 
management should really be done from a more powerful machine, at least 
when running diskless.
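
To get a rough idea of how much of that wall-clock time was spent waiting on
the NFS server, the kernel timestamps can be paired up. This is only a sketch
in Python, not a proper tool: the function name is made up for illustration,
and because the `[...]` parts of the log are elided, any recovery hidden in
them is missed, so the result is an upper bound on the stalled time.

```python
import re

# Matches kernel lines such as:
# [ 8794.150750] nfs: server 172.16.0.2 not responding, still trying
NFS_LINE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+nfs: server (\S+) "
    r"(not responding, still trying|OK)\s*$"
)


def stalled_seconds(log_text):
    """Sum the time from each first 'not responding' to the next 'OK'."""
    total = 0.0
    stall_start = None
    for line in log_text.splitlines():
        m = NFS_LINE.match(line.strip())
        if not m:
            continue  # skip apt output, elisions, and anything else
        stamp, state = float(m.group(1)), m.group(3)
        if state == "OK":
            if stall_start is not None:
                total += stamp - stall_start
                stall_start = None
        elif stall_start is None:
            stall_start = stamp  # only the first message opens a stall
    return total


# Only the kernel lines visible in the log above.
SAMPLE = """\
[ 8794.150750] nfs: server 172.16.0.2 not responding, still trying
[ 8794.962227] nfs: server 172.16.0.2 not responding, still trying
[ 8795.074226] nfs: server 172.16.0.2 OK
[ 8797.271834] nfs: server 172.16.0.2 OK
[ 8802.242312] nfs: server 172.16.0.2 not responding, still trying
[ 9235.937478] nfs: server 172.16.0.2 OK
[ 9357.992385] nfs: server 172.16.0.2 not responding, still trying
[10055.370493] nfs: server 172.16.0.2 not responding, still trying
[10055.709731] nfs: server 172.16.0.2 OK
[10057.212469] nfs: server 172.16.0.2 OK
"""

if __name__ == "__main__":
    print(f"{stalled_seconds(SAMPLE):.1f} s stalled")
```

On the visible lines this comes out at roughly 1132 s, so on the order of 19
of the 22 minutes may have been spent with the NFS server marked as not
responding.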

[1]: https://lists.debian.org/debian-hppa/2018/04/msg00007.html

Cheers,
Frank


* Ultra5 successful install - PGX64 issues
@ 2018-04-08  9:52 Phillip Stevens
  0 siblings, 0 replies; 15+ messages in thread
From: Phillip Stevens @ 2018-04-08  9:52 UTC (permalink / raw)
  To: sparclinux

With reference to my post on the debian-sparc list,
the advice was to request help here.

I used the April 4 ISO to successfully migrate my Ultra5 to Sparc64.
It should show up in popcon.

However, there remain issues in getting an X desktop running.
I'm looking for any hints on getting a PGX64 ATI card running on this
platform, please.

I think the mach64 module is the right way to proceed, but there are
references to the same error I'm facing, going back over 10 years.

Notes.

The radeon driver causes an error unless modeset is disabled with the
following configuration.
In: /etc/modprobe.d/local.conf
options libata dma=0
options radeon modeset=0

The xserver-xorg-video-mach64 package is not installed by default, yet
it seems to be required for the ATI video hardware.

Another module that is not available in sparc64 is the
xserver-xorg-video-sunffb package.
https://packages.debian.org/wheezy/sparc/xserver-xorg-video-sunffb/download
It was last available for wheezy.

There are several references to the failure of mach64 module to map
mmio aperture arising from the implementation of security mode in the
kernel.
In: /var/log/Xorg.0.log
[    84.251] (II) MACH64(0): Creating default Display subsection in Screen section "Default Screen Section" for depth/fbbpp 24/32
[    84.251] (==) MACH64(0): Depth 24, (--) framebuffer bpp 32
[    84.252] (==) MACH64(0): Using XAA acceleration architecture
[    84.252] (EE) Unable to map mmio aperture. Invalid argument (22)
[    84.252] (WW) MACH64: Mach64 in slot 2:1:0 could not be detected!
[    84.252] (II) UnloadModule: "mach64"
[    84.253] (EE) Screen(s) found, but none have a usable configuration.
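
For anyone comparing logs, the error and warning lines can be pulled out of
Xorg.0.log with a few lines of Python. This is just a sketch: the function
name is made up for illustration, and it assumes the timestamped line format
shown above, with (EE) marking errors and (WW) marking warnings.

```python
import re

# "[    84.252] (EE) message" -> timestamp, marker, message
MARKER = re.compile(r"^\[\s*(\d+\.\d+)\]\s+\((EE|WW)\)\s+(.*)$")


def problems(log_text):
    """Return (timestamp, marker, message) for every (EE)/(WW) line."""
    found = []
    for line in log_text.splitlines():
        m = MARKER.match(line)
        if m:
            found.append((float(m.group(1)), m.group(2), m.group(3)))
    return found


# The last four lines of the log excerpt above.
SAMPLE = """\
[    84.252] (EE) Unable to map mmio aperture. Invalid argument (22)
[    84.252] (WW) MACH64: Mach64 in slot 2:1:0 could not be detected!
[    84.252] (II) UnloadModule: "mach64"
[    84.253] (EE) Screen(s) found, but none have a usable configuration.
"""

if __name__ == "__main__":
    for stamp, marker, message in problems(SAMPLE):
        print(f"{stamp:10.3f} {marker} {message}")
```

In a real run the text would come from /var/log/Xorg.0.log instead of the
sample string.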

Regards, Phillip


Thread overview: 15+ messages
     [not found] <CAFC0-0XVQUVHmM+TT4qUK_djRGx-eV+d=5FTpGyjRR40mwAv=Q@mail.gmail.com>
     [not found] ` <6a4e3490-dd3e-5832-43be-dba8211ce6e4@physik.fu-berlin.de>
     [not found]   ` <dd60b1eb-595d-f208-2885-c958a5ff3a4c@blastwave.org>
     [not found]     ` <16600fdc-6bba-1e2d-f106-560b8ea366c8@physik.fu-berlin.de>
     [not found]       ` <9671ecea-1353-df1c-ebbd-24a5b7d3008b@oetec.com>
     [not found]         ` <de1ac156-926a-c187-f15e-2c3da9251c82@web.de>
2018-04-15  8:34           ` Ultra5 successful install - PGX64 issues Helge Deller
2018-04-19 19:29             ` Frank Scheiner
2018-04-20  6:37               ` Helge Deller
2018-04-21 19:12                 ` John David Anglin
2018-04-21 22:17                   ` Helge Deller
2018-04-21 22:36                     ` John David Anglin
2018-04-22  9:06                       ` Helge Deller
2018-04-22 19:17                         ` Frank Scheiner
2018-04-20  9:24               ` Jeroen Roovers
2018-04-21  0:22                 ` John David Anglin
2018-04-22 19:17                   ` Frank Scheiner
2018-04-22 20:10                     ` John David Anglin
2018-04-23 13:38                     ` Frank Scheiner
2018-04-22 19:17                 ` Frank Scheiner
2018-04-08  9:52 Phillip Stevens
