* [PATCH v2 0/5] Use more asm-generic headers
@ 2012-08-05 1:23 Rob Herring
2012-08-05 1:23 ` [PATCH v2 1/5] ARM: use generic version of identical asm headers Rob Herring
` (6 more replies)
0 siblings, 7 replies; 46+ messages in thread
From: Rob Herring @ 2012-08-05 1:23 UTC (permalink / raw)
To: linux-arm-kernel
From: Rob Herring <rob.herring@calxeda.com>
Inspired by the AArgh64 claim that it should be separate from ARM and one
reason was being able to use more asm-generic headers. Doing a diff of
arch/arm/include/asm and include/asm-generic there are numerous asm
headers which are functionally identical to their asm-generic counterparts
and others which can use more of the generic versions.
V2 separates unaligned.h and termios.h changes to separate patches. Also,
it adds a fix for build failures on non-gzip decompressor compile.
Rob
Rob Herring (5):
ARM: use generic version of identical asm headers
ARM: add strstr declaration for decompressors
ARM: use generic unaligned.h
ARM: use generic termios.h
ARM: convert asm/irqflags.h to use asm-generic/irqflags.h
arch/arm/boot/compressed/decompress.c | 3 +
arch/arm/include/asm/Kbuild | 17 +++
arch/arm/include/asm/current.h | 15 ---
arch/arm/include/asm/exec.h | 6 -
arch/arm/include/asm/ipcbuf.h | 1 -
arch/arm/include/asm/irqflags.h | 131 ++++++++--------------
arch/arm/include/asm/msgbuf.h | 31 ------
arch/arm/include/asm/param.h | 31 ------
arch/arm/include/asm/parport.h | 18 ---
arch/arm/include/asm/segment.h | 11 --
arch/arm/include/asm/sembuf.h | 25 -----
arch/arm/include/asm/serial.h | 19 ----
arch/arm/include/asm/shmbuf.h | 42 -------
arch/arm/include/asm/socket.h | 72 ------------
arch/arm/include/asm/sockios.h | 13 ---
arch/arm/include/asm/termbits.h | 198 ---------------------------------
arch/arm/include/asm/termios.h | 92 ---------------
arch/arm/include/asm/timex.h | 6 +-
arch/arm/include/asm/types.h | 16 ---
arch/arm/include/asm/unaligned.h | 19 ----
20 files changed, 67 insertions(+), 699 deletions(-)
delete mode 100644 arch/arm/include/asm/current.h
delete mode 100644 arch/arm/include/asm/exec.h
delete mode 100644 arch/arm/include/asm/ipcbuf.h
delete mode 100644 arch/arm/include/asm/msgbuf.h
delete mode 100644 arch/arm/include/asm/param.h
delete mode 100644 arch/arm/include/asm/parport.h
delete mode 100644 arch/arm/include/asm/segment.h
delete mode 100644 arch/arm/include/asm/sembuf.h
delete mode 100644 arch/arm/include/asm/serial.h
delete mode 100644 arch/arm/include/asm/shmbuf.h
delete mode 100644 arch/arm/include/asm/socket.h
delete mode 100644 arch/arm/include/asm/sockios.h
delete mode 100644 arch/arm/include/asm/termbits.h
delete mode 100644 arch/arm/include/asm/termios.h
delete mode 100644 arch/arm/include/asm/types.h
delete mode 100644 arch/arm/include/asm/unaligned.h
--
1.7.9.5
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH v2 1/5] ARM: use generic version of identical asm headers
2012-08-05 1:23 [PATCH v2 0/5] Use more asm-generic headers Rob Herring
@ 2012-08-05 1:23 ` Rob Herring
2012-08-05 1:23 ` [PATCH v2 2/5] ARM: add strstr declaration for decompressors Rob Herring
` (5 subsequent siblings)
6 siblings, 0 replies; 46+ messages in thread
From: Rob Herring @ 2012-08-05 1:23 UTC (permalink / raw)
To: linux-arm-kernel
From: Rob Herring <rob.herring@calxeda.com>
Inspired by the AArgh64 claim that it should be separate from ARM and one
reason was being able to use more asm-generic headers. Doing a diff of
arch/arm/include/asm and include/asm-generic there are numerous asm
headers which are functionally identical to their asm-generic counterparts.
Delete the ARM version and use the generic ones.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
---
arch/arm/include/asm/Kbuild | 15 +++
arch/arm/include/asm/current.h | 15 ---
arch/arm/include/asm/exec.h | 6 --
arch/arm/include/asm/ipcbuf.h | 1 -
arch/arm/include/asm/msgbuf.h | 31 ------
arch/arm/include/asm/param.h | 31 ------
arch/arm/include/asm/parport.h | 18 ----
arch/arm/include/asm/segment.h | 11 ---
arch/arm/include/asm/sembuf.h | 25 -----
arch/arm/include/asm/serial.h | 19 ----
arch/arm/include/asm/shmbuf.h | 42 ---------
arch/arm/include/asm/socket.h | 72 --------------
arch/arm/include/asm/sockios.h | 13 ---
arch/arm/include/asm/termbits.h | 198 ---------------------------------------
arch/arm/include/asm/timex.h | 6 +-
arch/arm/include/asm/types.h | 16 ----
16 files changed, 17 insertions(+), 502 deletions(-)
delete mode 100644 arch/arm/include/asm/current.h
delete mode 100644 arch/arm/include/asm/exec.h
delete mode 100644 arch/arm/include/asm/ipcbuf.h
delete mode 100644 arch/arm/include/asm/msgbuf.h
delete mode 100644 arch/arm/include/asm/param.h
delete mode 100644 arch/arm/include/asm/parport.h
delete mode 100644 arch/arm/include/asm/segment.h
delete mode 100644 arch/arm/include/asm/sembuf.h
delete mode 100644 arch/arm/include/asm/serial.h
delete mode 100644 arch/arm/include/asm/shmbuf.h
delete mode 100644 arch/arm/include/asm/socket.h
delete mode 100644 arch/arm/include/asm/sockios.h
delete mode 100644 arch/arm/include/asm/termbits.h
delete mode 100644 arch/arm/include/asm/types.h
diff --git a/arch/arm/include/asm/Kbuild b/arch/arm/include/asm/Kbuild
index 960abce..6cc8a13 100644
--- a/arch/arm/include/asm/Kbuild
+++ b/arch/arm/include/asm/Kbuild
@@ -5,16 +5,31 @@ header-y += hwcap.h
generic-y += auxvec.h
generic-y += bitsperlong.h
generic-y += cputime.h
+generic-y += current.h
generic-y += emergency-restart.h
generic-y += errno.h
+generic-y += exec.h
generic-y += ioctl.h
+generic-y += ipcbuf.h
generic-y += irq_regs.h
generic-y += kdebug.h
generic-y += local.h
generic-y += local64.h
+generic-y += msgbuf.h
+generic-y += param.h
+generic-y += parport.h
generic-y += percpu.h
generic-y += poll.h
generic-y += resource.h
generic-y += sections.h
+generic-y += segment.h
+generic-y += sembuf.h
+generic-y += serial.h
+generic-y += shmbuf.h
generic-y += siginfo.h
generic-y += sizes.h
+generic-y += socket.h
+generic-y += sockios.h
+generic-y += termbits.h
+generic-y += timex.h
+generic-y += types.h
diff --git a/arch/arm/include/asm/current.h b/arch/arm/include/asm/current.h
deleted file mode 100644
index 75d21e2..0000000
--- a/arch/arm/include/asm/current.h
+++ /dev/null
@@ -1,15 +0,0 @@
-#ifndef _ASMARM_CURRENT_H
-#define _ASMARM_CURRENT_H
-
-#include <linux/thread_info.h>
-
-static inline struct task_struct *get_current(void) __attribute_const__;
-
-static inline struct task_struct *get_current(void)
-{
- return current_thread_info()->task;
-}
-
-#define current (get_current())
-
-#endif /* _ASMARM_CURRENT_H */
diff --git a/arch/arm/include/asm/exec.h b/arch/arm/include/asm/exec.h
deleted file mode 100644
index 7c4fbef..0000000
--- a/arch/arm/include/asm/exec.h
+++ /dev/null
@@ -1,6 +0,0 @@
-#ifndef __ASM_ARM_EXEC_H
-#define __ASM_ARM_EXEC_H
-
-#define arch_align_stack(x) (x)
-
-#endif /* __ASM_ARM_EXEC_H */
diff --git a/arch/arm/include/asm/ipcbuf.h b/arch/arm/include/asm/ipcbuf.h
deleted file mode 100644
index 84c7e51..0000000
--- a/arch/arm/include/asm/ipcbuf.h
+++ /dev/null
@@ -1 +0,0 @@
-#include <asm-generic/ipcbuf.h>
diff --git a/arch/arm/include/asm/msgbuf.h b/arch/arm/include/asm/msgbuf.h
deleted file mode 100644
index 33b35b9..0000000
--- a/arch/arm/include/asm/msgbuf.h
+++ /dev/null
@@ -1,31 +0,0 @@
-#ifndef _ASMARM_MSGBUF_H
-#define _ASMARM_MSGBUF_H
-
-/*
- * The msqid64_ds structure for arm architecture.
- * Note extra padding because this structure is passed back and forth
- * between kernel and user space.
- *
- * Pad space is left for:
- * - 64-bit time_t to solve y2038 problem
- * - 2 miscellaneous 32-bit values
- */
-
-struct msqid64_ds {
- struct ipc64_perm msg_perm;
- __kernel_time_t msg_stime; /* last msgsnd time */
- unsigned long __unused1;
- __kernel_time_t msg_rtime; /* last msgrcv time */
- unsigned long __unused2;
- __kernel_time_t msg_ctime; /* last change time */
- unsigned long __unused3;
- unsigned long msg_cbytes; /* current number of bytes on queue */
- unsigned long msg_qnum; /* number of messages in queue */
- unsigned long msg_qbytes; /* max number of bytes on queue */
- __kernel_pid_t msg_lspid; /* pid of last msgsnd */
- __kernel_pid_t msg_lrpid; /* last receive pid */
- unsigned long __unused4;
- unsigned long __unused5;
-};
-
-#endif /* _ASMARM_MSGBUF_H */
diff --git a/arch/arm/include/asm/param.h b/arch/arm/include/asm/param.h
deleted file mode 100644
index 8b24bf94..0000000
--- a/arch/arm/include/asm/param.h
+++ /dev/null
@@ -1,31 +0,0 @@
-/*
- * arch/arm/include/asm/param.h
- *
- * Copyright (C) 1995-1999 Russell King
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- */
-#ifndef __ASM_PARAM_H
-#define __ASM_PARAM_H
-
-#ifdef __KERNEL__
-# define HZ CONFIG_HZ /* Internal kernel timer frequency */
-# define USER_HZ 100 /* User interfaces are in "ticks" */
-# define CLOCKS_PER_SEC (USER_HZ) /* like times() */
-#else
-# define HZ 100
-#endif
-
-#define EXEC_PAGESIZE 4096
-
-#ifndef NOGROUP
-#define NOGROUP (-1)
-#endif
-
-/* max length of hostname */
-#define MAXHOSTNAMELEN 64
-
-#endif
-
diff --git a/arch/arm/include/asm/parport.h b/arch/arm/include/asm/parport.h
deleted file mode 100644
index 26e94b0..0000000
--- a/arch/arm/include/asm/parport.h
+++ /dev/null
@@ -1,18 +0,0 @@
-/*
- * arch/arm/include/asm/parport.h: ARM-specific parport initialisation
- *
- * Copyright (C) 1999, 2000 Tim Waugh <tim@cyberelk.demon.co.uk>
- *
- * This file should only be included by drivers/parport/parport_pc.c.
- */
-
-#ifndef __ASMARM_PARPORT_H
-#define __ASMARM_PARPORT_H
-
-static int __devinit parport_pc_find_isa_ports (int autoirq, int autodma);
-static int __devinit parport_pc_find_nonpci_ports (int autoirq, int autodma)
-{
- return parport_pc_find_isa_ports (autoirq, autodma);
-}
-
-#endif /* !(_ASMARM_PARPORT_H) */
diff --git a/arch/arm/include/asm/segment.h b/arch/arm/include/asm/segment.h
deleted file mode 100644
index 9e24c21..0000000
--- a/arch/arm/include/asm/segment.h
+++ /dev/null
@@ -1,11 +0,0 @@
-#ifndef __ASM_ARM_SEGMENT_H
-#define __ASM_ARM_SEGMENT_H
-
-#define __KERNEL_CS 0x0
-#define __KERNEL_DS 0x0
-
-#define __USER_CS 0x1
-#define __USER_DS 0x1
-
-#endif /* __ASM_ARM_SEGMENT_H */
-
diff --git a/arch/arm/include/asm/sembuf.h b/arch/arm/include/asm/sembuf.h
deleted file mode 100644
index 1c02839..0000000
--- a/arch/arm/include/asm/sembuf.h
+++ /dev/null
@@ -1,25 +0,0 @@
-#ifndef _ASMARM_SEMBUF_H
-#define _ASMARM_SEMBUF_H
-
-/*
- * The semid64_ds structure for arm architecture.
- * Note extra padding because this structure is passed back and forth
- * between kernel and user space.
- *
- * Pad space is left for:
- * - 64-bit time_t to solve y2038 problem
- * - 2 miscellaneous 32-bit values
- */
-
-struct semid64_ds {
- struct ipc64_perm sem_perm; /* permissions .. see ipc.h */
- __kernel_time_t sem_otime; /* last semop time */
- unsigned long __unused1;
- __kernel_time_t sem_ctime; /* last change time */
- unsigned long __unused2;
- unsigned long sem_nsems; /* no. of semaphores in array */
- unsigned long __unused3;
- unsigned long __unused4;
-};
-
-#endif /* _ASMARM_SEMBUF_H */
diff --git a/arch/arm/include/asm/serial.h b/arch/arm/include/asm/serial.h
deleted file mode 100644
index ebb0490..0000000
--- a/arch/arm/include/asm/serial.h
+++ /dev/null
@@ -1,19 +0,0 @@
-/*
- * arch/arm/include/asm/serial.h
- *
- * Copyright (C) 1996 Russell King.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * Changelog:
- * 15-10-1996 RMK Created
- */
-
-#ifndef __ASM_SERIAL_H
-#define __ASM_SERIAL_H
-
-#define BASE_BAUD (1843200 / 16)
-
-#endif
diff --git a/arch/arm/include/asm/shmbuf.h b/arch/arm/include/asm/shmbuf.h
deleted file mode 100644
index 2e5c67b..0000000
--- a/arch/arm/include/asm/shmbuf.h
+++ /dev/null
@@ -1,42 +0,0 @@
-#ifndef _ASMARM_SHMBUF_H
-#define _ASMARM_SHMBUF_H
-
-/*
- * The shmid64_ds structure for arm architecture.
- * Note extra padding because this structure is passed back and forth
- * between kernel and user space.
- *
- * Pad space is left for:
- * - 64-bit time_t to solve y2038 problem
- * - 2 miscellaneous 32-bit values
- */
-
-struct shmid64_ds {
- struct ipc64_perm shm_perm; /* operation perms */
- size_t shm_segsz; /* size of segment (bytes) */
- __kernel_time_t shm_atime; /* last attach time */
- unsigned long __unused1;
- __kernel_time_t shm_dtime; /* last detach time */
- unsigned long __unused2;
- __kernel_time_t shm_ctime; /* last change time */
- unsigned long __unused3;
- __kernel_pid_t shm_cpid; /* pid of creator */
- __kernel_pid_t shm_lpid; /* pid of last operator */
- unsigned long shm_nattch; /* no. of current attaches */
- unsigned long __unused4;
- unsigned long __unused5;
-};
-
-struct shminfo64 {
- unsigned long shmmax;
- unsigned long shmmin;
- unsigned long shmmni;
- unsigned long shmseg;
- unsigned long shmall;
- unsigned long __unused1;
- unsigned long __unused2;
- unsigned long __unused3;
- unsigned long __unused4;
-};
-
-#endif /* _ASMARM_SHMBUF_H */
diff --git a/arch/arm/include/asm/socket.h b/arch/arm/include/asm/socket.h
deleted file mode 100644
index 6433cad..0000000
--- a/arch/arm/include/asm/socket.h
+++ /dev/null
@@ -1,72 +0,0 @@
-#ifndef _ASMARM_SOCKET_H
-#define _ASMARM_SOCKET_H
-
-#include <asm/sockios.h>
-
-/* For setsockopt(2) */
-#define SOL_SOCKET 1
-
-#define SO_DEBUG 1
-#define SO_REUSEADDR 2
-#define SO_TYPE 3
-#define SO_ERROR 4
-#define SO_DONTROUTE 5
-#define SO_BROADCAST 6
-#define SO_SNDBUF 7
-#define SO_RCVBUF 8
-#define SO_SNDBUFFORCE 32
-#define SO_RCVBUFFORCE 33
-#define SO_KEEPALIVE 9
-#define SO_OOBINLINE 10
-#define SO_NO_CHECK 11
-#define SO_PRIORITY 12
-#define SO_LINGER 13
-#define SO_BSDCOMPAT 14
-/* To add :#define SO_REUSEPORT 15 */
-#define SO_PASSCRED 16
-#define SO_PEERCRED 17
-#define SO_RCVLOWAT 18
-#define SO_SNDLOWAT 19
-#define SO_RCVTIMEO 20
-#define SO_SNDTIMEO 21
-
-/* Security levels - as per NRL IPv6 - don't actually do anything */
-#define SO_SECURITY_AUTHENTICATION 22
-#define SO_SECURITY_ENCRYPTION_TRANSPORT 23
-#define SO_SECURITY_ENCRYPTION_NETWORK 24
-
-#define SO_BINDTODEVICE 25
-
-/* Socket filtering */
-#define SO_ATTACH_FILTER 26
-#define SO_DETACH_FILTER 27
-
-#define SO_PEERNAME 28
-#define SO_TIMESTAMP 29
-#define SCM_TIMESTAMP SO_TIMESTAMP
-
-#define SO_ACCEPTCONN 30
-
-#define SO_PEERSEC 31
-#define SO_PASSSEC 34
-#define SO_TIMESTAMPNS 35
-#define SCM_TIMESTAMPNS SO_TIMESTAMPNS
-
-#define SO_MARK 36
-
-#define SO_TIMESTAMPING 37
-#define SCM_TIMESTAMPING SO_TIMESTAMPING
-
-#define SO_PROTOCOL 38
-#define SO_DOMAIN 39
-
-#define SO_RXQ_OVFL 40
-
-#define SO_WIFI_STATUS 41
-#define SCM_WIFI_STATUS SO_WIFI_STATUS
-#define SO_PEEK_OFF 42
-
-/* Instruct lower device to use last 4-bytes of skb data as FCS */
-#define SO_NOFCS 43
-
-#endif /* _ASM_SOCKET_H */
diff --git a/arch/arm/include/asm/sockios.h b/arch/arm/include/asm/sockios.h
deleted file mode 100644
index a2588a2..0000000
--- a/arch/arm/include/asm/sockios.h
+++ /dev/null
@@ -1,13 +0,0 @@
-#ifndef __ARCH_ARM_SOCKIOS_H
-#define __ARCH_ARM_SOCKIOS_H
-
-/* Socket-level I/O control calls. */
-#define FIOSETOWN 0x8901
-#define SIOCSPGRP 0x8902
-#define FIOGETOWN 0x8903
-#define SIOCGPGRP 0x8904
-#define SIOCATMARK 0x8905
-#define SIOCGSTAMP 0x8906 /* Get stamp (timeval) */
-#define SIOCGSTAMPNS 0x8907 /* Get stamp (timespec) */
-
-#endif
diff --git a/arch/arm/include/asm/termbits.h b/arch/arm/include/asm/termbits.h
deleted file mode 100644
index 704135d..0000000
--- a/arch/arm/include/asm/termbits.h
+++ /dev/null
@@ -1,198 +0,0 @@
-#ifndef __ASM_ARM_TERMBITS_H
-#define __ASM_ARM_TERMBITS_H
-
-typedef unsigned char cc_t;
-typedef unsigned int speed_t;
-typedef unsigned int tcflag_t;
-
-#define NCCS 19
-struct termios {
- tcflag_t c_iflag; /* input mode flags */
- tcflag_t c_oflag; /* output mode flags */
- tcflag_t c_cflag; /* control mode flags */
- tcflag_t c_lflag; /* local mode flags */
- cc_t c_line; /* line discipline */
- cc_t c_cc[NCCS]; /* control characters */
-};
-
-struct termios2 {
- tcflag_t c_iflag; /* input mode flags */
- tcflag_t c_oflag; /* output mode flags */
- tcflag_t c_cflag; /* control mode flags */
- tcflag_t c_lflag; /* local mode flags */
- cc_t c_line; /* line discipline */
- cc_t c_cc[NCCS]; /* control characters */
- speed_t c_ispeed; /* input speed */
- speed_t c_ospeed; /* output speed */
-};
-
-struct ktermios {
- tcflag_t c_iflag; /* input mode flags */
- tcflag_t c_oflag; /* output mode flags */
- tcflag_t c_cflag; /* control mode flags */
- tcflag_t c_lflag; /* local mode flags */
- cc_t c_line; /* line discipline */
- cc_t c_cc[NCCS]; /* control characters */
- speed_t c_ispeed; /* input speed */
- speed_t c_ospeed; /* output speed */
-};
-
-
-/* c_cc characters */
-#define VINTR 0
-#define VQUIT 1
-#define VERASE 2
-#define VKILL 3
-#define VEOF 4
-#define VTIME 5
-#define VMIN 6
-#define VSWTC 7
-#define VSTART 8
-#define VSTOP 9
-#define VSUSP 10
-#define VEOL 11
-#define VREPRINT 12
-#define VDISCARD 13
-#define VWERASE 14
-#define VLNEXT 15
-#define VEOL2 16
-
-/* c_iflag bits */
-#define IGNBRK 0000001
-#define BRKINT 0000002
-#define IGNPAR 0000004
-#define PARMRK 0000010
-#define INPCK 0000020
-#define ISTRIP 0000040
-#define INLCR 0000100
-#define IGNCR 0000200
-#define ICRNL 0000400
-#define IUCLC 0001000
-#define IXON 0002000
-#define IXANY 0004000
-#define IXOFF 0010000
-#define IMAXBEL 0020000
-#define IUTF8 0040000
-
-/* c_oflag bits */
-#define OPOST 0000001
-#define OLCUC 0000002
-#define ONLCR 0000004
-#define OCRNL 0000010
-#define ONOCR 0000020
-#define ONLRET 0000040
-#define OFILL 0000100
-#define OFDEL 0000200
-#define NLDLY 0000400
-#define NL0 0000000
-#define NL1 0000400
-#define CRDLY 0003000
-#define CR0 0000000
-#define CR1 0001000
-#define CR2 0002000
-#define CR3 0003000
-#define TABDLY 0014000
-#define TAB0 0000000
-#define TAB1 0004000
-#define TAB2 0010000
-#define TAB3 0014000
-#define XTABS 0014000
-#define BSDLY 0020000
-#define BS0 0000000
-#define BS1 0020000
-#define VTDLY 0040000
-#define VT0 0000000
-#define VT1 0040000
-#define FFDLY 0100000
-#define FF0 0000000
-#define FF1 0100000
-
-/* c_cflag bit meaning */
-#define CBAUD 0010017
-#define B0 0000000 /* hang up */
-#define B50 0000001
-#define B75 0000002
-#define B110 0000003
-#define B134 0000004
-#define B150 0000005
-#define B200 0000006
-#define B300 0000007
-#define B600 0000010
-#define B1200 0000011
-#define B1800 0000012
-#define B2400 0000013
-#define B4800 0000014
-#define B9600 0000015
-#define B19200 0000016
-#define B38400 0000017
-#define EXTA B19200
-#define EXTB B38400
-#define CSIZE 0000060
-#define CS5 0000000
-#define CS6 0000020
-#define CS7 0000040
-#define CS8 0000060
-#define CSTOPB 0000100
-#define CREAD 0000200
-#define PARENB 0000400
-#define PARODD 0001000
-#define HUPCL 0002000
-#define CLOCAL 0004000
-#define CBAUDEX 0010000
-#define BOTHER 0010000
-#define B57600 0010001
-#define B115200 0010002
-#define B230400 0010003
-#define B460800 0010004
-#define B500000 0010005
-#define B576000 0010006
-#define B921600 0010007
-#define B1000000 0010010
-#define B1152000 0010011
-#define B1500000 0010012
-#define B2000000 0010013
-#define B2500000 0010014
-#define B3000000 0010015
-#define B3500000 0010016
-#define B4000000 0010017
-#define CIBAUD 002003600000 /* input baud rate */
-#define CMSPAR 010000000000 /* mark or space (stick) parity */
-#define CRTSCTS 020000000000 /* flow control */
-
-#define IBSHIFT 16
-
-/* c_lflag bits */
-#define ISIG 0000001
-#define ICANON 0000002
-#define XCASE 0000004
-#define ECHO 0000010
-#define ECHOE 0000020
-#define ECHOK 0000040
-#define ECHONL 0000100
-#define NOFLSH 0000200
-#define TOSTOP 0000400
-#define ECHOCTL 0001000
-#define ECHOPRT 0002000
-#define ECHOKE 0004000
-#define FLUSHO 0010000
-#define PENDIN 0040000
-#define IEXTEN 0100000
-#define EXTPROC 0200000
-
-/* tcflow() and TCXONC use these */
-#define TCOOFF 0
-#define TCOON 1
-#define TCIOFF 2
-#define TCION 3
-
-/* tcflush() and TCFLSH use these */
-#define TCIFLUSH 0
-#define TCOFLUSH 1
-#define TCIOFLUSH 2
-
-/* tcsetattr uses these */
-#define TCSANOW 0
-#define TCSADRAIN 1
-#define TCSAFLUSH 2
-
-#endif /* __ASM_ARM_TERMBITS_H */
diff --git a/arch/arm/include/asm/timex.h b/arch/arm/include/asm/timex.h
index ce11944..5e71172 100644
--- a/arch/arm/include/asm/timex.h
+++ b/arch/arm/include/asm/timex.h
@@ -15,12 +15,10 @@
#include <asm/arch_timer.h>
#include <mach/timex.h>
-typedef unsigned long cycles_t;
-
#ifdef ARCH_HAS_READ_CURRENT_TIMER
#define get_cycles() ({ cycles_t c; read_current_timer(&c) ? 0 : c; })
-#else
-#define get_cycles() (0)
#endif
+#include <asm-generic/timex.h>
+
#endif
diff --git a/arch/arm/include/asm/types.h b/arch/arm/include/asm/types.h
deleted file mode 100644
index 28beab9..0000000
--- a/arch/arm/include/asm/types.h
+++ /dev/null
@@ -1,16 +0,0 @@
-#ifndef __ASM_ARM_TYPES_H
-#define __ASM_ARM_TYPES_H
-
-#include <asm-generic/int-ll64.h>
-
-/*
- * These aren't exported outside the kernel to avoid name space clashes
- */
-#ifdef __KERNEL__
-
-#define BITS_PER_LONG 32
-
-#endif /* __KERNEL__ */
-
-#endif
-
--
1.7.9.5
^ permalink raw reply related [flat|nested] 46+ messages in thread
* [PATCH v2 2/5] ARM: add strstr declaration for decompressors
2012-08-05 1:23 [PATCH v2 0/5] Use more asm-generic headers Rob Herring
2012-08-05 1:23 ` [PATCH v2 1/5] ARM: use generic version of identical asm headers Rob Herring
@ 2012-08-05 1:23 ` Rob Herring
2012-08-05 1:23 ` [PATCH v2 3/5] ARM: use generic unaligned.h Rob Herring
` (4 subsequent siblings)
6 siblings, 0 replies; 46+ messages in thread
From: Rob Herring @ 2012-08-05 1:23 UTC (permalink / raw)
To: linux-arm-kernel
From: Rob Herring <rob.herring@calxeda.com>
With the generic unaligned.h, more kernel headers get pulled in including
dynamic_debug.h which needs strstr. As it is not really used, we only need
a declaration here.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
---
arch/arm/boot/compressed/decompress.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/arm/boot/compressed/decompress.c b/arch/arm/boot/compressed/decompress.c
index f41b38c..9deb56a 100644
--- a/arch/arm/boot/compressed/decompress.c
+++ b/arch/arm/boot/compressed/decompress.c
@@ -32,6 +32,9 @@ extern void error(char *);
# define Tracecv(c,x)
#endif
+/* Not needed, but used in some headers pulled in by decompressors */
+extern char * strstr(const char * s1, const char *s2);
+
#ifdef CONFIG_KERNEL_GZIP
#include "../../../../lib/decompress_inflate.c"
#endif
--
1.7.9.5
^ permalink raw reply related [flat|nested] 46+ messages in thread
* [PATCH v2 3/5] ARM: use generic unaligned.h
2012-08-05 1:23 [PATCH v2 0/5] Use more asm-generic headers Rob Herring
2012-08-05 1:23 ` [PATCH v2 1/5] ARM: use generic version of identical asm headers Rob Herring
2012-08-05 1:23 ` [PATCH v2 2/5] ARM: add strstr declaration for decompressors Rob Herring
@ 2012-08-05 1:23 ` Rob Herring
2012-10-08 16:43 ` Shawn Guo
2012-08-05 1:23 ` [PATCH v2 4/5] ARM: use generic termios.h Rob Herring
` (3 subsequent siblings)
6 siblings, 1 reply; 46+ messages in thread
From: Rob Herring @ 2012-08-05 1:23 UTC (permalink / raw)
To: linux-arm-kernel
From: Rob Herring <rob.herring@calxeda.com>
This moves ARM over to the asm-generic/unaligned.h header. This has the
benefit of better code generated especially for ARMv7 on gcc 4.7+
compilers.
As Arnd Bergmann, points out: The asm-generic version uses the "struct"
version for native-endian unaligned access and the "byteshift" version
for the opposite endianess. The current ARM version however uses the
"byteshift" implementation for both.
Thanks to Nicolas Pitre for the excellent analysis:
Test case:
int foo (int *x) { return get_unaligned(x); }
long long bar (long long *x) { return get_unaligned(x); }
With the current ARM version:
foo:
ldrb r3, [r0, #2] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 2B], MEM[(const u8 *)x_1(D) + 2B]
ldrb r1, [r0, #1] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 1B], MEM[(const u8 *)x_1(D) + 1B]
ldrb r2, [r0, #0] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D)], MEM[(const u8 *)x_1(D)]
mov r3, r3, asl #16 @ tmp154, MEM[(const u8 *)x_1(D) + 2B],
ldrb r0, [r0, #3] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 3B], MEM[(const u8 *)x_1(D) + 3B]
orr r3, r3, r1, asl #8 @, tmp155, tmp154, MEM[(const u8 *)x_1(D) + 1B],
orr r3, r3, r2 @ tmp157, tmp155, MEM[(const u8 *)x_1(D)]
orr r0, r3, r0, asl #24 @,, tmp157, MEM[(const u8 *)x_1(D) + 3B],
bx lr @
bar:
stmfd sp!, {r4, r5, r6, r7} @,
mov r2, #0 @ tmp184,
ldrb r5, [r0, #6] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 6B], MEM[(const u8 *)x_1(D) + 6B]
ldrb r4, [r0, #5] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 5B], MEM[(const u8 *)x_1(D) + 5B]
ldrb ip, [r0, #2] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 2B], MEM[(const u8 *)x_1(D) + 2B]
ldrb r1, [r0, #4] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 4B], MEM[(const u8 *)x_1(D) + 4B]
mov r5, r5, asl #16 @ tmp175, MEM[(const u8 *)x_1(D) + 6B],
ldrb r7, [r0, #1] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 1B], MEM[(const u8 *)x_1(D) + 1B]
orr r5, r5, r4, asl #8 @, tmp176, tmp175, MEM[(const u8 *)x_1(D) + 5B],
ldrb r6, [r0, #7] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 7B], MEM[(const u8 *)x_1(D) + 7B]
orr r5, r5, r1 @ tmp178, tmp176, MEM[(const u8 *)x_1(D) + 4B]
ldrb r4, [r0, #0] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D)], MEM[(const u8 *)x_1(D)]
mov ip, ip, asl #16 @ tmp188, MEM[(const u8 *)x_1(D) + 2B],
ldrb r1, [r0, #3] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 3B], MEM[(const u8 *)x_1(D) + 3B]
orr ip, ip, r7, asl #8 @, tmp189, tmp188, MEM[(const u8 *)x_1(D) + 1B],
orr r3, r5, r6, asl #24 @,, tmp178, MEM[(const u8 *)x_1(D) + 7B],
orr ip, ip, r4 @ tmp191, tmp189, MEM[(const u8 *)x_1(D)]
orr ip, ip, r1, asl #24 @, tmp194, tmp191, MEM[(const u8 *)x_1(D) + 3B],
mov r1, r3 @,
orr r0, r2, ip @ tmp171, tmp184, tmp194
ldmfd sp!, {r4, r5, r6, r7}
bx lr
In both cases the code is slightly suboptimal. One may wonder why
wasting r2 with the constant 0 in the second case for example. And all
the mov's could be folded in subsequent orr's, etc.
Now with the asm-generic version:
foo:
ldr r0, [r0, #0] @ unaligned @,* x
bx lr @
bar:
mov r3, r0 @ x, x
ldr r0, [r0, #0] @ unaligned @,* x
ldr r1, [r3, #4] @ unaligned @,
bx lr @
This is way better of course, but only because this was compiled for
ARMv7. In this case the compiler knows that the hardware can do
unaligned word access. This isn't that obvious for foo(), but if we
remove the get_unaligned() from bar as follows:
long long bar (long long *x) {return *x; }
then the resulting code is:
bar:
ldmia r0, {r0, r1} @ x,,
bx lr @
So this proves that the presumed aligned vs unaligned cases does have
influence on the instructions the compiler may use and that the above
unaligned code results are not just an accident.
Still... this isn't fully conclusive without at least looking at the
resulting assembly fron a pre ARMv6 compilation. Let's see with an
ARMv5 target:
foo:
ldrb r3, [r0, #0] @ zero_extendqisi2 @ tmp139,* x
ldrb r1, [r0, #1] @ zero_extendqisi2 @ tmp140,
ldrb r2, [r0, #2] @ zero_extendqisi2 @ tmp143,
ldrb r0, [r0, #3] @ zero_extendqisi2 @ tmp146,
orr r3, r3, r1, asl #8 @, tmp142, tmp139, tmp140,
orr r3, r3, r2, asl #16 @, tmp145, tmp142, tmp143,
orr r0, r3, r0, asl #24 @,, tmp145, tmp146,
bx lr @
bar:
stmfd sp!, {r4, r5, r6, r7} @,
ldrb r2, [r0, #0] @ zero_extendqisi2 @ tmp139,* x
ldrb r7, [r0, #1] @ zero_extendqisi2 @ tmp140,
ldrb r3, [r0, #4] @ zero_extendqisi2 @ tmp149,
ldrb r6, [r0, #5] @ zero_extendqisi2 @ tmp150,
ldrb r5, [r0, #2] @ zero_extendqisi2 @ tmp143,
ldrb r4, [r0, #6] @ zero_extendqisi2 @ tmp153,
ldrb r1, [r0, #7] @ zero_extendqisi2 @ tmp156,
ldrb ip, [r0, #3] @ zero_extendqisi2 @ tmp146,
orr r2, r2, r7, asl #8 @, tmp142, tmp139, tmp140,
orr r3, r3, r6, asl #8 @, tmp152, tmp149, tmp150,
orr r2, r2, r5, asl #16 @, tmp145, tmp142, tmp143,
orr r3, r3, r4, asl #16 @, tmp155, tmp152, tmp153,
orr r0, r2, ip, asl #24 @,, tmp145, tmp146,
orr r1, r3, r1, asl #24 @,, tmp155, tmp156,
ldmfd sp!, {r4, r5, r6, r7}
bx lr
Compared to the initial results, this is really nicely optimized and I
couldn't do much better if I were to hand code it myself.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
---
arch/arm/include/asm/Kbuild | 1 +
arch/arm/include/asm/unaligned.h | 19 -------------------
2 files changed, 1 insertion(+), 19 deletions(-)
delete mode 100644 arch/arm/include/asm/unaligned.h
diff --git a/arch/arm/include/asm/Kbuild b/arch/arm/include/asm/Kbuild
index 6cc8a13..a86cabf 100644
--- a/arch/arm/include/asm/Kbuild
+++ b/arch/arm/include/asm/Kbuild
@@ -33,3 +33,4 @@ generic-y += sockios.h
generic-y += termbits.h
generic-y += timex.h
generic-y += types.h
+generic-y += unaligned.h
diff --git a/arch/arm/include/asm/unaligned.h b/arch/arm/include/asm/unaligned.h
deleted file mode 100644
index 44593a8..0000000
--- a/arch/arm/include/asm/unaligned.h
+++ /dev/null
@@ -1,19 +0,0 @@
-#ifndef _ASM_ARM_UNALIGNED_H
-#define _ASM_ARM_UNALIGNED_H
-
-#include <linux/unaligned/le_byteshift.h>
-#include <linux/unaligned/be_byteshift.h>
-#include <linux/unaligned/generic.h>
-
-/*
- * Select endianness
- */
-#ifndef __ARMEB__
-#define get_unaligned __get_unaligned_le
-#define put_unaligned __put_unaligned_le
-#else
-#define get_unaligned __get_unaligned_be
-#define put_unaligned __put_unaligned_be
-#endif
-
-#endif /* _ASM_ARM_UNALIGNED_H */
--
1.7.9.5
^ permalink raw reply related [flat|nested] 46+ messages in thread
* [PATCH v2 4/5] ARM: use generic termios.h
2012-08-05 1:23 [PATCH v2 0/5] Use more asm-generic headers Rob Herring
` (2 preceding siblings ...)
2012-08-05 1:23 ` [PATCH v2 3/5] ARM: use generic unaligned.h Rob Herring
@ 2012-08-05 1:23 ` Rob Herring
2012-08-05 1:24 ` [PATCH v2 5/5] ARM: convert asm/irqflags.h to use asm-generic/irqflags.h Rob Herring
` (2 subsequent siblings)
6 siblings, 0 replies; 46+ messages in thread
From: Rob Herring @ 2012-08-05 1:23 UTC (permalink / raw)
To: linux-arm-kernel
From: Rob Herring <rob.herring@calxeda.com>
As pointed out by Arnd Bergmann, this fixes a couple of issues but will
increase code size:
The original macro user_termio_to_kernel_termios was not endian safe. It
used an unsigned short ptr to access the low bits in a 32-bit word.
Both user_termio_to_kernel_termios and kernel_termios_to_user_termio are
missing error checking on put_user/get_user and copy_to/from_user.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
---
arch/arm/include/asm/Kbuild | 1 +
arch/arm/include/asm/termios.h | 92 ----------------------------------------
2 files changed, 1 insertion(+), 92 deletions(-)
delete mode 100644 arch/arm/include/asm/termios.h
diff --git a/arch/arm/include/asm/Kbuild b/arch/arm/include/asm/Kbuild
index a86cabf..8a7196c 100644
--- a/arch/arm/include/asm/Kbuild
+++ b/arch/arm/include/asm/Kbuild
@@ -31,6 +31,7 @@ generic-y += sizes.h
generic-y += socket.h
generic-y += sockios.h
generic-y += termbits.h
+generic-y += termios.h
generic-y += timex.h
generic-y += types.h
generic-y += unaligned.h
diff --git a/arch/arm/include/asm/termios.h b/arch/arm/include/asm/termios.h
deleted file mode 100644
index 293e3f1..0000000
--- a/arch/arm/include/asm/termios.h
+++ /dev/null
@@ -1,92 +0,0 @@
-#ifndef __ASM_ARM_TERMIOS_H
-#define __ASM_ARM_TERMIOS_H
-
-#include <asm/termbits.h>
-#include <asm/ioctls.h>
-
-struct winsize {
- unsigned short ws_row;
- unsigned short ws_col;
- unsigned short ws_xpixel;
- unsigned short ws_ypixel;
-};
-
-#define NCC 8
-struct termio {
- unsigned short c_iflag; /* input mode flags */
- unsigned short c_oflag; /* output mode flags */
- unsigned short c_cflag; /* control mode flags */
- unsigned short c_lflag; /* local mode flags */
- unsigned char c_line; /* line discipline */
- unsigned char c_cc[NCC]; /* control characters */
-};
-
-#ifdef __KERNEL__
-/* intr=^C quit=^| erase=del kill=^U
- eof=^D vtime=\0 vmin=\1 sxtc=\0
- start=^Q stop=^S susp=^Z eol=\0
- reprint=^R discard=^U werase=^W lnext=^V
- eol2=\0
-*/
-#define INIT_C_CC "\003\034\177\025\004\0\1\0\021\023\032\0\022\017\027\026\0"
-#endif
-
-/* modem lines */
-#define TIOCM_LE 0x001
-#define TIOCM_DTR 0x002
-#define TIOCM_RTS 0x004
-#define TIOCM_ST 0x008
-#define TIOCM_SR 0x010
-#define TIOCM_CTS 0x020
-#define TIOCM_CAR 0x040
-#define TIOCM_RNG 0x080
-#define TIOCM_DSR 0x100
-#define TIOCM_CD TIOCM_CAR
-#define TIOCM_RI TIOCM_RNG
-#define TIOCM_OUT1 0x2000
-#define TIOCM_OUT2 0x4000
-#define TIOCM_LOOP 0x8000
-
-/* ioctl (fd, TIOCSERGETLSR, &result) where result may be as below */
-
-#ifdef __KERNEL__
-
-/*
- * Translate a "termio" structure into a "termios". Ugh.
- */
-#define SET_LOW_TERMIOS_BITS(termios, termio, x) { \
- unsigned short __tmp; \
- get_user(__tmp,&(termio)->x); \
- *(unsigned short *) &(termios)->x = __tmp; \
-}
-
-#define user_termio_to_kernel_termios(termios, termio) \
-({ \
- SET_LOW_TERMIOS_BITS(termios, termio, c_iflag); \
- SET_LOW_TERMIOS_BITS(termios, termio, c_oflag); \
- SET_LOW_TERMIOS_BITS(termios, termio, c_cflag); \
- SET_LOW_TERMIOS_BITS(termios, termio, c_lflag); \
- copy_from_user((termios)->c_cc, (termio)->c_cc, NCC); \
-})
-
-/*
- * Translate a "termios" structure into a "termio". Ugh.
- */
-#define kernel_termios_to_user_termio(termio, termios) \
-({ \
- put_user((termios)->c_iflag, &(termio)->c_iflag); \
- put_user((termios)->c_oflag, &(termio)->c_oflag); \
- put_user((termios)->c_cflag, &(termio)->c_cflag); \
- put_user((termios)->c_lflag, &(termio)->c_lflag); \
- put_user((termios)->c_line, &(termio)->c_line); \
- copy_to_user((termio)->c_cc, (termios)->c_cc, NCC); \
-})
-
-#define user_termios_to_kernel_termios(k, u) copy_from_user(k, u, sizeof(struct termios2))
-#define kernel_termios_to_user_termios(u, k) copy_to_user(u, k, sizeof(struct termios2))
-#define user_termios_to_kernel_termios_1(k, u) copy_from_user(k, u, sizeof(struct termios))
-#define kernel_termios_to_user_termios_1(u, k) copy_to_user(u, k, sizeof(struct termios))
-
-#endif /* __KERNEL__ */
-
-#endif /* __ASM_ARM_TERMIOS_H */
--
1.7.9.5
^ permalink raw reply related [flat|nested] 46+ messages in thread
* [PATCH v2 5/5] ARM: convert asm/irqflags.h to use asm-generic/irqflags.h
2012-08-05 1:23 [PATCH v2 0/5] Use more asm-generic headers Rob Herring
` (3 preceding siblings ...)
2012-08-05 1:23 ` [PATCH v2 4/5] ARM: use generic termios.h Rob Herring
@ 2012-08-05 1:24 ` Rob Herring
2012-08-05 9:34 ` [PATCH v2 0/5] Use more asm-generic headers Thomas Petazzoni
2012-08-06 14:35 ` Arnd Bergmann
6 siblings, 0 replies; 46+ messages in thread
From: Rob Herring @ 2012-08-05 1:24 UTC (permalink / raw)
To: linux-arm-kernel
From: Rob Herring <rob.herring@calxeda.com>
By implmenting arch_local_irq_restore with constant flags, we can use
asm-generic/irqflags.h and remove some of the ARM version of irqflags.h.
Verified the code is equivalent by checking the disassembled code.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
---
arch/arm/include/asm/irqflags.h | 131 ++++++++++++++-------------------------
1 file changed, 45 insertions(+), 86 deletions(-)
diff --git a/arch/arm/include/asm/irqflags.h b/arch/arm/include/asm/irqflags.h
index 1e6cca5..67bfb53 100644
--- a/arch/arm/include/asm/irqflags.h
+++ b/arch/arm/include/asm/irqflags.h
@@ -8,88 +8,68 @@
/*
* CPU interrupt mask handling.
*/
-#if __LINUX_ARM_ARCH__ >= 6
+#define ARCH_IRQ_DISABLED PSR_I_BIT
+#define ARCH_IRQ_ENABLED 0
-static inline unsigned long arch_local_irq_save(void)
+static inline unsigned long arch_local_save_flags(void)
{
unsigned long flags;
asm volatile(
- " mrs %0, cpsr @ arch_local_irq_save\n"
- " cpsid i"
+ " mrs %0, cpsr @ arch_local_save_flags\n"
: "=r" (flags) : : "memory", "cc");
return flags;
}
-static inline void arch_local_irq_enable(void)
+static inline void constant_arch_local_irq_restore(unsigned long flags)
{
- asm volatile(
- " cpsie i @ arch_local_irq_enable"
- :
- :
- : "memory", "cc");
+#if __LINUX_ARM_ARCH__ >= 6
+ if (flags == ARCH_IRQ_ENABLED)
+ asm volatile(
+ " cpsie i @ constant_arch_local_irq_restore"
+ : : : "memory", "cc");
+ else if (flags == ARCH_IRQ_DISABLED)
+ asm volatile(
+ " cpsid i @ constant_arch_local_irq_restore"
+ : : : "memory", "cc");
+#else
+ unsigned long temp;
+ if (flags == ARCH_IRQ_ENABLED)
+ asm volatile(
+ " mrs %0, cpsr @ constant_arch_local_irq_restore\n"
+ " bic %0, %0, #128\n"
+ " msr cpsr_c, %0"
+ : "=r" (temp)
+ :
+ : "memory", "cc");
+ else if (flags == ARCH_IRQ_DISABLED)
+ asm volatile(
+ " mrs %0, cpsr @ constant_arch_local_irq_restore\n"
+ " orr %0, %0, #128\n"
+ " msr cpsr_c, %0"
+ : "=r" (temp)
+ :
+ : "memory", "cc");
+#endif
}
-static inline void arch_local_irq_disable(void)
+static inline void _arch_local_irq_restore(unsigned long flags)
{
- asm volatile(
- " cpsid i @ arch_local_irq_disable"
- :
+ asm volatile(
+ " msr cpsr_c, %0 @ _arch_local_irq_restore"
:
+ : "r" (flags)
: "memory", "cc");
}
+#define arch_local_irq_restore(flags) \
+ (__builtin_constant_p(flags) ? constant_arch_local_irq_restore(flags) : \
+ _arch_local_irq_restore(flags))
+
+#if __LINUX_ARM_ARCH__ >= 6
#define local_fiq_enable() __asm__("cpsie f @ __stf" : : : "memory", "cc")
#define local_fiq_disable() __asm__("cpsid f @ __clf" : : : "memory", "cc")
#else
-
-/*
- * Save the current interrupt enable state & disable IRQs
- */
-static inline unsigned long arch_local_irq_save(void)
-{
- unsigned long flags, temp;
-
- asm volatile(
- " mrs %0, cpsr @ arch_local_irq_save\n"
- " orr %1, %0, #128\n"
- " msr cpsr_c, %1"
- : "=r" (flags), "=r" (temp)
- :
- : "memory", "cc");
- return flags;
-}
-
-/*
- * Enable IRQs
- */
-static inline void arch_local_irq_enable(void)
-{
- unsigned long temp;
- asm volatile(
- " mrs %0, cpsr @ arch_local_irq_enable\n"
- " bic %0, %0, #128\n"
- " msr cpsr_c, %0"
- : "=r" (temp)
- :
- : "memory", "cc");
-}
-
-/*
- * Disable IRQs
- */
-static inline void arch_local_irq_disable(void)
-{
- unsigned long temp;
- asm volatile(
- " mrs %0, cpsr @ arch_local_irq_disable\n"
- " orr %0, %0, #128\n"
- " msr cpsr_c, %0"
- : "=r" (temp)
- :
- : "memory", "cc");
-}
-
/*
* Enable FIQs
*/
@@ -122,34 +102,13 @@ static inline void arch_local_irq_disable(void)
#endif
-/*
- * Save the current interrupt enable state.
- */
-static inline unsigned long arch_local_save_flags(void)
-{
- unsigned long flags;
- asm volatile(
- " mrs %0, cpsr @ local_save_flags"
- : "=r" (flags) : : "memory", "cc");
- return flags;
-}
-
-/*
- * restore saved IRQ & FIQ state
- */
-static inline void arch_local_irq_restore(unsigned long flags)
-{
- asm volatile(
- " msr cpsr_c, %0 @ local_irq_restore"
- :
- : "r" (flags)
- : "memory", "cc");
-}
-
static inline int arch_irqs_disabled_flags(unsigned long flags)
{
return flags & PSR_I_BIT;
}
+#define arch_irqs_disabled_flags arch_irqs_disabled_flags
+
+#include <asm-generic/irqflags.h>
#endif
#endif
--
1.7.9.5
^ permalink raw reply related [flat|nested] 46+ messages in thread
* [PATCH v2 0/5] Use more asm-generic headers
2012-08-05 1:23 [PATCH v2 0/5] Use more asm-generic headers Rob Herring
` (4 preceding siblings ...)
2012-08-05 1:24 ` [PATCH v2 5/5] ARM: convert asm/irqflags.h to use asm-generic/irqflags.h Rob Herring
@ 2012-08-05 9:34 ` Thomas Petazzoni
2012-08-06 14:35 ` Arnd Bergmann
6 siblings, 0 replies; 46+ messages in thread
From: Thomas Petazzoni @ 2012-08-05 9:34 UTC (permalink / raw)
To: linux-arm-kernel
Le Sat, 4 Aug 2012 20:23:55 -0500,
Rob Herring <robherring2@gmail.com> a ?crit :
> From: Rob Herring <rob.herring@calxeda.com>
>
> Inspired by the AArgh64 claim that it should be separate from ARM and
> one reason was being able to use more asm-generic headers. Doing a
> diff of arch/arm/include/asm and include/asm-generic there are
> numerous asm headers which are functionally identical to their
> asm-generic counterparts and others which can use more of the generic
> versions.
>
> V2 separates unaligned.h and termios.h changes to separate patches.
> Also, it adds a fix for build failures on non-gzip decompressor
> compile.
>
> Rob
>
> Rob Herring (5):
> ARM: use generic version of identical asm headers
> ARM: add strstr declaration for decompressors
> ARM: use generic unaligned.h
> ARM: use generic termios.h
> ARM: convert asm/irqflags.h to use asm-generic/irqflags.h
Built and boot tested on Armada XP evaluation platform.
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
--
Thomas Petazzoni, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH v2 0/5] Use more asm-generic headers
2012-08-05 1:23 [PATCH v2 0/5] Use more asm-generic headers Rob Herring
` (5 preceding siblings ...)
2012-08-05 9:34 ` [PATCH v2 0/5] Use more asm-generic headers Thomas Petazzoni
@ 2012-08-06 14:35 ` Arnd Bergmann
6 siblings, 0 replies; 46+ messages in thread
From: Arnd Bergmann @ 2012-08-06 14:35 UTC (permalink / raw)
To: linux-arm-kernel
On Sunday 05 August 2012, Rob Herring wrote:
>
> From: Rob Herring <rob.herring@calxeda.com>
>
> Inspired by the AArgh64 claim that it should be separate from ARM and one
> reason was being able to use more asm-generic headers. Doing a diff of
> arch/arm/include/asm and include/asm-generic there are numerous asm
> headers which are functionally identical to their asm-generic counterparts
> and others which can use more of the generic versions.
>
> V2 separates unaligned.h and termios.h changes to separate patches. Also,
> it adds a fix for build failures on non-gzip decompressor compile.
>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH v2 3/5] ARM: use generic unaligned.h
2012-08-05 1:23 ` [PATCH v2 3/5] ARM: use generic unaligned.h Rob Herring
@ 2012-10-08 16:43 ` Shawn Guo
2012-10-08 20:34 ` Rob Herring
0 siblings, 1 reply; 46+ messages in thread
From: Shawn Guo @ 2012-10-08 16:43 UTC (permalink / raw)
To: linux-arm-kernel
This patch has been merged into mainline as commit below.
d25c881 ARM: 7493/1: use generic unaligned.h
It introduces a regression for me. Check out the commit on mainline,
build a v7 only kernel (imx5/6) with imx_v6_v7_defconfig, the kernel
halts in decompressor. But v6/v7 kernel (imx3/5/6) works fine. The
kernel built on the parent commit below works all fine.
ef1c209 ARM: 7492/1: add strstr declaration for decompressors
Shawn
On Sat, Aug 04, 2012 at 08:23:58PM -0500, Rob Herring wrote:
> From: Rob Herring <rob.herring@calxeda.com>
>
> This moves ARM over to the asm-generic/unaligned.h header. This has the
> benefit of better code generated especially for ARMv7 on gcc 4.7+
> compilers.
>
> As Arnd Bergmann, points out: The asm-generic version uses the "struct"
> version for native-endian unaligned access and the "byteshift" version
> for the opposite endianess. The current ARM version however uses the
> "byteshift" implementation for both.
>
> Thanks to Nicolas Pitre for the excellent analysis:
>
> Test case:
>
> int foo (int *x) { return get_unaligned(x); }
> long long bar (long long *x) { return get_unaligned(x); }
>
> With the current ARM version:
>
> foo:
> ldrb r3, [r0, #2] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 2B], MEM[(const u8 *)x_1(D) + 2B]
> ldrb r1, [r0, #1] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 1B], MEM[(const u8 *)x_1(D) + 1B]
> ldrb r2, [r0, #0] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D)], MEM[(const u8 *)x_1(D)]
> mov r3, r3, asl #16 @ tmp154, MEM[(const u8 *)x_1(D) + 2B],
> ldrb r0, [r0, #3] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 3B], MEM[(const u8 *)x_1(D) + 3B]
> orr r3, r3, r1, asl #8 @, tmp155, tmp154, MEM[(const u8 *)x_1(D) + 1B],
> orr r3, r3, r2 @ tmp157, tmp155, MEM[(const u8 *)x_1(D)]
> orr r0, r3, r0, asl #24 @,, tmp157, MEM[(const u8 *)x_1(D) + 3B],
> bx lr @
>
> bar:
> stmfd sp!, {r4, r5, r6, r7} @,
> mov r2, #0 @ tmp184,
> ldrb r5, [r0, #6] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 6B], MEM[(const u8 *)x_1(D) + 6B]
> ldrb r4, [r0, #5] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 5B], MEM[(const u8 *)x_1(D) + 5B]
> ldrb ip, [r0, #2] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 2B], MEM[(const u8 *)x_1(D) + 2B]
> ldrb r1, [r0, #4] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 4B], MEM[(const u8 *)x_1(D) + 4B]
> mov r5, r5, asl #16 @ tmp175, MEM[(const u8 *)x_1(D) + 6B],
> ldrb r7, [r0, #1] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 1B], MEM[(const u8 *)x_1(D) + 1B]
> orr r5, r5, r4, asl #8 @, tmp176, tmp175, MEM[(const u8 *)x_1(D) + 5B],
> ldrb r6, [r0, #7] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 7B], MEM[(const u8 *)x_1(D) + 7B]
> orr r5, r5, r1 @ tmp178, tmp176, MEM[(const u8 *)x_1(D) + 4B]
> ldrb r4, [r0, #0] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D)], MEM[(const u8 *)x_1(D)]
> mov ip, ip, asl #16 @ tmp188, MEM[(const u8 *)x_1(D) + 2B],
> ldrb r1, [r0, #3] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 3B], MEM[(const u8 *)x_1(D) + 3B]
> orr ip, ip, r7, asl #8 @, tmp189, tmp188, MEM[(const u8 *)x_1(D) + 1B],
> orr r3, r5, r6, asl #24 @,, tmp178, MEM[(const u8 *)x_1(D) + 7B],
> orr ip, ip, r4 @ tmp191, tmp189, MEM[(const u8 *)x_1(D)]
> orr ip, ip, r1, asl #24 @, tmp194, tmp191, MEM[(const u8 *)x_1(D) + 3B],
> mov r1, r3 @,
> orr r0, r2, ip @ tmp171, tmp184, tmp194
> ldmfd sp!, {r4, r5, r6, r7}
> bx lr
>
> In both cases the code is slightly suboptimal. One may wonder why
> wasting r2 with the constant 0 in the second case for example. And all
> the mov's could be folded in subsequent orr's, etc.
>
> Now with the asm-generic version:
>
> foo:
> ldr r0, [r0, #0] @ unaligned @,* x
> bx lr @
>
> bar:
> mov r3, r0 @ x, x
> ldr r0, [r0, #0] @ unaligned @,* x
> ldr r1, [r3, #4] @ unaligned @,
> bx lr @
>
> This is way better of course, but only because this was compiled for
> ARMv7. In this case the compiler knows that the hardware can do
> unaligned word access. This isn't that obvious for foo(), but if we
> remove the get_unaligned() from bar as follows:
>
> long long bar (long long *x) {return *x; }
>
> then the resulting code is:
>
> bar:
> ldmia r0, {r0, r1} @ x,,
> bx lr @
>
> So this proves that the presumed aligned vs unaligned cases does have
> influence on the instructions the compiler may use and that the above
> unaligned code results are not just an accident.
>
> Still... this isn't fully conclusive without at least looking at the
> resulting assembly fron a pre ARMv6 compilation. Let's see with an
> ARMv5 target:
>
> foo:
> ldrb r3, [r0, #0] @ zero_extendqisi2 @ tmp139,* x
> ldrb r1, [r0, #1] @ zero_extendqisi2 @ tmp140,
> ldrb r2, [r0, #2] @ zero_extendqisi2 @ tmp143,
> ldrb r0, [r0, #3] @ zero_extendqisi2 @ tmp146,
> orr r3, r3, r1, asl #8 @, tmp142, tmp139, tmp140,
> orr r3, r3, r2, asl #16 @, tmp145, tmp142, tmp143,
> orr r0, r3, r0, asl #24 @,, tmp145, tmp146,
> bx lr @
>
> bar:
> stmfd sp!, {r4, r5, r6, r7} @,
> ldrb r2, [r0, #0] @ zero_extendqisi2 @ tmp139,* x
> ldrb r7, [r0, #1] @ zero_extendqisi2 @ tmp140,
> ldrb r3, [r0, #4] @ zero_extendqisi2 @ tmp149,
> ldrb r6, [r0, #5] @ zero_extendqisi2 @ tmp150,
> ldrb r5, [r0, #2] @ zero_extendqisi2 @ tmp143,
> ldrb r4, [r0, #6] @ zero_extendqisi2 @ tmp153,
> ldrb r1, [r0, #7] @ zero_extendqisi2 @ tmp156,
> ldrb ip, [r0, #3] @ zero_extendqisi2 @ tmp146,
> orr r2, r2, r7, asl #8 @, tmp142, tmp139, tmp140,
> orr r3, r3, r6, asl #8 @, tmp152, tmp149, tmp150,
> orr r2, r2, r5, asl #16 @, tmp145, tmp142, tmp143,
> orr r3, r3, r4, asl #16 @, tmp155, tmp152, tmp153,
> orr r0, r2, ip, asl #24 @,, tmp145, tmp146,
> orr r1, r3, r1, asl #24 @,, tmp155, tmp156,
> ldmfd sp!, {r4, r5, r6, r7}
> bx lr
>
> Compared to the initial results, this is really nicely optimized and I
> couldn't do much better if I were to hand code it myself.
>
> Signed-off-by: Rob Herring <rob.herring@calxeda.com>
> Reviewed-by: Nicolas Pitre <nico@linaro.org>
> ---
> arch/arm/include/asm/Kbuild | 1 +
> arch/arm/include/asm/unaligned.h | 19 -------------------
> 2 files changed, 1 insertion(+), 19 deletions(-)
> delete mode 100644 arch/arm/include/asm/unaligned.h
>
> diff --git a/arch/arm/include/asm/Kbuild b/arch/arm/include/asm/Kbuild
> index 6cc8a13..a86cabf 100644
> --- a/arch/arm/include/asm/Kbuild
> +++ b/arch/arm/include/asm/Kbuild
> @@ -33,3 +33,4 @@ generic-y += sockios.h
> generic-y += termbits.h
> generic-y += timex.h
> generic-y += types.h
> +generic-y += unaligned.h
> diff --git a/arch/arm/include/asm/unaligned.h b/arch/arm/include/asm/unaligned.h
> deleted file mode 100644
> index 44593a8..0000000
> --- a/arch/arm/include/asm/unaligned.h
> +++ /dev/null
> @@ -1,19 +0,0 @@
> -#ifndef _ASM_ARM_UNALIGNED_H
> -#define _ASM_ARM_UNALIGNED_H
> -
> -#include <linux/unaligned/le_byteshift.h>
> -#include <linux/unaligned/be_byteshift.h>
> -#include <linux/unaligned/generic.h>
> -
> -/*
> - * Select endianness
> - */
> -#ifndef __ARMEB__
> -#define get_unaligned __get_unaligned_le
> -#define put_unaligned __put_unaligned_le
> -#else
> -#define get_unaligned __get_unaligned_be
> -#define put_unaligned __put_unaligned_be
> -#endif
> -
> -#endif /* _ASM_ARM_UNALIGNED_H */
> --
> 1.7.9.5
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH v2 3/5] ARM: use generic unaligned.h
2012-10-08 16:43 ` Shawn Guo
@ 2012-10-08 20:34 ` Rob Herring
2012-10-08 23:28 ` Shawn Guo
0 siblings, 1 reply; 46+ messages in thread
From: Rob Herring @ 2012-10-08 20:34 UTC (permalink / raw)
To: linux-arm-kernel
On 10/08/2012 11:43 AM, Shawn Guo wrote:
> This patch has been merged into mainline as commit below.
>
> d25c881 ARM: 7493/1: use generic unaligned.h
>
> It introduces a regression for me. Check out the commit on mainline,
> build a v7 only kernel (imx5/6) with imx_v6_v7_defconfig, the kernel
> halts in decompressor. But v6/v7 kernel (imx3/5/6) works fine. The
> kernel built on the parent commit below works all fine.
It actually fails in the decompressor or that's the last output you get?
I compared the decompressor disassembly of both cases and get the same
number of ldrb/strb instructions, so I don't think it is directly
related to alignment.
I tried the XY decompressor as that is one difference, but that works
fine for me on highbank.
Does it work with an empty uncompress.h functions? That should be the
only difference in our decompressor code.
Rob
> ef1c209 ARM: 7492/1: add strstr declaration for decompressors
>
> Shawn
>
> On Sat, Aug 04, 2012 at 08:23:58PM -0500, Rob Herring wrote:
>> From: Rob Herring <rob.herring@calxeda.com>
>>
>> This moves ARM over to the asm-generic/unaligned.h header. This has the
>> benefit of better code generated especially for ARMv7 on gcc 4.7+
>> compilers.
>>
>> As Arnd Bergmann, points out: The asm-generic version uses the "struct"
>> version for native-endian unaligned access and the "byteshift" version
>> for the opposite endianess. The current ARM version however uses the
>> "byteshift" implementation for both.
>>
>> Thanks to Nicolas Pitre for the excellent analysis:
>>
>> Test case:
>>
>> int foo (int *x) { return get_unaligned(x); }
>> long long bar (long long *x) { return get_unaligned(x); }
>>
>> With the current ARM version:
>>
>> foo:
>> ldrb r3, [r0, #2] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 2B], MEM[(const u8 *)x_1(D) + 2B]
>> ldrb r1, [r0, #1] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 1B], MEM[(const u8 *)x_1(D) + 1B]
>> ldrb r2, [r0, #0] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D)], MEM[(const u8 *)x_1(D)]
>> mov r3, r3, asl #16 @ tmp154, MEM[(const u8 *)x_1(D) + 2B],
>> ldrb r0, [r0, #3] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 3B], MEM[(const u8 *)x_1(D) + 3B]
>> orr r3, r3, r1, asl #8 @, tmp155, tmp154, MEM[(const u8 *)x_1(D) + 1B],
>> orr r3, r3, r2 @ tmp157, tmp155, MEM[(const u8 *)x_1(D)]
>> orr r0, r3, r0, asl #24 @,, tmp157, MEM[(const u8 *)x_1(D) + 3B],
>> bx lr @
>>
>> bar:
>> stmfd sp!, {r4, r5, r6, r7} @,
>> mov r2, #0 @ tmp184,
>> ldrb r5, [r0, #6] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 6B], MEM[(const u8 *)x_1(D) + 6B]
>> ldrb r4, [r0, #5] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 5B], MEM[(const u8 *)x_1(D) + 5B]
>> ldrb ip, [r0, #2] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 2B], MEM[(const u8 *)x_1(D) + 2B]
>> ldrb r1, [r0, #4] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 4B], MEM[(const u8 *)x_1(D) + 4B]
>> mov r5, r5, asl #16 @ tmp175, MEM[(const u8 *)x_1(D) + 6B],
>> ldrb r7, [r0, #1] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 1B], MEM[(const u8 *)x_1(D) + 1B]
>> orr r5, r5, r4, asl #8 @, tmp176, tmp175, MEM[(const u8 *)x_1(D) + 5B],
>> ldrb r6, [r0, #7] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 7B], MEM[(const u8 *)x_1(D) + 7B]
>> orr r5, r5, r1 @ tmp178, tmp176, MEM[(const u8 *)x_1(D) + 4B]
>> ldrb r4, [r0, #0] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D)], MEM[(const u8 *)x_1(D)]
>> mov ip, ip, asl #16 @ tmp188, MEM[(const u8 *)x_1(D) + 2B],
>> ldrb r1, [r0, #3] @ zero_extendqisi2 @ MEM[(const u8 *)x_1(D) + 3B], MEM[(const u8 *)x_1(D) + 3B]
>> orr ip, ip, r7, asl #8 @, tmp189, tmp188, MEM[(const u8 *)x_1(D) + 1B],
>> orr r3, r5, r6, asl #24 @,, tmp178, MEM[(const u8 *)x_1(D) + 7B],
>> orr ip, ip, r4 @ tmp191, tmp189, MEM[(const u8 *)x_1(D)]
>> orr ip, ip, r1, asl #24 @, tmp194, tmp191, MEM[(const u8 *)x_1(D) + 3B],
>> mov r1, r3 @,
>> orr r0, r2, ip @ tmp171, tmp184, tmp194
>> ldmfd sp!, {r4, r5, r6, r7}
>> bx lr
>>
>> In both cases the code is slightly suboptimal. One may wonder why
>> wasting r2 with the constant 0 in the second case for example. And all
>> the mov's could be folded in subsequent orr's, etc.
>>
>> Now with the asm-generic version:
>>
>> foo:
>> ldr r0, [r0, #0] @ unaligned @,* x
>> bx lr @
>>
>> bar:
>> mov r3, r0 @ x, x
>> ldr r0, [r0, #0] @ unaligned @,* x
>> ldr r1, [r3, #4] @ unaligned @,
>> bx lr @
>>
>> This is way better of course, but only because this was compiled for
>> ARMv7. In this case the compiler knows that the hardware can do
>> unaligned word access. This isn't that obvious for foo(), but if we
>> remove the get_unaligned() from bar as follows:
>>
>> long long bar (long long *x) {return *x; }
>>
>> then the resulting code is:
>>
>> bar:
>> ldmia r0, {r0, r1} @ x,,
>> bx lr @
>>
>> So this proves that the presumed aligned vs unaligned cases does have
>> influence on the instructions the compiler may use and that the above
>> unaligned code results are not just an accident.
>>
>> Still... this isn't fully conclusive without at least looking at the
>> resulting assembly fron a pre ARMv6 compilation. Let's see with an
>> ARMv5 target:
>>
>> foo:
>> ldrb r3, [r0, #0] @ zero_extendqisi2 @ tmp139,* x
>> ldrb r1, [r0, #1] @ zero_extendqisi2 @ tmp140,
>> ldrb r2, [r0, #2] @ zero_extendqisi2 @ tmp143,
>> ldrb r0, [r0, #3] @ zero_extendqisi2 @ tmp146,
>> orr r3, r3, r1, asl #8 @, tmp142, tmp139, tmp140,
>> orr r3, r3, r2, asl #16 @, tmp145, tmp142, tmp143,
>> orr r0, r3, r0, asl #24 @,, tmp145, tmp146,
>> bx lr @
>>
>> bar:
>> stmfd sp!, {r4, r5, r6, r7} @,
>> ldrb r2, [r0, #0] @ zero_extendqisi2 @ tmp139,* x
>> ldrb r7, [r0, #1] @ zero_extendqisi2 @ tmp140,
>> ldrb r3, [r0, #4] @ zero_extendqisi2 @ tmp149,
>> ldrb r6, [r0, #5] @ zero_extendqisi2 @ tmp150,
>> ldrb r5, [r0, #2] @ zero_extendqisi2 @ tmp143,
>> ldrb r4, [r0, #6] @ zero_extendqisi2 @ tmp153,
>> ldrb r1, [r0, #7] @ zero_extendqisi2 @ tmp156,
>> ldrb ip, [r0, #3] @ zero_extendqisi2 @ tmp146,
>> orr r2, r2, r7, asl #8 @, tmp142, tmp139, tmp140,
>> orr r3, r3, r6, asl #8 @, tmp152, tmp149, tmp150,
>> orr r2, r2, r5, asl #16 @, tmp145, tmp142, tmp143,
>> orr r3, r3, r4, asl #16 @, tmp155, tmp152, tmp153,
>> orr r0, r2, ip, asl #24 @,, tmp145, tmp146,
>> orr r1, r3, r1, asl #24 @,, tmp155, tmp156,
>> ldmfd sp!, {r4, r5, r6, r7}
>> bx lr
>>
>> Compared to the initial results, this is really nicely optimized and I
>> couldn't do much better if I were to hand code it myself.
>>
>> Signed-off-by: Rob Herring <rob.herring@calxeda.com>
>> Reviewed-by: Nicolas Pitre <nico@linaro.org>
>> ---
>> arch/arm/include/asm/Kbuild | 1 +
>> arch/arm/include/asm/unaligned.h | 19 -------------------
>> 2 files changed, 1 insertion(+), 19 deletions(-)
>> delete mode 100644 arch/arm/include/asm/unaligned.h
>>
>> diff --git a/arch/arm/include/asm/Kbuild b/arch/arm/include/asm/Kbuild
>> index 6cc8a13..a86cabf 100644
>> --- a/arch/arm/include/asm/Kbuild
>> +++ b/arch/arm/include/asm/Kbuild
>> @@ -33,3 +33,4 @@ generic-y += sockios.h
>> generic-y += termbits.h
>> generic-y += timex.h
>> generic-y += types.h
>> +generic-y += unaligned.h
>> diff --git a/arch/arm/include/asm/unaligned.h b/arch/arm/include/asm/unaligned.h
>> deleted file mode 100644
>> index 44593a8..0000000
>> --- a/arch/arm/include/asm/unaligned.h
>> +++ /dev/null
>> @@ -1,19 +0,0 @@
>> -#ifndef _ASM_ARM_UNALIGNED_H
>> -#define _ASM_ARM_UNALIGNED_H
>> -
>> -#include <linux/unaligned/le_byteshift.h>
>> -#include <linux/unaligned/be_byteshift.h>
>> -#include <linux/unaligned/generic.h>
>> -
>> -/*
>> - * Select endianness
>> - */
>> -#ifndef __ARMEB__
>> -#define get_unaligned __get_unaligned_le
>> -#define put_unaligned __put_unaligned_le
>> -#else
>> -#define get_unaligned __get_unaligned_be
>> -#define put_unaligned __put_unaligned_be
>> -#endif
>> -
>> -#endif /* _ASM_ARM_UNALIGNED_H */
>> --
>> 1.7.9.5
>>
>>
>> _______________________________________________
>> linux-arm-kernel mailing list
>> linux-arm-kernel at lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH v2 3/5] ARM: use generic unaligned.h
2012-10-08 20:34 ` Rob Herring
@ 2012-10-08 23:28 ` Shawn Guo
2012-10-09 2:27 ` Rob Herring
0 siblings, 1 reply; 46+ messages in thread
From: Shawn Guo @ 2012-10-08 23:28 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Oct 08, 2012 at 03:34:57PM -0500, Rob Herring wrote:
> On 10/08/2012 11:43 AM, Shawn Guo wrote:
> > This patch has been merged into mainline as commit below.
> >
> > d25c881 ARM: 7493/1: use generic unaligned.h
> >
> > It introduces a regression for me. Check out the commit on mainline,
> > build a v7 only kernel (imx5/6) with imx_v6_v7_defconfig, the kernel
> > halts in decompressor. But v6/v7 kernel (imx3/5/6) works fine. The
> > kernel built on the parent commit below works all fine.
>
> It actually fails in the decompressor or that's the last output you get?
I think it fails in the decompressor, because what I see is
Uncompressing Linux...
not
Uncompressing Linux... done, booting the kernel.
> I compared the decompressor disassembly of both cases and get the same
> number of ldrb/strb instructions, so I don't think it is directly
> related to alignment.
>
> I tried the XY decompressor as that is one difference, but that works
> fine for me on highbank.
>
> Does it work with an empty uncompress.h functions? That should be the
> only difference in our decompressor code.
No, empty putc() in uncompress.h does not help.
New finding is that it only fails with LZO decompressor while the other
3 Gzip, LZMA and XZ all work good. We happen to have LZO as the default
one in imx_v6_v7_defconfig.
Shawn
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH v2 3/5] ARM: use generic unaligned.h
2012-10-08 23:28 ` Shawn Guo
@ 2012-10-09 2:27 ` Rob Herring
2012-10-09 2:45 ` Shawn Guo
2012-10-09 4:01 ` Nicolas Pitre
0 siblings, 2 replies; 46+ messages in thread
From: Rob Herring @ 2012-10-09 2:27 UTC (permalink / raw)
To: linux-arm-kernel
On 10/08/2012 06:28 PM, Shawn Guo wrote:
> On Mon, Oct 08, 2012 at 03:34:57PM -0500, Rob Herring wrote:
>> On 10/08/2012 11:43 AM, Shawn Guo wrote:
>>> This patch has been merged into mainline as commit below.
>>>
>>> d25c881 ARM: 7493/1: use generic unaligned.h
>>>
>>> It introduces a regression for me. Check out the commit on mainline,
>>> build a v7 only kernel (imx5/6) with imx_v6_v7_defconfig, the kernel
>>> halts in decompressor. But v6/v7 kernel (imx3/5/6) works fine. The
>>> kernel built on the parent commit below works all fine.
>>
>> It actually fails in the decompressor or that's the last output you get?
>
> I think it fails in the decompressor, because what I see is
>
> Uncompressing Linux...
>
> not
>
> Uncompressing Linux... done, booting the kernel.
>
>> I compared the decompressor disassembly of both cases and get the same
>> number of ldrb/strb instructions, so I don't think it is directly
>> related to alignment.
>>
>> I tried the XY decompressor as that is one difference, but that works
>> fine for me on highbank.
>>
>> Does it work with an empty uncompress.h functions? That should be the
>> only difference in our decompressor code.
>
> No, empty putc() in uncompress.h does not help.
>
> New finding is that it only fails with LZO decompressor while the other
> 3 Gzip, LZMA and XZ all work good. We happen to have LZO as the default
> one in imx_v6_v7_defconfig.
I must have had an old config with XZ. LZO fails because of this:
lib/decompress_unlzo.c: version = get_unaligned_be16(parse);
lib/decompress_unlzo.c: if (get_unaligned_be32(parse) & HEADER_HAS_FILTER)
lib/decompress_unlzo.c: dst_len = get_unaligned_be32(in_buf);
lib/decompress_unlzo.c: src_len = get_unaligned_be32(in_buf);
This was what I was afraid of. The decompressor runs with the sysctrl
register A bit in whatever state the bootloader left it in. In the case
of u-boot it is set, and the maintainers are pretty set on not allowing
unaligned accesses if you've seen the recent discussion.
This should fix things.
Rob
8<---------------------------------------------------------------------
diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S
index bc67cbf..1f87d22 100644
--- a/arch/arm/boot/compressed/head.S
+++ b/arch/arm/boot/compressed/head.S
@@ -654,6 +654,7 @@ __armv7_mmu_cache_on:
#endif
mrc p15, 0, r0, c1, c0, 0 @ read control reg
bic r0, r0, #1 << 28 @ clear SCTLR.TRE
+ bic r0, r0, #1 << 1 @ clear SCTLR.A
orr r0, r0, #0x5000 @ I-cache enable, RR cache replacement
orr r0, r0, #0x003c @ write buffer
#ifdef CONFIG_MMU
^ permalink raw reply related [flat|nested] 46+ messages in thread
* [PATCH v2 3/5] ARM: use generic unaligned.h
2012-10-09 2:27 ` Rob Herring
@ 2012-10-09 2:45 ` Shawn Guo
2012-10-09 4:01 ` Nicolas Pitre
1 sibling, 0 replies; 46+ messages in thread
From: Shawn Guo @ 2012-10-09 2:45 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Oct 08, 2012 at 09:27:31PM -0500, Rob Herring wrote:
> I must have had an old config with XZ. LZO fails because of this:
>
> lib/decompress_unlzo.c: version = get_unaligned_be16(parse);
> lib/decompress_unlzo.c: if (get_unaligned_be32(parse) & HEADER_HAS_FILTER)
> lib/decompress_unlzo.c: dst_len = get_unaligned_be32(in_buf);
> lib/decompress_unlzo.c: src_len = get_unaligned_be32(in_buf);
>
> This was what I was afraid of. The decompressor runs with the sysctrl
> register A bit in whatever state the bootloader left it in. In the case
> of u-boot it is set, and the maintainers are pretty set on not allowing
> unaligned accesses if you've seen the recent discussion.
>
> This should fix things.
>
> Rob
>
> 8<---------------------------------------------------------------------
>
> diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S
> index bc67cbf..1f87d22 100644
> --- a/arch/arm/boot/compressed/head.S
> +++ b/arch/arm/boot/compressed/head.S
> @@ -654,6 +654,7 @@ __armv7_mmu_cache_on:
> #endif
> mrc p15, 0, r0, c1, c0, 0 @ read control reg
> bic r0, r0, #1 << 28 @ clear SCTLR.TRE
> + bic r0, r0, #1 << 1 @ clear SCTLR.A
> orr r0, r0, #0x5000 @ I-cache enable, RR cache replacement
> orr r0, r0, #0x003c @ write buffer
> #ifdef CONFIG_MMU
Tested-by: Shawn Guo <shawn.guo@linaro.org>
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH v2 3/5] ARM: use generic unaligned.h
2012-10-09 2:27 ` Rob Herring
2012-10-09 2:45 ` Shawn Guo
@ 2012-10-09 4:01 ` Nicolas Pitre
2012-10-10 13:29 ` Rob Herring
1 sibling, 1 reply; 46+ messages in thread
From: Nicolas Pitre @ 2012-10-09 4:01 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, 8 Oct 2012, Rob Herring wrote:
> On 10/08/2012 06:28 PM, Shawn Guo wrote:
> > On Mon, Oct 08, 2012 at 03:34:57PM -0500, Rob Herring wrote:
> >> On 10/08/2012 11:43 AM, Shawn Guo wrote:
> >>> This patch has been merged into mainline as commit below.
> >>>
> >>> d25c881 ARM: 7493/1: use generic unaligned.h
> >>>
> >>> It introduces a regression for me. Check out the commit on mainline,
> >>> build a v7 only kernel (imx5/6) with imx_v6_v7_defconfig, the kernel
> >>> halts in decompressor. But v6/v7 kernel (imx3/5/6) works fine. The
> >>> kernel built on the parent commit below works all fine.
> >>
> >> It actually fails in the decompressor or that's the last output you get?
> >
> > I think it fails in the decompressor, because what I see is
> >
> > Uncompressing Linux...
> >
> > not
> >
> > Uncompressing Linux... done, booting the kernel.
> >
> >> I compared the decompressor disassembly of both cases and get the same
> >> number of ldrb/strb instructions, so I don't think it is directly
> >> related to alignment.
> >>
> >> I tried the XY decompressor as that is one difference, but that works
> >> fine for me on highbank.
> >>
> >> Does it work with an empty uncompress.h functions? That should be the
> >> only difference in our decompressor code.
> >
> > No, empty putc() in uncompress.h does not help.
> >
> > New finding is that it only fails with LZO decompressor while the other
> > 3 Gzip, LZMA and XZ all work good. We happen to have LZO as the default
> > one in imx_v6_v7_defconfig.
>
> I must have had an old config with XZ. LZO fails because of this:
>
> lib/decompress_unlzo.c: version = get_unaligned_be16(parse);
> lib/decompress_unlzo.c: if (get_unaligned_be32(parse) & HEADER_HAS_FILTER)
> lib/decompress_unlzo.c: dst_len = get_unaligned_be32(in_buf);
> lib/decompress_unlzo.c: src_len = get_unaligned_be32(in_buf);
>
> This was what I was afraid of. The decompressor runs with the sysctrl
> register A bit in whatever state the bootloader left it in. In the case
> of u-boot it is set, and the maintainers are pretty set on not allowing
> unaligned accesses if you've seen the recent discussion.
This is not an u-Boot issue. The kernel code expects misaligned
accesses to be handled by the hardware on ARMv6+ now. So it better
enforce proper A bit state itself.
> This should fix things.
>
> Rob
>
> 8<---------------------------------------------------------------------
>
> diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S
> index bc67cbf..1f87d22 100644
> --- a/arch/arm/boot/compressed/head.S
> +++ b/arch/arm/boot/compressed/head.S
> @@ -654,6 +654,7 @@ __armv7_mmu_cache_on:
> #endif
> mrc p15, 0, r0, c1, c0, 0 @ read control reg
> bic r0, r0, #1 << 28 @ clear SCTLR.TRE
> + bic r0, r0, #1 << 1 @ clear SCTLR.A
> orr r0, r0, #0x5000 @ I-cache enable, RR cache replacement
> orr r0, r0, #0x003c @ write buffer
> #ifdef CONFIG_MMU
>
That's OK for ARMv7, but in the ARMv6 case we use __armv4_mmu_cache_on.
So this patch is incomplete.
Nicolas
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH v2 3/5] ARM: use generic unaligned.h
2012-10-09 4:01 ` Nicolas Pitre
@ 2012-10-10 13:29 ` Rob Herring
2012-10-10 13:57 ` Nicolas Pitre
0 siblings, 1 reply; 46+ messages in thread
From: Rob Herring @ 2012-10-10 13:29 UTC (permalink / raw)
To: linux-arm-kernel
On 10/08/2012 11:01 PM, Nicolas Pitre wrote:
> On Mon, 8 Oct 2012, Rob Herring wrote:
>
>> On 10/08/2012 06:28 PM, Shawn Guo wrote:
>>> On Mon, Oct 08, 2012 at 03:34:57PM -0500, Rob Herring wrote:
>>>> On 10/08/2012 11:43 AM, Shawn Guo wrote:
>>>>> This patch has been merged into mainline as commit below.
>>>>>
>>>>> d25c881 ARM: 7493/1: use generic unaligned.h
>>>>>
>>>>> It introduces a regression for me. Check out the commit on mainline,
>>>>> build a v7 only kernel (imx5/6) with imx_v6_v7_defconfig, the kernel
>>>>> halts in decompressor. But v6/v7 kernel (imx3/5/6) works fine. The
>>>>> kernel built on the parent commit below works all fine.
>>>>
>>>> It actually fails in the decompressor or that's the last output you get?
>>>
>>> I think it fails in the decompressor, because what I see is
>>>
>>> Uncompressing Linux...
>>>
>>> not
>>>
>>> Uncompressing Linux... done, booting the kernel.
>>>
>>>> I compared the decompressor disassembly of both cases and get the same
>>>> number of ldrb/strb instructions, so I don't think it is directly
>>>> related to alignment.
>>>>
>>>> I tried the XY decompressor as that is one difference, but that works
>>>> fine for me on highbank.
>>>>
>>>> Does it work with an empty uncompress.h functions? That should be the
>>>> only difference in our decompressor code.
>>>
>>> No, empty putc() in uncompress.h does not help.
>>>
>>> New finding is that it only fails with LZO decompressor while the other
>>> 3 Gzip, LZMA and XZ all work good. We happen to have LZO as the default
>>> one in imx_v6_v7_defconfig.
>>
>> I must have had an old config with XZ. LZO fails because of this:
>>
>> lib/decompress_unlzo.c: version = get_unaligned_be16(parse);
>> lib/decompress_unlzo.c: if (get_unaligned_be32(parse) & HEADER_HAS_FILTER)
>> lib/decompress_unlzo.c: dst_len = get_unaligned_be32(in_buf);
>> lib/decompress_unlzo.c: src_len = get_unaligned_be32(in_buf);
>>
>> This was what I was afraid of. The decompressor runs with the sysctrl
>> register A bit in whatever state the bootloader left it in. In the case
>> of u-boot it is set, and the maintainers are pretty set on not allowing
>> unaligned accesses if you've seen the recent discussion.
>
> This is not an u-Boot issue. The kernel code expects misaligned
> accesses to be handled by the hardware on ARMv6+ now. So it better
> enforce proper A bit state itself.
>
>> This should fix things.
>>
>> Rob
>>
>> 8<---------------------------------------------------------------------
>>
>> diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S
>> index bc67cbf..1f87d22 100644
>> --- a/arch/arm/boot/compressed/head.S
>> +++ b/arch/arm/boot/compressed/head.S
>> @@ -654,6 +654,7 @@ __armv7_mmu_cache_on:
>> #endif
>> mrc p15, 0, r0, c1, c0, 0 @ read control reg
>> bic r0, r0, #1 << 28 @ clear SCTLR.TRE
>> + bic r0, r0, #1 << 1 @ clear SCTLR.A
>> orr r0, r0, #0x5000 @ I-cache enable, RR cache replacement
>> orr r0, r0, #0x003c @ write buffer
>> #ifdef CONFIG_MMU
>>
>
> That's OK for ARMv7, but in the ARMv6 case we use __armv4_mmu_cache_on.
> So this patch is incomplete.
It is still optional on v6 and the compiler will not emit unaligned
accesses for v6 (which is why it worked when v6 platforms were enabled).
That does raise a bigger question. If we're doing combined v6 and v7
builds, do we want this to require unaligned accesses are enabled and if
so, can we force that on in the compiler?
Rob
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH v2 3/5] ARM: use generic unaligned.h
2012-10-10 13:29 ` Rob Herring
@ 2012-10-10 13:57 ` Nicolas Pitre
2012-10-11 12:43 ` [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores Rob Herring
0 siblings, 1 reply; 46+ messages in thread
From: Nicolas Pitre @ 2012-10-10 13:57 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, 10 Oct 2012, Rob Herring wrote:
> On 10/08/2012 11:01 PM, Nicolas Pitre wrote:
> > On Mon, 8 Oct 2012, Rob Herring wrote:
> >
> >> On 10/08/2012 06:28 PM, Shawn Guo wrote:
> >>> On Mon, Oct 08, 2012 at 03:34:57PM -0500, Rob Herring wrote:
> >>>> On 10/08/2012 11:43 AM, Shawn Guo wrote:
> >>>>> This patch has been merged into mainline as commit below.
> >>>>>
> >>>>> d25c881 ARM: 7493/1: use generic unaligned.h
> >>>>>
> >>>>> It introduces a regression for me. Check out the commit on mainline,
> >>>>> build a v7 only kernel (imx5/6) with imx_v6_v7_defconfig, the kernel
> >>>>> halts in decompressor. But v6/v7 kernel (imx3/5/6) works fine. The
> >>>>> kernel built on the parent commit below works all fine.
> >>>>
> >>>> It actually fails in the decompressor or that's the last output you get?
> >>>
> >>> I think it fails in the decompressor, because what I see is
> >>>
> >>> Uncompressing Linux...
> >>>
> >>> not
> >>>
> >>> Uncompressing Linux... done, booting the kernel.
> >>>
> >>>> I compared the decompressor disassembly of both cases and get the same
> >>>> number of ldrb/strb instructions, so I don't think it is directly
> >>>> related to alignment.
> >>>>
> >>>> I tried the XY decompressor as that is one difference, but that works
> >>>> fine for me on highbank.
> >>>>
> >>>> Does it work with an empty uncompress.h functions? That should be the
> >>>> only difference in our decompressor code.
> >>>
> >>> No, empty putc() in uncompress.h does not help.
> >>>
> >>> New finding is that it only fails with LZO decompressor while the other
> >>> 3 Gzip, LZMA and XZ all work good. We happen to have LZO as the default
> >>> one in imx_v6_v7_defconfig.
> >>
> >> I must have had an old config with XZ. LZO fails because of this:
> >>
> >> lib/decompress_unlzo.c: version = get_unaligned_be16(parse);
> >> lib/decompress_unlzo.c: if (get_unaligned_be32(parse) & HEADER_HAS_FILTER)
> >> lib/decompress_unlzo.c: dst_len = get_unaligned_be32(in_buf);
> >> lib/decompress_unlzo.c: src_len = get_unaligned_be32(in_buf);
> >>
> >> This was what I was afraid of. The decompressor runs with the sysctrl
> >> register A bit in whatever state the bootloader left it in. In the case
> >> of u-boot it is set, and the maintainers are pretty set on not allowing
> >> unaligned accesses if you've seen the recent discussion.
> >
> > This is not an u-Boot issue. The kernel code expects misaligned
> > accesses to be handled by the hardware on ARMv6+ now. So it better
> > enforce proper A bit state itself.
> >
> >> This should fix things.
> >>
> >> Rob
> >>
> >> 8<---------------------------------------------------------------------
> >>
> >> diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S
> >> index bc67cbf..1f87d22 100644
> >> --- a/arch/arm/boot/compressed/head.S
> >> +++ b/arch/arm/boot/compressed/head.S
> >> @@ -654,6 +654,7 @@ __armv7_mmu_cache_on:
> >> #endif
> >> mrc p15, 0, r0, c1, c0, 0 @ read control reg
> >> bic r0, r0, #1 << 28 @ clear SCTLR.TRE
> >> + bic r0, r0, #1 << 1 @ clear SCTLR.A
> >> orr r0, r0, #0x5000 @ I-cache enable, RR cache replacement
> >> orr r0, r0, #0x003c @ write buffer
> >> #ifdef CONFIG_MMU
> >>
> >
> > That's OK for ARMv7, but in the ARMv6 case we use __armv4_mmu_cache_on.
> > So this patch is incomplete.
>
> It is still optional on v6 and the compiler will not emit unaligned
> accesses for v6 (which is why it worked when v6 platforms were enabled).
Hmmm, right. However if you look at commit 8428e84d42, the A bit is
cleared whenever we compile for ARMv6 or higher. Of course, the
decompressor doesn't have to deal with unknown user space binaries and
in that case we might trust that the compiler will never emit known to
be unaligned loads.
In that case...
Acked-by: Nicolas Pitre <nico@linaro.org>
> That does raise a bigger question. If we're doing combined v6 and v7
> builds, do we want this to require unaligned accesses are enabled and if
> so, can we force that on in the compiler?
When we compile ARMv6 and ARMv7 targets together, the compiler is told
to compile for ARMv6. We know that unaligned accesses will be done
usingLDRB/STRB in that case.
Nicolas
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-10-10 13:57 ` Nicolas Pitre
@ 2012-10-11 12:43 ` Rob Herring
2012-10-11 13:09 ` Russell King - ARM Linux
2012-10-25 9:34 ` Johannes Stezenbach
0 siblings, 2 replies; 46+ messages in thread
From: Rob Herring @ 2012-10-11 12:43 UTC (permalink / raw)
To: linux-arm-kernel
From: Rob Herring <rob.herring@calxeda.com>
With recent compilers and move to generic unaligned.h in commit d25c881
(ARM: 7493/1: use generic unaligned.h), unaligned accesses will be used
by the LZO decompressor on v7 cores. So we need to make sure unaligned
accesses are allowed by clearing the SCTLR A bit.
While v6 can support unaligned accesses, it is optional and current
compilers won't emit unaligned accesses. So we don't clear the A bit for
v6.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
---
The contents of this were already reviewed on this thread, so I sent this
to the patch system and this was Russell's reply:
> NAK for two reasons.
>
> 1. It hasn't been on the list (I can't find a match for "clear SCTLR.A"
> in my mailbox)
>
> 2. The behaviour of unaligned accesses vary depending on CPU. Some
> fix-up the access, others load the word and then rotate it. If we have
> decompressors which perform unaligned accesses, we need to fix this
> properly to avoid the CPU specific behaviour, rather than tweaking
> control bits to hide the problem.
I'm simply matching the behavior of the kernel itself. The A bit is cleared
for v7 kernels and compilers only generate unaligned accesses for v7.
Without this the initial state of the A bit is undefined as a bootloader
could have cleared it already. We should document the required state or set
it to what we want.
Rob
arch/arm/boot/compressed/head.S | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S
index bc67cbf..b2e30b8 100644
--- a/arch/arm/boot/compressed/head.S
+++ b/arch/arm/boot/compressed/head.S
@@ -654,6 +654,7 @@ __armv7_mmu_cache_on:
#endif
mrc p15, 0, r0, c1, c0, 0 @ read control reg
bic r0, r0, #1 << 28 @ clear SCTLR.TRE
+ bic r0, r0, #1 << 1 @ clear SCTLR.A
orr r0, r0, #0x5000 @ I-cache enable, RR cache replacement
orr r0, r0, #0x003c @ write buffer
#ifdef CONFIG_MMU
--
1.7.9.5
^ permalink raw reply related [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-10-11 12:43 ` [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores Rob Herring
@ 2012-10-11 13:09 ` Russell King - ARM Linux
2012-10-11 13:31 ` Rob Herring
2012-10-25 9:34 ` Johannes Stezenbach
1 sibling, 1 reply; 46+ messages in thread
From: Russell King - ARM Linux @ 2012-10-11 13:09 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
> The contents of this were already reviewed on this thread, so I sent this
> to the patch system and this was Russell's reply:
So that's why I couldn't find it - the mailing list thread has a different
subject line to the patch. Don't do that. Given the amount of list
traffic we have today, that's as good as not having been posted at all.
> > NAK for two reasons.
> >
> > 1. It hasn't been on the list (I can't find a match for "clear SCTLR.A"
> > in my mailbox)
> >
> > 2. The behaviour of unaligned accesses vary depending on CPU. Some
> > fix-up the access, others load the word and then rotate it. If we have
> > decompressors which perform unaligned accesses, we need to fix this
> > properly to avoid the CPU specific behaviour, rather than tweaking
> > control bits to hide the problem.
>
> I'm simply matching the behavior of the kernel itself. The A bit is cleared
> for v7 kernels and compilers only generate unaligned accesses for v7.
> Without this the initial state of the A bit is undefined as a bootloader
> could have cleared it already. We should document the required state or set
> it to what we want.
Irrespective of this, (2) still stands. Unaligned accesses in the
decompressor without a fixup (which will be very hard to provide)
will return different data depending on the CPU as I mention in point
2.
So, using unaligned accesses in the decompressor may not give expected
results in all cases, so they're best avoided.
And if they're avoided, then we don't care about the setting of the A
bit.
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-10-11 13:09 ` Russell King - ARM Linux
@ 2012-10-11 13:31 ` Rob Herring
2012-10-11 13:41 ` Russell King - ARM Linux
2012-10-11 13:59 ` Russell King - ARM Linux
0 siblings, 2 replies; 46+ messages in thread
From: Rob Herring @ 2012-10-11 13:31 UTC (permalink / raw)
To: linux-arm-kernel
On 10/11/2012 08:09 AM, Russell King - ARM Linux wrote:
> On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
>> The contents of this were already reviewed on this thread, so I sent this
>> to the patch system and this was Russell's reply:
>
> So that's why I couldn't find it - the mailing list thread has a different
> subject line to the patch. Don't do that. Given the amount of list
> traffic we have today, that's as good as not having been posted at all.
>
>>> NAK for two reasons.
>>>
>>> 1. It hasn't been on the list (I can't find a match for "clear SCTLR.A"
>>> in my mailbox)
>>>
>>> 2. The behaviour of unaligned accesses vary depending on CPU. Some
>>> fix-up the access, others load the word and then rotate it. If we have
>>> decompressors which perform unaligned accesses, we need to fix this
>>> properly to avoid the CPU specific behaviour, rather than tweaking
>>> control bits to hide the problem.
>>
>> I'm simply matching the behavior of the kernel itself. The A bit is cleared
>> for v7 kernels and compilers only generate unaligned accesses for v7.
>> Without this the initial state of the A bit is undefined as a bootloader
>> could have cleared it already. We should document the required state or set
>> it to what we want.
>
> Irrespective of this, (2) still stands. Unaligned accesses in the
> decompressor without a fixup (which will be very hard to provide)
> will return different data depending on the CPU as I mention in point
> 2.
This only affects v7 cores. It should not vary for v7 cores as unaligned
access is a required feature. So how is it going to vary on v7 CPUs?
We've got bigger problems if there are v7 cores that don't handle
unaligned accesses.
Rob
> So, using unaligned accesses in the decompressor may not give expected
> results in all cases, so they're best avoided.
>
> And if they're avoided, then we don't care about the setting of the A
> bit.
>
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-10-11 13:31 ` Rob Herring
@ 2012-10-11 13:41 ` Russell King - ARM Linux
2012-10-11 15:44 ` Nicolas Pitre
2012-10-11 13:59 ` Russell King - ARM Linux
1 sibling, 1 reply; 46+ messages in thread
From: Russell King - ARM Linux @ 2012-10-11 13:41 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Oct 11, 2012 at 08:31:47AM -0500, Rob Herring wrote:
> On 10/11/2012 08:09 AM, Russell King - ARM Linux wrote:
> > On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
> >> The contents of this were already reviewed on this thread, so I sent this
> >> to the patch system and this was Russell's reply:
> >
> > So that's why I couldn't find it - the mailing list thread has a different
> > subject line to the patch. Don't do that. Given the amount of list
> > traffic we have today, that's as good as not having been posted at all.
> >
> >>> NAK for two reasons.
> >>>
> >>> 1. It hasn't been on the list (I can't find a match for "clear SCTLR.A"
> >>> in my mailbox)
> >>>
> >>> 2. The behaviour of unaligned accesses vary depending on CPU. Some
> >>> fix-up the access, others load the word and then rotate it. If we have
> >>> decompressors which perform unaligned accesses, we need to fix this
> >>> properly to avoid the CPU specific behaviour, rather than tweaking
> >>> control bits to hide the problem.
> >>
> >> I'm simply matching the behavior of the kernel itself. The A bit is cleared
> >> for v7 kernels and compilers only generate unaligned accesses for v7.
> >> Without this the initial state of the A bit is undefined as a bootloader
> >> could have cleared it already. We should document the required state or set
> >> it to what we want.
> >
> > Irrespective of this, (2) still stands. Unaligned accesses in the
> > decompressor without a fixup (which will be very hard to provide)
> > will return different data depending on the CPU as I mention in point
> > 2.
>
> This only affects v7 cores. It should not vary for v7 cores as unaligned
> access is a required feature. So how is it going to vary on v7 CPUs?
> We've got bigger problems if there are v7 cores that don't handle
> unaligned accesses.
Rob,
Your patch may only affect v7 cores, but you've raised the issue of the
decompressor performing unaligned accesses in general. Shall I re-repeat
my point over that or is the problem here going to finally sink in?
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-10-11 13:31 ` Rob Herring
2012-10-11 13:41 ` Russell King - ARM Linux
@ 2012-10-11 13:59 ` Russell King - ARM Linux
2012-10-11 15:58 ` Nicolas Pitre
1 sibling, 1 reply; 46+ messages in thread
From: Russell King - ARM Linux @ 2012-10-11 13:59 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Oct 11, 2012 at 08:31:47AM -0500, Rob Herring wrote:
> This only affects v7 cores. It should not vary for v7 cores as unaligned
> access is a required feature. So how is it going to vary on v7 CPUs?
> We've got bigger problems if there are v7 cores that don't handle
> unaligned accesses.
Oh, and this gives me a third reason to NAK this patch. Why only ensure
that the A bit is clear for v7 CPUs? Why not v6, v5, v4 too? Why does
ARMv7 get this special treatment?
Any argument you can make for clearing the bit on ARMv7 also applies to
the other architectures too, but not even that negates my point (2) which
is far more fundamental.
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-10-11 13:41 ` Russell King - ARM Linux
@ 2012-10-11 15:44 ` Nicolas Pitre
2012-10-23 20:32 ` Gregory CLEMENT
0 siblings, 1 reply; 46+ messages in thread
From: Nicolas Pitre @ 2012-10-11 15:44 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, 11 Oct 2012, Russell King - ARM Linux wrote:
> On Thu, Oct 11, 2012 at 08:31:47AM -0500, Rob Herring wrote:
> > On 10/11/2012 08:09 AM, Russell King - ARM Linux wrote:
> > > On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
> > >> The contents of this were already reviewed on this thread, so I sent this
> > >> to the patch system and this was Russell's reply:
> > >
> > > So that's why I couldn't find it - the mailing list thread has a different
> > > subject line to the patch. Don't do that. Given the amount of list
> > > traffic we have today, that's as good as not having been posted at all.
> > >
> > >>> NAK for two reasons.
> > >>>
> > >>> 1. It hasn't been on the list (I can't find a match for "clear SCTLR.A"
> > >>> in my mailbox)
> > >>>
> > >>> 2. The behaviour of unaligned accesses vary depending on CPU. Some
> > >>> fix-up the access, others load the word and then rotate it. If we have
> > >>> decompressors which perform unaligned accesses, we need to fix this
> > >>> properly to avoid the CPU specific behaviour, rather than tweaking
> > >>> control bits to hide the problem.
> > >>
> > >> I'm simply matching the behavior of the kernel itself. The A bit is cleared
> > >> for v7 kernels and compilers only generate unaligned accesses for v7.
> > >> Without this the initial state of the A bit is undefined as a bootloader
> > >> could have cleared it already. We should document the required state or set
> > >> it to what we want.
> > >
> > > Irrespective of this, (2) still stands. Unaligned accesses in the
> > > decompressor without a fixup (which will be very hard to provide)
> > > will return different data depending on the CPU as I mention in point
> > > 2.
> >
> > This only affects v7 cores. It should not vary for v7 cores as unaligned
> > access is a required feature. So how is it going to vary on v7 CPUs?
> > We've got bigger problems if there are v7 cores that don't handle
> > unaligned accesses.
>
> Rob,
>
> Your patch may only affect v7 cores, but you've raised the issue of the
> decompressor performing unaligned accesses in general. Shall I re-repeat
> my point over that or is the problem here going to finally sink in?
The decompressor is not performing direct unaligned accesses. It uses
the get_unaligned() and put_unaligned() accessors. That means that
we're in control of how this is happening.
So let's talk about the how. On pre ARMv7, those accesses are performed
with a series of byte accesses. When compiling for ARMv7, gcc knows and
that the hardware can do unaligned accesses, and it does optimize its
output by using ldr/str instructions. But the A bit has to be cleared
in that case, and only in that case. This is why the patch clears the A
bit only for ARMv7.
So this patch is only setting up the hardware to match gcc's
expectations when generating code from the use of get_unaligned() and
put_unaligned() when optimizing for ARMv7.
As always, any code doing unaligned access and _not_ using those
accessors is broken.
Nicolas
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-10-11 13:59 ` Russell King - ARM Linux
@ 2012-10-11 15:58 ` Nicolas Pitre
0 siblings, 0 replies; 46+ messages in thread
From: Nicolas Pitre @ 2012-10-11 15:58 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, 11 Oct 2012, Russell King - ARM Linux wrote:
> On Thu, Oct 11, 2012 at 08:31:47AM -0500, Rob Herring wrote:
> > This only affects v7 cores. It should not vary for v7 cores as unaligned
> > access is a required feature. So how is it going to vary on v7 CPUs?
> > We've got bigger problems if there are v7 cores that don't handle
> > unaligned accesses.
>
> Oh, and this gives me a third reason to NAK this patch. Why only ensure
> that the A bit is clear for v7 CPUs? Why not v6, v5, v4 too? Why does
> ARMv7 get this special treatment?
As I said, gcc knows that ARMv7 can perform word sized accesses even
with misaligned pointers. So when it is passed a pointer marked with
the packed attribute, it will generate a series of byte accesses when
compiling for anything but ARMv7, and use a single ldr or str when
compiling for ARMv7.
Nicolas
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-10-11 15:44 ` Nicolas Pitre
@ 2012-10-23 20:32 ` Gregory CLEMENT
2012-10-25 12:07 ` Russell King - ARM Linux
0 siblings, 1 reply; 46+ messages in thread
From: Gregory CLEMENT @ 2012-10-23 20:32 UTC (permalink / raw)
To: linux-arm-kernel
On 10/11/2012 05:44 PM, Nicolas Pitre wrote:
> On Thu, 11 Oct 2012, Russell King - ARM Linux wrote:
>
>> On Thu, Oct 11, 2012 at 08:31:47AM -0500, Rob Herring wrote:
>>> On 10/11/2012 08:09 AM, Russell King - ARM Linux wrote:
>>>> On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
>>>>> The contents of this were already reviewed on this thread, so I sent this
>>>>> to the patch system and this was Russell's reply:
>>>>
>>>> So that's why I couldn't find it - the mailing list thread has a different
>>>> subject line to the patch. Don't do that. Given the amount of list
>>>> traffic we have today, that's as good as not having been posted at all.
>>>>
>>>>>> NAK for two reasons.
>>>>>>
>>>>>> 1. It hasn't been on the list (I can't find a match for "clear SCTLR.A"
>>>>>> in my mailbox)
>>>>>>
>>>>>> 2. The behaviour of unaligned accesses vary depending on CPU. Some
>>>>>> fix-up the access, others load the word and then rotate it. If we have
>>>>>> decompressors which perform unaligned accesses, we need to fix this
>>>>>> properly to avoid the CPU specific behaviour, rather than tweaking
>>>>>> control bits to hide the problem.
>>>>>
>>>>> I'm simply matching the behavior of the kernel itself. The A bit is cleared
>>>>> for v7 kernels and compilers only generate unaligned accesses for v7.
>>>>> Without this the initial state of the A bit is undefined as a bootloader
>>>>> could have cleared it already. We should document the required state or set
>>>>> it to what we want.
>>>>
>>>> Irrespective of this, (2) still stands. Unaligned accesses in the
>>>> decompressor without a fixup (which will be very hard to provide)
>>>> will return different data depending on the CPU as I mention in point
>>>> 2.
>>>
>>> This only affects v7 cores. It should not vary for v7 cores as unaligned
>>> access is a required feature. So how is it going to vary on v7 CPUs?
>>> We've got bigger problems if there are v7 cores that don't handle
>>> unaligned accesses.
>>
>> Rob,
>>
>> Your patch may only affect v7 cores, but you've raised the issue of the
>> decompressor performing unaligned accesses in general. Shall I re-repeat
>> my point over that or is the problem here going to finally sink in?
>
> The decompressor is not performing direct unaligned accesses. It uses
> the get_unaligned() and put_unaligned() accessors. That means that
> we're in control of how this is happening.
>
> So let's talk about the how. On pre ARMv7, those accesses are performed
> with a series of byte accesses. When compiling for ARMv7, gcc knows and
> that the hardware can do unaligned accesses, and it does optimize its
> output by using ldr/str instructions. But the A bit has to be cleared
> in that case, and only in that case. This is why the patch clears the A
> bit only for ARMv7.
>
> So this patch is only setting up the hardware to match gcc's
> expectations when generating code from the use of get_unaligned() and
> put_unaligned() when optimizing for ARMv7.
>
> As always, any code doing unaligned access and _not_ using those
> accessors is broken.
So is there a chance that this patch will be applied for 3.7?
Currently I can't boot anymore Armada XP or Armada 370, if kernel is
compressed in LZO. It's annoying.
Russell, did Nicolas manage to convince you?
If not, what should be the solution to fix this issue?
Thanks,
Gregory
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-10-11 12:43 ` [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores Rob Herring
2012-10-11 13:09 ` Russell King - ARM Linux
@ 2012-10-25 9:34 ` Johannes Stezenbach
2012-10-25 12:41 ` Rob Herring
1 sibling, 1 reply; 46+ messages in thread
From: Johannes Stezenbach @ 2012-10-25 9:34 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
>
> With recent compilers and move to generic unaligned.h in commit d25c881
> (ARM: 7493/1: use generic unaligned.h), unaligned accesses will be used
> by the LZO decompressor on v7 cores. So we need to make sure unaligned
> accesses are allowed by clearing the SCTLR A bit.
I just read this in http://gcc.gnu.org/gcc-4.7/changes.html:
On ARM, when compiling for ARMv6 (but not ARMv6-M), ARMv7-A, ARMv7-R, or
ARMv7-M, the new option -munaligned-access is active by default, which for
some source codes generates code that accesses memory on unaligned addresses.
This will require the kernel of those systems to enable such accesses
(controlled by CP15 register c1, refer to ARM documentation). Alternatively
or for compatibility with kernels where unaligned accesses are not supported,
all code has to be compiled with -mno-unaligned-access. Linux/ARM in official
releases has automatically and unconditionally supported unaligned accesses
as emitted by GCC due to this option being active, since Linux version 2.6.28.
My understanding is that gcc, using the same generic unaligned.h
source code, will generate code for ARMv6 and ARMv7 that uses
unaligned access, while for ARMv5 and older it won't. So it seems
gcc requires Linux to clear SCTLR.A and set SCTLR.U for ARMv6+.
Or add -mno-unaligned-access.
Is my understanding correct?
> While v6 can support unaligned accesses, it is optional and current
> compilers won't emit unaligned accesses. So we don't clear the A bit for
> v6.
not true according to the gcc changes page
Thanks
Johannes
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-10-23 20:32 ` Gregory CLEMENT
@ 2012-10-25 12:07 ` Russell King - ARM Linux
0 siblings, 0 replies; 46+ messages in thread
From: Russell King - ARM Linux @ 2012-10-25 12:07 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, Oct 23, 2012 at 10:32:44PM +0200, Gregory CLEMENT wrote:
> So is there a chance that this patch will be applied for 3.7?
>
> Currently I can't boot anymore Armada XP or Armada 370, if kernel is
> compressed in LZO. It's annoying.
>
> Russell, did Nicolas manage to convince you?
Frankly, no - less so now that we have a question mark over ARMv6 which
seems to conflict with Nicolas' justification. This issue is larger than
just ARMv7...
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-10-25 9:34 ` Johannes Stezenbach
@ 2012-10-25 12:41 ` Rob Herring
2012-10-25 13:30 ` Arnd Bergmann
2012-10-25 14:16 ` Johannes Stezenbach
0 siblings, 2 replies; 46+ messages in thread
From: Rob Herring @ 2012-10-25 12:41 UTC (permalink / raw)
To: linux-arm-kernel
On 10/25/2012 04:34 AM, Johannes Stezenbach wrote:
> On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
>>
>> With recent compilers and move to generic unaligned.h in commit d25c881
>> (ARM: 7493/1: use generic unaligned.h), unaligned accesses will be used
>> by the LZO decompressor on v7 cores. So we need to make sure unaligned
>> accesses are allowed by clearing the SCTLR A bit.
>
> I just read this in http://gcc.gnu.org/gcc-4.7/changes.html:
>
> On ARM, when compiling for ARMv6 (but not ARMv6-M), ARMv7-A, ARMv7-R, or
> ARMv7-M, the new option -munaligned-access is active by default, which for
> some source codes generates code that accesses memory on unaligned addresses.
> This will require the kernel of those systems to enable such accesses
> (controlled by CP15 register c1, refer to ARM documentation). Alternatively
> or for compatibility with kernels where unaligned accesses are not supported,
> all code has to be compiled with -mno-unaligned-access. Linux/ARM in official
> releases has automatically and unconditionally supported unaligned accesses
> as emitted by GCC due to this option being active, since Linux version 2.6.28.
I don't think there is such a thing as ARMv6-M.
> My understanding is that gcc, using the same generic unaligned.h
> source code, will generate code for ARMv6 and ARMv7 that uses
> unaligned access, while for ARMv5 and older it won't. So it seems
> gcc requires Linux to clear SCTLR.A and set SCTLR.U for ARMv6+.
>
> Or add -mno-unaligned-access.
>
> Is my understanding correct?
>
>> While v6 can support unaligned accesses, it is optional and current
>> compilers won't emit unaligned accesses. So we don't clear the A bit for
>> v6.
>
> not true according to the gcc changes page
What are you going to believe: documentation or what the compiler
emitted? At least for ubuntu/linaro 4.6.3 which has the unaligned access
support backported and 4.7.2, unaligned accesses are emitted for v7
only. I guess default here means it is the default unless you change the
default in your build of gcc.
If we are going to do combined v6k and v7 kernels, then it would be nice
if we could get the compiler to emit unaligned accesses (assuming we
agree we can require that for v6).
Rob
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-10-25 12:41 ` Rob Herring
@ 2012-10-25 13:30 ` Arnd Bergmann
2012-10-25 14:16 ` Johannes Stezenbach
1 sibling, 0 replies; 46+ messages in thread
From: Arnd Bergmann @ 2012-10-25 13:30 UTC (permalink / raw)
To: linux-arm-kernel
On Thursday 25 October 2012, Rob Herring wrote:
> > On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
> >>
> >> With recent compilers and move to generic unaligned.h in commit d25c881
> >> (ARM: 7493/1: use generic unaligned.h), unaligned accesses will be used
> >> by the LZO decompressor on v7 cores. So we need to make sure unaligned
> >> accesses are allowed by clearing the SCTLR A bit.
> >
> > I just read this in http://gcc.gnu.org/gcc-4.7/changes.html:
> >
> > On ARM, when compiling for ARMv6 (but not ARMv6-M), ARMv7-A, ARMv7-R, or
> > ARMv7-M, the new option -munaligned-access is active by default, which for
> > some source codes generates code that accesses memory on unaligned addresses.
> > This will require the kernel of those systems to enable such accesses
> > (controlled by CP15 register c1, refer to ARM documentation). Alternatively
> > or for compatibility with kernels where unaligned accesses are not supported,
> > all code has to be compiled with -mno-unaligned-access. Linux/ARM in official
> > releases has automatically and unconditionally supported unaligned accesses
> > as emitted by GCC due to this option being active, since Linux version 2.6.28.
>
> I don't think there is such a thing as ARMv6-M.
There is, that would be the Cortex-M0/M0+/M1. You cannot run Linux on these.
Arnd
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-10-25 12:41 ` Rob Herring
2012-10-25 13:30 ` Arnd Bergmann
@ 2012-10-25 14:16 ` Johannes Stezenbach
2012-10-25 14:25 ` Rob Herring
1 sibling, 1 reply; 46+ messages in thread
From: Johannes Stezenbach @ 2012-10-25 14:16 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Oct 25, 2012 at 07:41:45AM -0500, Rob Herring wrote:
> On 10/25/2012 04:34 AM, Johannes Stezenbach wrote:
> > On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
> >
> >> While v6 can support unaligned accesses, it is optional and current
> >> compilers won't emit unaligned accesses. So we don't clear the A bit for
> >> v6.
> >
> > not true according to the gcc changes page
>
> What are you going to believe: documentation or what the compiler
> emitted? At least for ubuntu/linaro 4.6.3 which has the unaligned access
> support backported and 4.7.2, unaligned accesses are emitted for v7
> only. I guess default here means it is the default unless you change the
> default in your build of gcc.
Since ARMv6 can handle unaligned access in the same way as ARMv7
it seems a clear bug in gcc which might hopefully get fixed.
Thus in this case I think it is reasonable to follow the
gcc documentation, otherwise the code would break for ARMv6
when gcc gets fixed.
Johannes
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-10-25 14:16 ` Johannes Stezenbach
@ 2012-10-25 14:25 ` Rob Herring
2012-10-25 15:02 ` Nicolas Pitre
2012-10-25 15:08 ` Johannes Stezenbach
0 siblings, 2 replies; 46+ messages in thread
From: Rob Herring @ 2012-10-25 14:25 UTC (permalink / raw)
To: linux-arm-kernel
On 10/25/2012 09:16 AM, Johannes Stezenbach wrote:
> On Thu, Oct 25, 2012 at 07:41:45AM -0500, Rob Herring wrote:
>> On 10/25/2012 04:34 AM, Johannes Stezenbach wrote:
>>> On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
>>>
>>>> While v6 can support unaligned accesses, it is optional and current
>>>> compilers won't emit unaligned accesses. So we don't clear the A bit for
>>>> v6.
>>>
>>> not true according to the gcc changes page
>>
>> What are you going to believe: documentation or what the compiler
>> emitted? At least for ubuntu/linaro 4.6.3 which has the unaligned access
>> support backported and 4.7.2, unaligned accesses are emitted for v7
>> only. I guess default here means it is the default unless you change the
>> default in your build of gcc.
>
> Since ARMv6 can handle unaligned access in the same way as ARMv7
> it seems a clear bug in gcc which might hopefully get fixed.
> Thus in this case I think it is reasonable to follow the
> gcc documentation, otherwise the code would break for ARMv6
> when gcc gets fixed.
But the compiler can't assume the state of the U bit. I think it is
still legal on v6 to not support unaligned accesses, but on v7 it is
required. All the standard v6 ARM cores support it, but I'm not sure
about custom cores or if there are SOCs with buses that don't support
unaligned accesses properly.
Rob
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-10-25 14:25 ` Rob Herring
@ 2012-10-25 15:02 ` Nicolas Pitre
2012-10-25 15:08 ` Johannes Stezenbach
1 sibling, 0 replies; 46+ messages in thread
From: Nicolas Pitre @ 2012-10-25 15:02 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, 25 Oct 2012, Rob Herring wrote:
> On 10/25/2012 09:16 AM, Johannes Stezenbach wrote:
> > On Thu, Oct 25, 2012 at 07:41:45AM -0500, Rob Herring wrote:
> >> On 10/25/2012 04:34 AM, Johannes Stezenbach wrote:
> >>> On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
> >>>
> >>>> While v6 can support unaligned accesses, it is optional and current
> >>>> compilers won't emit unaligned accesses. So we don't clear the A bit for
> >>>> v6.
> >>>
> >>> not true according to the gcc changes page
> >>
> >> What are you going to believe: documentation or what the compiler
> >> emitted? At least for ubuntu/linaro 4.6.3 which has the unaligned access
> >> support backported and 4.7.2, unaligned accesses are emitted for v7
> >> only. I guess default here means it is the default unless you change the
> >> default in your build of gcc.
> >
> > Since ARMv6 can handle unaligned access in the same way as ARMv7
> > it seems a clear bug in gcc which might hopefully get fixed.
> > Thus in this case I think it is reasonable to follow the
> > gcc documentation, otherwise the code would break for ARMv6
> > when gcc gets fixed.
>
> But the compiler can't assume the state of the U bit. I think it is
> still legal on v6 to not support unaligned accesses, but on v7 it is
> required. All the standard v6 ARM cores support it, but I'm not sure
> about custom cores or if there are SOCs with buses that don't support
> unaligned accesses properly.
The fact is that gcc assumes the A bit is cleared when generating code
for ARMv7, period. We need to conform our environment to gcc
expectations, or always compile the decompressor using a lower
architecture level than ARMv7. My preference is the former.
Nicolas
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-10-25 14:25 ` Rob Herring
2012-10-25 15:02 ` Nicolas Pitre
@ 2012-10-25 15:08 ` Johannes Stezenbach
2012-11-05 10:48 ` Dave Martin
1 sibling, 1 reply; 46+ messages in thread
From: Johannes Stezenbach @ 2012-10-25 15:08 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Oct 25, 2012 at 09:25:06AM -0500, Rob Herring wrote:
> On 10/25/2012 09:16 AM, Johannes Stezenbach wrote:
> > On Thu, Oct 25, 2012 at 07:41:45AM -0500, Rob Herring wrote:
> >> On 10/25/2012 04:34 AM, Johannes Stezenbach wrote:
> >>> On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
> >>>
> >>>> While v6 can support unaligned accesses, it is optional and current
> >>>> compilers won't emit unaligned accesses. So we don't clear the A bit for
> >>>> v6.
> >>>
> >>> not true according to the gcc changes page
> >>
> >> What are you going to believe: documentation or what the compiler
> >> emitted? At least for ubuntu/linaro 4.6.3 which has the unaligned access
> >> support backported and 4.7.2, unaligned accesses are emitted for v7
> >> only. I guess default here means it is the default unless you change the
> >> default in your build of gcc.
> >
> > Since ARMv6 can handle unaligned access in the same way as ARMv7
> > it seems a clear bug in gcc which might hopefully get fixed.
> > Thus in this case I think it is reasonable to follow the
> > gcc documentation, otherwise the code would break for ARMv6
> > when gcc gets fixed.
>
> But the compiler can't assume the state of the U bit. I think it is
> still legal on v6 to not support unaligned accesses, but on v7 it is
> required. All the standard v6 ARM cores support it, but I'm not sure
> about custom cores or if there are SOCs with buses that don't support
> unaligned accesses properly.
Well, I read the "...since Linux version 2.6.28" comment
in the gcc changes page in the way that they assume the
U-bit is set. (Although I'm not sure it really is???)
Looking at it from another side, if using the hw unaligned
access capability gives a performance benefit then it
would be useful to support it even if gcc doesn't
behave as documented. One could still use a special
unaligned.h for ARMv6 to get the performance benefit.
(If it doesn't give performance benfit, then why
bother, let's just use -mno-unaligned-access for v7, too.)
In the ARMv6 ARM unaligned support and the U-bit is
not optional, so you could use the same SoC bus argument
for some hypothetical v7 SoCs.
Johannes
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-10-25 15:08 ` Johannes Stezenbach
@ 2012-11-05 10:48 ` Dave Martin
2012-11-05 11:13 ` Russell King - ARM Linux
0 siblings, 1 reply; 46+ messages in thread
From: Dave Martin @ 2012-11-05 10:48 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Oct 25, 2012 at 05:08:16PM +0200, Johannes Stezenbach wrote:
> On Thu, Oct 25, 2012 at 09:25:06AM -0500, Rob Herring wrote:
> > On 10/25/2012 09:16 AM, Johannes Stezenbach wrote:
> > > On Thu, Oct 25, 2012 at 07:41:45AM -0500, Rob Herring wrote:
> > >> On 10/25/2012 04:34 AM, Johannes Stezenbach wrote:
> > >>> On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
> > >>>
> > >>>> While v6 can support unaligned accesses, it is optional and current
> > >>>> compilers won't emit unaligned accesses. So we don't clear the A bit for
> > >>>> v6.
> > >>>
> > >>> not true according to the gcc changes page
> > >>
> > >> What are you going to believe: documentation or what the compiler
> > >> emitted? At least for ubuntu/linaro 4.6.3 which has the unaligned access
> > >> support backported and 4.7.2, unaligned accesses are emitted for v7
> > >> only. I guess default here means it is the default unless you change the
> > >> default in your build of gcc.
> > >
> > > Since ARMv6 can handle unaligned access in the same way as ARMv7
> > > it seems a clear bug in gcc which might hopefully get fixed.
> > > Thus in this case I think it is reasonable to follow the
> > > gcc documentation, otherwise the code would break for ARMv6
> > > when gcc gets fixed.
> >
> > But the compiler can't assume the state of the U bit. I think it is
> > still legal on v6 to not support unaligned accesses, but on v7 it is
> > required. All the standard v6 ARM cores support it, but I'm not sure
> > about custom cores or if there are SOCs with buses that don't support
> > unaligned accesses properly.
>
> Well, I read the "...since Linux version 2.6.28" comment
> in the gcc changes page in the way that they assume the
> U-bit is set. (Although I'm not sure it really is???)
Actually, the kernel checks the arch version and the U bit on boot,
and chooses the appropriate setting for the A bit depending on the
result. (See arch/arm/mm/alignment.c:alignment_init().)
Currently, we depend on the CPU reset behaviour or firmware/
bootloader to set the U bit for v6, but the behaviour should be
correct either way, though unaligned accesses will obviously
perform (much) better with U=1.
>From the compiler's point of view we have always implemented the
U=1 behaviour, but it has to be done via the alignment fault
handler prior to v6 or with U=0.
> Looking at it from another side, if using the hw unaligned
> access capability gives a performance benefit then it
> would be useful to support it even if gcc doesn't
> behave as documented. One could still use a special
> unaligned.h for ARMv6 to get the performance benefit.
> (If it doesn't give performance benfit, then why
> bother, let's just use -mno-unaligned-access for v7, too.)
For v7, we should definitely use -munaligned-access where available
(unless it's the default?)
For v6, the question is whether there is any legitimate reason
ever to run the kernel with U=0. If not, could we explicitly
set it early and build with -munaligned-access where the compiler
supports this?
The only counterargument I can think of is that some people might
be running some ancient userspace which actually relies on the U=0
behaviour. I don't know whether anyone is actually doing that on
v6, though.
Cheers
---Dave
> In the ARMv6 ARM unaligned support and the U-bit is
> not optional, so you could use the same SoC bus argument
> for some hypothetical v7 SoCs.
>
>
> Johannes
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-11-05 10:48 ` Dave Martin
@ 2012-11-05 11:13 ` Russell King - ARM Linux
2012-11-05 13:02 ` Dave Martin
2012-11-05 13:48 ` Rob Herring
0 siblings, 2 replies; 46+ messages in thread
From: Russell King - ARM Linux @ 2012-11-05 11:13 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Nov 05, 2012 at 10:48:50AM +0000, Dave Martin wrote:
> On Thu, Oct 25, 2012 at 05:08:16PM +0200, Johannes Stezenbach wrote:
> > On Thu, Oct 25, 2012 at 09:25:06AM -0500, Rob Herring wrote:
> > > On 10/25/2012 09:16 AM, Johannes Stezenbach wrote:
> > > > On Thu, Oct 25, 2012 at 07:41:45AM -0500, Rob Herring wrote:
> > > >> On 10/25/2012 04:34 AM, Johannes Stezenbach wrote:
> > > >>> On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
> > > >>>
> > > >>>> While v6 can support unaligned accesses, it is optional and current
> > > >>>> compilers won't emit unaligned accesses. So we don't clear the A bit for
> > > >>>> v6.
> > > >>>
> > > >>> not true according to the gcc changes page
> > > >>
> > > >> What are you going to believe: documentation or what the compiler
> > > >> emitted? At least for ubuntu/linaro 4.6.3 which has the unaligned access
> > > >> support backported and 4.7.2, unaligned accesses are emitted for v7
> > > >> only. I guess default here means it is the default unless you change the
> > > >> default in your build of gcc.
> > > >
> > > > Since ARMv6 can handle unaligned access in the same way as ARMv7
> > > > it seems a clear bug in gcc which might hopefully get fixed.
> > > > Thus in this case I think it is reasonable to follow the
> > > > gcc documentation, otherwise the code would break for ARMv6
> > > > when gcc gets fixed.
> > >
> > > But the compiler can't assume the state of the U bit. I think it is
> > > still legal on v6 to not support unaligned accesses, but on v7 it is
> > > required. All the standard v6 ARM cores support it, but I'm not sure
> > > about custom cores or if there are SOCs with buses that don't support
> > > unaligned accesses properly.
> >
> > Well, I read the "...since Linux version 2.6.28" comment
> > in the gcc changes page in the way that they assume the
> > U-bit is set. (Although I'm not sure it really is???)
>
> Actually, the kernel checks the arch version and the U bit on boot,
> and chooses the appropriate setting for the A bit depending on the
> result. (See arch/arm/mm/alignment.c:alignment_init().)
That is in the kernel itself, _after_ the decompressor has run. It is
not relevant to any discussion about the decompressor.
> Currently, we depend on the CPU reset behaviour or firmware/
> bootloader to set the U bit for v6, but the behaviour should be
> correct either way, though unaligned accesses will obviously
> perform (much) better with U=1.
Will someone _PLEASE_ address my initial comments against this patch
in light of the fact that it's now been proven _NOT_ to be just a V7
issue, rather than everyone seemingly buring their heads in the sand
over this.
The fact is, unaligned accesses in the decompressor are *undefined* at
present.
> For v7, we should definitely use -munaligned-access where available
> (unless it's the default?)
No such option on my compiler - according to the manual I have, the only
option there is starting -munaligned is on SPARC for -munaligned-doubles.
However, I believe GCC does believe that unaligned accesses are fine on
V6 and above.
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-11-05 11:13 ` Russell King - ARM Linux
@ 2012-11-05 13:02 ` Dave Martin
2012-11-05 13:43 ` Johannes Stezenbach
2012-11-05 15:39 ` Russell King - ARM Linux
2012-11-05 13:48 ` Rob Herring
1 sibling, 2 replies; 46+ messages in thread
From: Dave Martin @ 2012-11-05 13:02 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Nov 05, 2012 at 11:13:46AM +0000, Russell King - ARM Linux wrote:
> On Mon, Nov 05, 2012 at 10:48:50AM +0000, Dave Martin wrote:
> > On Thu, Oct 25, 2012 at 05:08:16PM +0200, Johannes Stezenbach wrote:
> > > On Thu, Oct 25, 2012 at 09:25:06AM -0500, Rob Herring wrote:
> > > > On 10/25/2012 09:16 AM, Johannes Stezenbach wrote:
> > > > > On Thu, Oct 25, 2012 at 07:41:45AM -0500, Rob Herring wrote:
> > > > >> On 10/25/2012 04:34 AM, Johannes Stezenbach wrote:
> > > > >>> On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
> > > > >>>
> > > > >>>> While v6 can support unaligned accesses, it is optional and current
> > > > >>>> compilers won't emit unaligned accesses. So we don't clear the A bit for
> > > > >>>> v6.
> > > > >>>
> > > > >>> not true according to the gcc changes page
> > > > >>
> > > > >> What are you going to believe: documentation or what the compiler
> > > > >> emitted? At least for ubuntu/linaro 4.6.3 which has the unaligned access
> > > > >> support backported and 4.7.2, unaligned accesses are emitted for v7
> > > > >> only. I guess default here means it is the default unless you change the
> > > > >> default in your build of gcc.
> > > > >
> > > > > Since ARMv6 can handle unaligned access in the same way as ARMv7
> > > > > it seems a clear bug in gcc which might hopefully get fixed.
> > > > > Thus in this case I think it is reasonable to follow the
> > > > > gcc documentation, otherwise the code would break for ARMv6
> > > > > when gcc gets fixed.
> > > >
> > > > But the compiler can't assume the state of the U bit. I think it is
> > > > still legal on v6 to not support unaligned accesses, but on v7 it is
> > > > required. All the standard v6 ARM cores support it, but I'm not sure
> > > > about custom cores or if there are SOCs with buses that don't support
> > > > unaligned accesses properly.
> > >
> > > Well, I read the "...since Linux version 2.6.28" comment
> > > in the gcc changes page in the way that they assume the
> > > U-bit is set. (Although I'm not sure it really is???)
> >
> > Actually, the kernel checks the arch version and the U bit on boot,
> > and chooses the appropriate setting for the A bit depending on the
> > result. (See arch/arm/mm/alignment.c:alignment_init().)
>
> That is in the kernel itself, _after_ the decompressor has run. It is
> not relevant to any discussion about the decompressor.
This was merely meant as an argument that the kernel does not make
assumptions about the U bit, and so the zImage decompressor probably
shouldn't either.
> > Currently, we depend on the CPU reset behaviour or firmware/
> > bootloader to set the U bit for v6, but the behaviour should be
> > correct either way, though unaligned accesses will obviously
> > perform (much) better with U=1.
>
> Will someone _PLEASE_ address my initial comments against this patch
> in light of the fact that it's now been proven _NOT_ to be just a V7
> issue, rather than everyone seemingly buring their heads in the sand
> over this.
>
> The fact is, unaligned accesses in the decompressor are *undefined* at
> present.
Why not allow unaligned accesses in the decompressor, though, both
for v6 and v7?
The decompressors run with the cache on, so we should have fault-free
unaligned access on all CPUs which support it, provided that we set the
SCTLR bits appropriately, and provided that the decompressers are
written in correct C.
The counterargument is that zImage would then just fall over if booted
on a CPU older than the baseline architecture the kernel was built for.
However, that should be detected before executing any C code, giving
the zImage code a chance to bail out with a suitable error message.
A kernel which contains support for v6/v7 platforms cannot work on older
CPUs anyway, because the kernel itself doesn't support such combinations.
> > For v7, we should definitely use -munaligned-access where available
> > (unless it's the default?)
>
> No such option on my compiler - according to the manual I have, the only
> option there is starting -munaligned is on SPARC for -munaligned-doubles.
OK, I guess that's something backported into the Linaro toolchain I'm
currently using then. But it seems a good idea to use this if available,
because it allows the compiler to generate better code in some situations,
especially for packed struct access.
> However, I believe GCC does believe that unaligned accesses are fine on
> V6 and above.
Possibly, but I've never seen it use them deliberately, prior to the
-munaligned-access support.
Cheers
---Dave
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-11-05 13:02 ` Dave Martin
@ 2012-11-05 13:43 ` Johannes Stezenbach
2012-11-05 15:39 ` Russell King - ARM Linux
1 sibling, 0 replies; 46+ messages in thread
From: Johannes Stezenbach @ 2012-11-05 13:43 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Nov 05, 2012 at 01:02:55PM +0000, Dave Martin wrote:
> On Mon, Nov 05, 2012 at 11:13:46AM +0000, Russell King - ARM Linux wrote:
> > On Mon, Nov 05, 2012 at 10:48:50AM +0000, Dave Martin wrote:
> > > For v7, we should definitely use -munaligned-access where available
> > > (unless it's the default?)
> >
> > No such option on my compiler - according to the manual I have, the only
> > option there is starting -munaligned is on SPARC for -munaligned-doubles.
>
> OK, I guess that's something backported into the Linaro toolchain I'm
> currently using then. But it seems a good idea to use this if available,
> because it allows the compiler to generate better code in some situations,
> especially for packed struct access.
>
> > However, I believe GCC does believe that unaligned accesses are fine on
> > V6 and above.
>
> Possibly, but I've never seen it use them deliberately, prior to the
> -munaligned-access support.
http://gcc.gnu.org/gcc-4.7/changes.html says -munaligned-access is
enabled by default.
I haven't had a chance to try gcc-4.7 yet, but gcc-4.6+linaro
has the option but fails to generate the expected code for ARMv6
while it works for ARMv7. However it does
#define __ARM_FEATURE_UNALIGNED 1
and
.eabi_attribute 34, 1
This seems to be the commit which introduced unaligned support in gcc:
http://repo.or.cz/w/official-gcc.git/commitdiff/eb04cafba3a6f1eddbdb5ec031d8a7074930d5b9
I cannot figure out why this works for v7 but not for v6.
(I used gcc-4.6.4 20121001 (prerelease) toolchain built with crosstool-NG.)
Johannes
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-11-05 11:13 ` Russell King - ARM Linux
2012-11-05 13:02 ` Dave Martin
@ 2012-11-05 13:48 ` Rob Herring
2012-11-05 22:02 ` Michael Hope
1 sibling, 1 reply; 46+ messages in thread
From: Rob Herring @ 2012-11-05 13:48 UTC (permalink / raw)
To: linux-arm-kernel
On 11/05/2012 05:13 AM, Russell King - ARM Linux wrote:
> On Mon, Nov 05, 2012 at 10:48:50AM +0000, Dave Martin wrote:
>> On Thu, Oct 25, 2012 at 05:08:16PM +0200, Johannes Stezenbach wrote:
>>> On Thu, Oct 25, 2012 at 09:25:06AM -0500, Rob Herring wrote:
>>>> On 10/25/2012 09:16 AM, Johannes Stezenbach wrote:
>>>>> On Thu, Oct 25, 2012 at 07:41:45AM -0500, Rob Herring wrote:
>>>>>> On 10/25/2012 04:34 AM, Johannes Stezenbach wrote:
>>>>>>> On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
>>>>>>>
>>>>>>>> While v6 can support unaligned accesses, it is optional and current
>>>>>>>> compilers won't emit unaligned accesses. So we don't clear the A bit for
>>>>>>>> v6.
>>>>>>>
>>>>>>> not true according to the gcc changes page
>>>>>>
>>>>>> What are you going to believe: documentation or what the compiler
>>>>>> emitted? At least for ubuntu/linaro 4.6.3 which has the unaligned access
>>>>>> support backported and 4.7.2, unaligned accesses are emitted for v7
>>>>>> only. I guess default here means it is the default unless you change the
>>>>>> default in your build of gcc.
>>>>>
>>>>> Since ARMv6 can handle unaligned access in the same way as ARMv7
>>>>> it seems a clear bug in gcc which might hopefully get fixed.
>>>>> Thus in this case I think it is reasonable to follow the
>>>>> gcc documentation, otherwise the code would break for ARMv6
>>>>> when gcc gets fixed.
>>>>
>>>> But the compiler can't assume the state of the U bit. I think it is
>>>> still legal on v6 to not support unaligned accesses, but on v7 it is
>>>> required. All the standard v6 ARM cores support it, but I'm not sure
>>>> about custom cores or if there are SOCs with buses that don't support
>>>> unaligned accesses properly.
>>>
>>> Well, I read the "...since Linux version 2.6.28" comment
>>> in the gcc changes page in the way that they assume the
>>> U-bit is set. (Although I'm not sure it really is???)
>>
>> Actually, the kernel checks the arch version and the U bit on boot,
>> and chooses the appropriate setting for the A bit depending on the
>> result. (See arch/arm/mm/alignment.c:alignment_init().)
>
> That is in the kernel itself, _after_ the decompressor has run. It is
> not relevant to any discussion about the decompressor.
>
>> Currently, we depend on the CPU reset behaviour or firmware/
>> bootloader to set the U bit for v6, but the behaviour should be
>> correct either way, though unaligned accesses will obviously
>> perform (much) better with U=1.
>
> Will someone _PLEASE_ address my initial comments against this patch
> in light of the fact that it's now been proven _NOT_ to be just a V7
> issue, rather than everyone seemingly buring their heads in the sand
> over this.
I tried adding -munaligned-accesses on a v6 build and still get byte
accesses rather than unaligned word accesses. So this does seem to be a
v7 only issue based on what gcc will currently produce. Copying Michael
Hope who can hopefully provide some insight on why v6 unaligned accesses
are not enabled.
> The fact is, unaligned accesses in the decompressor are *undefined* at
> present.
>
>> For v7, we should definitely use -munaligned-access where available
>> (unless it's the default?)
>
> No such option on my compiler - according to the manual I have, the only
> option there is starting -munaligned is on SPARC for -munaligned-doubles.
It's only added in 4.7 and backported to Linaro 4.6.3.
Rob
> However, I believe GCC does believe that unaligned accesses are fine on
> V6 and above.
>
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-11-05 13:02 ` Dave Martin
2012-11-05 13:43 ` Johannes Stezenbach
@ 2012-11-05 15:39 ` Russell King - ARM Linux
2012-11-05 16:13 ` Nicolas Pitre
1 sibling, 1 reply; 46+ messages in thread
From: Russell King - ARM Linux @ 2012-11-05 15:39 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Nov 05, 2012 at 01:02:55PM +0000, Dave Martin wrote:
> Why not allow unaligned accesses in the decompressor, though, both
> for v6 and v7?
EXACTLY.
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-11-05 15:39 ` Russell King - ARM Linux
@ 2012-11-05 16:13 ` Nicolas Pitre
2012-11-05 17:26 ` Dave Martin
0 siblings, 1 reply; 46+ messages in thread
From: Nicolas Pitre @ 2012-11-05 16:13 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, 5 Nov 2012, Russell King - ARM Linux wrote:
> On Mon, Nov 05, 2012 at 01:02:55PM +0000, Dave Martin wrote:
> > Why not allow unaligned accesses in the decompressor, though, both
> > for v6 and v7?
>
> EXACTLY.
I have no objections to that. In fact, I made a remark to this effect
in my initial review of this patch. Whether or not gcc does take
advantage of this hardware ability in the end is orthogonal.
Nicolas
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-11-05 16:13 ` Nicolas Pitre
@ 2012-11-05 17:26 ` Dave Martin
2012-11-05 17:44 ` Rob Herring
2012-11-05 17:46 ` Nicolas Pitre
0 siblings, 2 replies; 46+ messages in thread
From: Dave Martin @ 2012-11-05 17:26 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Nov 05, 2012 at 11:13:51AM -0500, Nicolas Pitre wrote:
> On Mon, 5 Nov 2012, Russell King - ARM Linux wrote:
>
> > On Mon, Nov 05, 2012 at 01:02:55PM +0000, Dave Martin wrote:
> > > Why not allow unaligned accesses in the decompressor, though, both
> > > for v6 and v7?
> >
> > EXACTLY.
>
> I have no objections to that. In fact, I made a remark to this effect
> in my initial review of this patch. Whether or not gcc does take
> advantage of this hardware ability in the end is orthogonal.
For the sake of argument, here's how it might look.
Currently, I make no attempt to restore the original state of the U bit.
The A bit if forced later by the kernel during boot, after a short window
during which we should only run low-level arch code and therefore where
no unaligned accesses should happen.
Does anyone think these issues are likely to be important?
Cheers
---Dave
>From 160a5576b53264951ff8164775146b2d4feddecb Mon Sep 17 00:00:00 2001
From: Dave Martin <dave.martin@linaro.org>
Date: Mon, 5 Nov 2012 16:34:57 +0000
Subject: [PATCH] ARM: decompressor: Enable unaligned memory access for v6 and above
Modern GCC can generate code which makes use of the CPU's native
unaligned memory access capabilities. This is useful for the C
decompressor implementations used for unpacking compressed kernels.
This patch disables the alignment faults and enabled the v6
unaligned access on CPUs which support these features (i.e., v6 and
later), allowing full unaligned access support for C code in the
decompressor.
The decompressor C code must not be built to assume that unaligned
access works if support for v5 or older platforms is included in
the kernel.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
---
Note: I have only build-tested this so far.
arch/arm/boot/compressed/head.S | 12 +++++++++++-
1 files changed, 11 insertions(+), 1 deletions(-)
diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S
index 90275f0..cfbb4c9 100644
--- a/arch/arm/boot/compressed/head.S
+++ b/arch/arm/boot/compressed/head.S
@@ -652,6 +652,15 @@ __setup_mmu: sub r3, r4, #16384 @ Page directory size
mov pc, lr
ENDPROC(__setup_mmu)
+@ Enable unaligned access on v6, to allow better code generation
+@ for the decompressor C code:
+__armv6_mmu_cache_on:
+ mrc p15, 0, r0, c1, c0, 0 @ read SCTLR
+ bic r0, r0, #2 @ A (no unaligned access fault)
+ orr r0, r0, #1 << 22 @ U (v6 unaligned access model)
+ mcr p15, 0, r0, c1, c0, 0 @ write SCTLR
+ b __armv4_mmu_cache_on
+
__arm926ejs_mmu_cache_on:
#ifdef CONFIG_CPU_DCACHE_WRITETHROUGH
mov r0, #4 @ put dcache in WT mode
@@ -694,6 +703,7 @@ __armv7_mmu_cache_on:
bic r0, r0, #1 << 28 @ clear SCTLR.TRE
orr r0, r0, #0x5000 @ I-cache enable, RR cache replacement
orr r0, r0, #0x003c @ write buffer
+ bic r0, r0, #2 @ A (no unaligned access fault)
#ifdef CONFIG_MMU
#ifdef CONFIG_CPU_ENDIAN_BE8
orr r0, r0, #1 << 25 @ big-endian page tables
@@ -914,7 +924,7 @@ proc_types:
.word 0x0007b000 @ ARMv6
.word 0x000ff000
- W(b) __armv4_mmu_cache_on
+ W(b) __armv6_mmu_cache_on
W(b) __armv4_mmu_cache_off
W(b) __armv6_mmu_cache_flush
--
1.7.4.1
^ permalink raw reply related [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-11-05 17:26 ` Dave Martin
@ 2012-11-05 17:44 ` Rob Herring
2012-11-05 20:02 ` Nicolas Pitre
2012-11-05 17:46 ` Nicolas Pitre
1 sibling, 1 reply; 46+ messages in thread
From: Rob Herring @ 2012-11-05 17:44 UTC (permalink / raw)
To: linux-arm-kernel
On 11/05/2012 11:26 AM, Dave Martin wrote:
> On Mon, Nov 05, 2012 at 11:13:51AM -0500, Nicolas Pitre wrote:
>> On Mon, 5 Nov 2012, Russell King - ARM Linux wrote:
>>
>>> On Mon, Nov 05, 2012 at 01:02:55PM +0000, Dave Martin wrote:
>>>> Why not allow unaligned accesses in the decompressor, though, both
>>>> for v6 and v7?
>>>
>>> EXACTLY.
>>
>> I have no objections to that. In fact, I made a remark to this effect
>> in my initial review of this patch. Whether or not gcc does take
>> advantage of this hardware ability in the end is orthogonal.
>
> For the sake of argument, here's how it might look.
>
> Currently, I make no attempt to restore the original state of the U bit.
> The A bit if forced later by the kernel during boot, after a short window
> during which we should only run low-level arch code and therefore where
> no unaligned accesses should happen.
>
> Does anyone think these issues are likely to be important?
>
And here is my updated version that does v6 somewhat differently:
8<------------------------------------------------------------------
>From 76c2b7685397f13aa53f426822128430fc24b8a0 Mon Sep 17 00:00:00 2001
From: Rob Herring <rob.herring@calxeda.com>
Date: Mon, 5 Nov 2012 11:39:48 -0600
Subject: [PATCH v2] ARM: decompressor: clear SCTLR.A bit for v6 and v7 cores
With recent compilers and move to generic unaligned.h in commit d25c881
(ARM: 7493/1: use generic unaligned.h), unaligned accesses will be used
by the LZO decompressor on v7 cores. So we need to make sure unaligned
accesses are allowed by clearing the SCTLR A bit.
While v6 can support unaligned accesses, it is optional and current
compilers won't emit unaligned accesses. In case this changes and to align
with the kernel behavior, we clear the A bit and set the U bit.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Shawn Guo <shawn.guo@linaro.org>
---
arch/arm/boot/compressed/head.S | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S
index bc67cbf..f14d7ec 100644
--- a/arch/arm/boot/compressed/head.S
+++ b/arch/arm/boot/compressed/head.S
@@ -629,6 +629,11 @@ __armv4_mmu_cache_on:
mcr p15, 0, r0, c7, c10, 4 @ drain write buffer
mcr p15, 0, r0, c8, c7, 0 @ flush I,D TLBs
mrc p15, 0, r0, c1, c0, 0 @ read control reg
+ mrc p15, 0, r11, c0, c0 @ get processor ID
+ and r11, r11, #0xf0000
+ tst r11, #0x70000 @ ARMv6
+ orreq r0, r0, #1 << 22 @ set SCTLR.U
+ biceq r0, r0, #1 << 1 @ clear SCTLR.A
orr r0, r0, #0x5000 @ I-cache enable, RR cache replacement
orr r0, r0, #0x0030
#ifdef CONFIG_CPU_ENDIAN_BE8
@@ -654,6 +659,7 @@ __armv7_mmu_cache_on:
#endif
mrc p15, 0, r0, c1, c0, 0 @ read control reg
bic r0, r0, #1 << 28 @ clear SCTLR.TRE
+ bic r0, r0, #1 << 1 @ clear SCTLR.A
orr r0, r0, #0x5000 @ I-cache enable, RR cache replacement
orr r0, r0, #0x003c @ write buffer
#ifdef CONFIG_MMU
--
1.7.10.4
^ permalink raw reply related [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-11-05 17:26 ` Dave Martin
2012-11-05 17:44 ` Rob Herring
@ 2012-11-05 17:46 ` Nicolas Pitre
1 sibling, 0 replies; 46+ messages in thread
From: Nicolas Pitre @ 2012-11-05 17:46 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, 5 Nov 2012, Dave Martin wrote:
> On Mon, Nov 05, 2012 at 11:13:51AM -0500, Nicolas Pitre wrote:
> > On Mon, 5 Nov 2012, Russell King - ARM Linux wrote:
> >
> > > On Mon, Nov 05, 2012 at 01:02:55PM +0000, Dave Martin wrote:
> > > > Why not allow unaligned accesses in the decompressor, though, both
> > > > for v6 and v7?
> > >
> > > EXACTLY.
> >
> > I have no objections to that. In fact, I made a remark to this effect
> > in my initial review of this patch. Whether or not gcc does take
> > advantage of this hardware ability in the end is orthogonal.
>
> For the sake of argument, here's how it might look.
>
> Currently, I make no attempt to restore the original state of the U bit.
> The A bit if forced later by the kernel during boot, after a short window
> during which we should only run low-level arch code and therefore where
> no unaligned accesses should happen.
>
> Does anyone think these issues are likely to be important?
>
> Cheers
> ---Dave
>
>
> >From 160a5576b53264951ff8164775146b2d4feddecb Mon Sep 17 00:00:00 2001
> From: Dave Martin <dave.martin@linaro.org>
> Date: Mon, 5 Nov 2012 16:34:57 +0000
> Subject: [PATCH] ARM: decompressor: Enable unaligned memory access for v6 and above
>
> Modern GCC can generate code which makes use of the CPU's native
> unaligned memory access capabilities. This is useful for the C
> decompressor implementations used for unpacking compressed kernels.
>
> This patch disables the alignment faults and enabled the v6
> unaligned access on CPUs which support these features (i.e., v6 and
> later), allowing full unaligned access support for C code in the
> decompressor.
>
> The decompressor C code must not be built to assume that unaligned
> access works if support for v5 or older platforms is included in
> the kernel.
I'd suggest here that the code must always use the get_unaligned and
put_unaligned accessors when dealing with unaligned pointers, regardless
of this patch. The compiler will optimize the access performed via
those accessors when possible.
> Signed-off-by: Dave Martin <dave.martin@linaro.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
> ---
> Note: I have only build-tested this so far.
>
> arch/arm/boot/compressed/head.S | 12 +++++++++++-
> 1 files changed, 11 insertions(+), 1 deletions(-)
>
> diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S
> index 90275f0..cfbb4c9 100644
> --- a/arch/arm/boot/compressed/head.S
> +++ b/arch/arm/boot/compressed/head.S
> @@ -652,6 +652,15 @@ __setup_mmu: sub r3, r4, #16384 @ Page directory size
> mov pc, lr
> ENDPROC(__setup_mmu)
>
> +@ Enable unaligned access on v6, to allow better code generation
> +@ for the decompressor C code:
> +__armv6_mmu_cache_on:
> + mrc p15, 0, r0, c1, c0, 0 @ read SCTLR
> + bic r0, r0, #2 @ A (no unaligned access fault)
> + orr r0, r0, #1 << 22 @ U (v6 unaligned access model)
> + mcr p15, 0, r0, c1, c0, 0 @ write SCTLR
> + b __armv4_mmu_cache_on
> +
> __arm926ejs_mmu_cache_on:
> #ifdef CONFIG_CPU_DCACHE_WRITETHROUGH
> mov r0, #4 @ put dcache in WT mode
> @@ -694,6 +703,7 @@ __armv7_mmu_cache_on:
> bic r0, r0, #1 << 28 @ clear SCTLR.TRE
> orr r0, r0, #0x5000 @ I-cache enable, RR cache replacement
> orr r0, r0, #0x003c @ write buffer
> + bic r0, r0, #2 @ A (no unaligned access fault)
> #ifdef CONFIG_MMU
> #ifdef CONFIG_CPU_ENDIAN_BE8
> orr r0, r0, #1 << 25 @ big-endian page tables
> @@ -914,7 +924,7 @@ proc_types:
>
> .word 0x0007b000 @ ARMv6
> .word 0x000ff000
> - W(b) __armv4_mmu_cache_on
> + W(b) __armv6_mmu_cache_on
> W(b) __armv4_mmu_cache_off
> W(b) __armv6_mmu_cache_flush
>
> --
> 1.7.4.1
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-11-05 17:44 ` Rob Herring
@ 2012-11-05 20:02 ` Nicolas Pitre
0 siblings, 0 replies; 46+ messages in thread
From: Nicolas Pitre @ 2012-11-05 20:02 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, 5 Nov 2012, Rob Herring wrote:
> On 11/05/2012 11:26 AM, Dave Martin wrote:
> > On Mon, Nov 05, 2012 at 11:13:51AM -0500, Nicolas Pitre wrote:
> >> On Mon, 5 Nov 2012, Russell King - ARM Linux wrote:
> >>
> >>> On Mon, Nov 05, 2012 at 01:02:55PM +0000, Dave Martin wrote:
> >>>> Why not allow unaligned accesses in the decompressor, though, both
> >>>> for v6 and v7?
> >>>
> >>> EXACTLY.
> >>
> >> I have no objections to that. In fact, I made a remark to this effect
> >> in my initial review of this patch. Whether or not gcc does take
> >> advantage of this hardware ability in the end is orthogonal.
> >
> > For the sake of argument, here's how it might look.
> >
> > Currently, I make no attempt to restore the original state of the U bit.
> > The A bit if forced later by the kernel during boot, after a short window
> > during which we should only run low-level arch code and therefore where
> > no unaligned accesses should happen.
> >
> > Does anyone think these issues are likely to be important?
> >
>
> And here is my updated version that does v6 somewhat differently:
If I had to choose, I'd prefer Dave's version as being a bit cleaner.
>
> 8<------------------------------------------------------------------
> >From 76c2b7685397f13aa53f426822128430fc24b8a0 Mon Sep 17 00:00:00 2001
> From: Rob Herring <rob.herring@calxeda.com>
> Date: Mon, 5 Nov 2012 11:39:48 -0600
> Subject: [PATCH v2] ARM: decompressor: clear SCTLR.A bit for v6 and v7 cores
>
> With recent compilers and move to generic unaligned.h in commit d25c881
> (ARM: 7493/1: use generic unaligned.h), unaligned accesses will be used
> by the LZO decompressor on v7 cores. So we need to make sure unaligned
> accesses are allowed by clearing the SCTLR A bit.
>
> While v6 can support unaligned accesses, it is optional and current
> compilers won't emit unaligned accesses. In case this changes and to align
> with the kernel behavior, we clear the A bit and set the U bit.
>
> Signed-off-by: Rob Herring <rob.herring@calxeda.com>
> Acked-by: Nicolas Pitre <nico@linaro.org>
> Tested-by: Shawn Guo <shawn.guo@linaro.org>
> ---
> arch/arm/boot/compressed/head.S | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S
> index bc67cbf..f14d7ec 100644
> --- a/arch/arm/boot/compressed/head.S
> +++ b/arch/arm/boot/compressed/head.S
> @@ -629,6 +629,11 @@ __armv4_mmu_cache_on:
> mcr p15, 0, r0, c7, c10, 4 @ drain write buffer
> mcr p15, 0, r0, c8, c7, 0 @ flush I,D TLBs
> mrc p15, 0, r0, c1, c0, 0 @ read control reg
> + mrc p15, 0, r11, c0, c0 @ get processor ID
> + and r11, r11, #0xf0000
> + tst r11, #0x70000 @ ARMv6
> + orreq r0, r0, #1 << 22 @ set SCTLR.U
> + biceq r0, r0, #1 << 1 @ clear SCTLR.A
> orr r0, r0, #0x5000 @ I-cache enable, RR cache replacement
> orr r0, r0, #0x0030
> #ifdef CONFIG_CPU_ENDIAN_BE8
> @@ -654,6 +659,7 @@ __armv7_mmu_cache_on:
> #endif
> mrc p15, 0, r0, c1, c0, 0 @ read control reg
> bic r0, r0, #1 << 28 @ clear SCTLR.TRE
> + bic r0, r0, #1 << 1 @ clear SCTLR.A
> orr r0, r0, #0x5000 @ I-cache enable, RR cache replacement
> orr r0, r0, #0x003c @ write buffer
> #ifdef CONFIG_MMU
> --
> 1.7.10.4
>
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-11-05 13:48 ` Rob Herring
@ 2012-11-05 22:02 ` Michael Hope
2013-02-20 14:56 ` Johannes Stezenbach
0 siblings, 1 reply; 46+ messages in thread
From: Michael Hope @ 2012-11-05 22:02 UTC (permalink / raw)
To: linux-arm-kernel
On 6 November 2012 02:48, Rob Herring <robherring2@gmail.com> wrote:
>
> On 11/05/2012 05:13 AM, Russell King - ARM Linux wrote:
> > On Mon, Nov 05, 2012 at 10:48:50AM +0000, Dave Martin wrote:
> >> On Thu, Oct 25, 2012 at 05:08:16PM +0200, Johannes Stezenbach wrote:
> >>> On Thu, Oct 25, 2012 at 09:25:06AM -0500, Rob Herring wrote:
> >>>> On 10/25/2012 09:16 AM, Johannes Stezenbach wrote:
> >>>>> On Thu, Oct 25, 2012 at 07:41:45AM -0500, Rob Herring wrote:
> >>>>>> On 10/25/2012 04:34 AM, Johannes Stezenbach wrote:
> >>>>>>> On Thu, Oct 11, 2012 at 07:43:22AM -0500, Rob Herring wrote:
> >>>>>>>
> >>>>>>>> While v6 can support unaligned accesses, it is optional and current
> >>>>>>>> compilers won't emit unaligned accesses. So we don't clear the A bit for
> >>>>>>>> v6.
> >>>>>>>
> >>>>>>> not true according to the gcc changes page
> >>>>>>
> >>>>>> What are you going to believe: documentation or what the compiler
> >>>>>> emitted? At least for ubuntu/linaro 4.6.3 which has the unaligned access
> >>>>>> support backported and 4.7.2, unaligned accesses are emitted for v7
> >>>>>> only. I guess default here means it is the default unless you change the
> >>>>>> default in your build of gcc.
> >>>>>
> >>>>> Since ARMv6 can handle unaligned access in the same way as ARMv7
> >>>>> it seems a clear bug in gcc which might hopefully get fixed.
> >>>>> Thus in this case I think it is reasonable to follow the
> >>>>> gcc documentation, otherwise the code would break for ARMv6
> >>>>> when gcc gets fixed.
> >>>>
> >>>> But the compiler can't assume the state of the U bit. I think it is
> >>>> still legal on v6 to not support unaligned accesses, but on v7 it is
> >>>> required. All the standard v6 ARM cores support it, but I'm not sure
> >>>> about custom cores or if there are SOCs with buses that don't support
> >>>> unaligned accesses properly.
> >>>
> >>> Well, I read the "...since Linux version 2.6.28" comment
> >>> in the gcc changes page in the way that they assume the
> >>> U-bit is set. (Although I'm not sure it really is???)
> >>
> >> Actually, the kernel checks the arch version and the U bit on boot,
> >> and chooses the appropriate setting for the A bit depending on the
> >> result. (See arch/arm/mm/alignment.c:alignment_init().)
> >
> > That is in the kernel itself, _after_ the decompressor has run. It is
> > not relevant to any discussion about the decompressor.
> >
> >> Currently, we depend on the CPU reset behaviour or firmware/
> >> bootloader to set the U bit for v6, but the behaviour should be
> >> correct either way, though unaligned accesses will obviously
> >> perform (much) better with U=1.
> >
> > Will someone _PLEASE_ address my initial comments against this patch
> > in light of the fact that it's now been proven _NOT_ to be just a V7
> > issue, rather than everyone seemingly buring their heads in the sand
> > over this.
>
> I tried adding -munaligned-accesses on a v6 build and still get byte
> accesses rather than unaligned word accesses. So this does seem to be a
> v7 only issue based on what gcc will currently produce. Copying Michael
> Hope who can hopefully provide some insight on why v6 unaligned accesses
> are not enabled.
This looks like a bug. Unaligned access is enabled for armv6 but
seems to only take effect for cores with Thumb-2. Here's a test case
both with unaligned field access and unaligned block copy:
struct foo
{
char a;
int b;
struct
{
int x[3];
} c;
} __attribute__((packed));
int get_field(struct foo *p)
{
return p->b;
}
int copy_block(struct foo *p, struct foo *q)
{
p->c = q->c;
}
With -march=armv7-a you get the correct:
bar:
ldr r0, [r0, #1] @ unaligned @ 11 unaligned_loadsi/2 [length = 4]
bx lr @ 21 *arm_return [length = 12]
baz:
str r4, [sp, #-4]! @ 25 *push_multi [length = 4]
mov r2, r0 @ 2 *arm_movsi_vfp/1 [length = 4]
ldr r4, [r1, #5]! @ unaligned @ 9 unaligned_loadsi/2 [length = 4]
ldr ip, [r1, #4] @ unaligned @ 10 unaligned_loadsi/2 [length = 4]
ldr r1, [r1, #8] @ unaligned @ 11 unaligned_loadsi/2 [length = 4]
str r4, [r2, #5] @ unaligned @ 12 unaligned_storesi/2 [length = 4]
str ip, [r2, #9] @ unaligned @ 13 unaligned_storesi/2 [length = 4]
str r1, [r2, #13] @ unaligned @ 14 unaligned_storesi/2 [length = 4]
ldmfd sp!, {r4}
bx lr
With -march=armv6 you get a byte-by-byte field access and a correct
unaligned block copy:
bar:
ldrb r1, [r0, #2] @ zero_extendqisi2
ldrb r3, [r0, #1] @ zero_extendqisi2
ldrb r2, [r0, #3] @ zero_extendqisi2
ldrb r0, [r0, #4] @ zero_extendqisi2
orr r3, r3, r1, asl #8
orr r3, r3, r2, asl #16
orr r0, r3, r0, asl #24
bx lr
baz:
str r4, [sp, #-4]!
mov r2, r0
ldr r4, [r1, #5]! @ unaligned
ldr ip, [r1, #4] @ unaligned
ldr r1, [r1, #8] @ unaligned
str r4, [r2, #5] @ unaligned
str ip, [r2, #9] @ unaligned
str r1, [r2, #13] @ unaligned
ldmfd sp!, {r4}
bx lr
readelf -A shows that the compiler planned to use unaligned access in
both. My suspicion is that GCC is using the extv pattern to extract
the field from memory, and that pattern is only enabled for Thumb-2
capable cores.
I've logged PR55218. We'll discuss it at our next meeting.
-- Michael
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2012-11-05 22:02 ` Michael Hope
@ 2013-02-20 14:56 ` Johannes Stezenbach
2013-02-20 15:07 ` Johannes Stezenbach
0 siblings, 1 reply; 46+ messages in thread
From: Johannes Stezenbach @ 2013-02-20 14:56 UTC (permalink / raw)
To: linux-arm-kernel
Replying to old thread, for full context see
http://thread.gmane.org/gmane.linux.ports.arm.kernel/180226/focus=197914
On Tue, Nov 06, 2012 at 11:02:27AM +1300, Michael Hope wrote:
> On 6 November 2012 02:48, Rob Herring <robherring2@gmail.com> wrote:
> >
> > I tried adding -munaligned-accesses on a v6 build and still get byte
> > accesses rather than unaligned word accesses. So this does seem to be a
> > v7 only issue based on what gcc will currently produce. Copying Michael
> > Hope who can hopefully provide some insight on why v6 unaligned accesses
> > are not enabled.
>
> This looks like a bug. Unaligned access is enabled for armv6 but
> seems to only take effect for cores with Thumb-2. Here's a test case
> both with unaligned field access and unaligned block copy:
>
> struct foo
> {
> char a;
> int b;
> struct
> {
> int x[3];
> } c;
> } __attribute__((packed));
>
> int get_field(struct foo *p)
> {
> return p->b;
> }
>
> int copy_block(struct foo *p, struct foo *q)
> {
> p->c = q->c;
> }
>
> With -march=armv7-a you get the correct:
>
> bar:
> ldr r0, [r0, #1] @ unaligned @ 11 unaligned_loadsi/2 [length = 4]
> bx lr @ 21 *arm_return [length = 12]
>
> baz:
> str r4, [sp, #-4]! @ 25 *push_multi [length = 4]
> mov r2, r0 @ 2 *arm_movsi_vfp/1 [length = 4]
> ldr r4, [r1, #5]! @ unaligned @ 9 unaligned_loadsi/2 [length = 4]
> ldr ip, [r1, #4] @ unaligned @ 10 unaligned_loadsi/2 [length = 4]
> ldr r1, [r1, #8] @ unaligned @ 11 unaligned_loadsi/2 [length = 4]
> str r4, [r2, #5] @ unaligned @ 12 unaligned_storesi/2 [length = 4]
> str ip, [r2, #9] @ unaligned @ 13 unaligned_storesi/2 [length = 4]
> str r1, [r2, #13] @ unaligned @ 14 unaligned_storesi/2 [length = 4]
> ldmfd sp!, {r4}
> bx lr
>
> With -march=armv6 you get a byte-by-byte field access and a correct
> unaligned block copy:
>
> bar:
> ldrb r1, [r0, #2] @ zero_extendqisi2
> ldrb r3, [r0, #1] @ zero_extendqisi2
> ldrb r2, [r0, #3] @ zero_extendqisi2
> ldrb r0, [r0, #4] @ zero_extendqisi2
> orr r3, r3, r1, asl #8
> orr r3, r3, r2, asl #16
> orr r0, r3, r0, asl #24
> bx lr
>
> baz:
> str r4, [sp, #-4]!
> mov r2, r0
> ldr r4, [r1, #5]! @ unaligned
> ldr ip, [r1, #4] @ unaligned
> ldr r1, [r1, #8] @ unaligned
> str r4, [r2, #5] @ unaligned
> str ip, [r2, #9] @ unaligned
> str r1, [r2, #13] @ unaligned
> ldmfd sp!, {r4}
> bx lr
>
> readelf -A shows that the compiler planned to use unaligned access in
> both. My suspicion is that GCC is using the extv pattern to extract
> the field from memory, and that pattern is only enabled for Thumb-2
> capable cores.
>
> I've logged PR55218. We'll discuss it at our next meeting.
Just tried with gcc-linaro-4.7-2013.01 (gcc-4.7.3 20130102 (prerelease)),
the issue is still unfixed. Do you have any idea how to fix it?
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55218
Thanks,
Johannes
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores
2013-02-20 14:56 ` Johannes Stezenbach
@ 2013-02-20 15:07 ` Johannes Stezenbach
0 siblings, 0 replies; 46+ messages in thread
From: Johannes Stezenbach @ 2013-02-20 15:07 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Feb 20, 2013 at 03:56:54PM +0100, Johannes Stezenbach wrote:
> Replying to old thread, for full context see
> http://thread.gmane.org/gmane.linux.ports.arm.kernel/180226/focus=197914
>
> On Tue, Nov 06, 2012 at 11:02:27AM +1300, Michael Hope wrote:
Oh, mail to michael.hope at linaro.org bounces (account disabled).
That might explain why the bug is not fixed :-(
Johannes
^ permalink raw reply [flat|nested] 46+ messages in thread
end of thread, other threads:[~2013-02-20 15:07 UTC | newest]
Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-08-05 1:23 [PATCH v2 0/5] Use more asm-generic headers Rob Herring
2012-08-05 1:23 ` [PATCH v2 1/5] ARM: use generic version of identical asm headers Rob Herring
2012-08-05 1:23 ` [PATCH v2 2/5] ARM: add strstr declaration for decompressors Rob Herring
2012-08-05 1:23 ` [PATCH v2 3/5] ARM: use generic unaligned.h Rob Herring
2012-10-08 16:43 ` Shawn Guo
2012-10-08 20:34 ` Rob Herring
2012-10-08 23:28 ` Shawn Guo
2012-10-09 2:27 ` Rob Herring
2012-10-09 2:45 ` Shawn Guo
2012-10-09 4:01 ` Nicolas Pitre
2012-10-10 13:29 ` Rob Herring
2012-10-10 13:57 ` Nicolas Pitre
2012-10-11 12:43 ` [PATCH] ARM: decompressor: clear SCTLR.A bit for v7 cores Rob Herring
2012-10-11 13:09 ` Russell King - ARM Linux
2012-10-11 13:31 ` Rob Herring
2012-10-11 13:41 ` Russell King - ARM Linux
2012-10-11 15:44 ` Nicolas Pitre
2012-10-23 20:32 ` Gregory CLEMENT
2012-10-25 12:07 ` Russell King - ARM Linux
2012-10-11 13:59 ` Russell King - ARM Linux
2012-10-11 15:58 ` Nicolas Pitre
2012-10-25 9:34 ` Johannes Stezenbach
2012-10-25 12:41 ` Rob Herring
2012-10-25 13:30 ` Arnd Bergmann
2012-10-25 14:16 ` Johannes Stezenbach
2012-10-25 14:25 ` Rob Herring
2012-10-25 15:02 ` Nicolas Pitre
2012-10-25 15:08 ` Johannes Stezenbach
2012-11-05 10:48 ` Dave Martin
2012-11-05 11:13 ` Russell King - ARM Linux
2012-11-05 13:02 ` Dave Martin
2012-11-05 13:43 ` Johannes Stezenbach
2012-11-05 15:39 ` Russell King - ARM Linux
2012-11-05 16:13 ` Nicolas Pitre
2012-11-05 17:26 ` Dave Martin
2012-11-05 17:44 ` Rob Herring
2012-11-05 20:02 ` Nicolas Pitre
2012-11-05 17:46 ` Nicolas Pitre
2012-11-05 13:48 ` Rob Herring
2012-11-05 22:02 ` Michael Hope
2013-02-20 14:56 ` Johannes Stezenbach
2013-02-20 15:07 ` Johannes Stezenbach
2012-08-05 1:23 ` [PATCH v2 4/5] ARM: use generic termios.h Rob Herring
2012-08-05 1:24 ` [PATCH v2 5/5] ARM: convert asm/irqflags.h to use asm-generic/irqflags.h Rob Herring
2012-08-05 9:34 ` [PATCH v2 0/5] Use more asm-generic headers Thomas Petazzoni
2012-08-06 14:35 ` Arnd Bergmann
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.