Linux-arch Archive on lore.kernel.org
* [RFC][CFT][PATCHSET] saner calling conventions for csum-and-copy primitives
@ 2020-07-21 20:24 Al Viro
  2020-07-21 20:25 ` [PATCH 01/18] skb_copy_and_csum_bits(): don't bother with the last argument Al Viro
  2020-07-24  1:25 ` [RFC][CFT][PATCHSET v2] saner calling conventions for csum-and-copy primitives Al Viro
  0 siblings, 2 replies; 102+ messages in thread
From: Al Viro @ 2020-07-21 20:24 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-arch, linux-kernel

	We have 3 per-architecture primitives that copy data and return
the checksum.  One is csum_partial_copy_nocheck() (kernel-to-kernel),
the other two are csum_and_copy_from_user() and csum_and_copy_to_user().

	There are default implementations, but for quite a few architectures
these are done in assembler of varying unpleasantness.  The calling conventions
are due to a large pile of historical accidents; right now they are

__wsum csum_partial_copy_nocheck(src, dst, len, initial_sum):
	copy len bytes of data from src to dst, return something that is
comparable mod 65535 with checksum of that data added to initial_sum.
As always, __wsum values are defined only modulo 65535 - different kernel
configs can yield different 32bit values on the same data, identical blocks
of data at different addresses can yield different 32bit values (on the same
kernel), etc.  Relatively few call sites.

__wsum csum_and_copy_from_user(src, dst, len, initial_sum, errp)
	copy len bytes of data from src (in userspace) to dst.  In case
we can't copy the entire thing, set *errp to -EFAULT.  Otherwise *errp
is left unmodified and we return something that is comparable mod 65535
with the checksum of that data added to initial_sum.  Only two call sites
(both in lib/iov_iter.c).  In case of an error, the copied data is
discarded, along with the return value.

__wsum csum_and_copy_to_user(src, dst, len, initial_sum, errp)
	copy len bytes of data from src to dst (in userspace).  In case
we can't copy the entire thing, set *errp to -EFAULT.  Otherwise *errp
is left unmodified and we return something that is comparable mod 65535
with the checksum of that data added to initial_sum.  Only one call site
(in lib/iov_iter.c).  In case of an error, the copied data is
discarded, along with the return value.
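
To make the "comparable mod 65535" wording in the three descriptions above
concrete, here is a minimal userspace sketch.  The names wsum_fold16() and
wsum_comparable() are made up for illustration and uint32_t stands in for
__wsum; the folding follows the generic C csum_fold(), not any particular
architecture's asm.

```c
#include <assert.h>
#include <stdint.h>

/* Fold a 32-bit sum to 16 bits and complement it, as the generic csum_fold() does. */
uint16_t wsum_fold16(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Two 32-bit values describe the same checksum iff they agree modulo 65535. */
int wsum_comparable(uint32_t a, uint32_t b)
{
	return a % 0xffff == b % 0xffff;
}
```

So 0x12345678 and 0x000068ac are different 32bit values for the same
checksum, which is why callers should only ever compare them after folding.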

The guts of these primitives are at the very least similar to each other;
on architectures with common address space for kernel and userland all
three are often implemented via a single asm helper.  Unfortunately, the
exception handlers are overcomplicated; if nothing else, they need to
pass a pointer to store -EFAULT into and some instances are trying to
zero the rest of destination (or the entire destination) when we fail to
fetch some data.

Note, BTW, that "userspace" in the above is real userspace - we never have
csum_and_copy_..._user() called under KERNEL_DS.

It's far too convoluted and that code has accumulated a lot of cruft
over the years - for example, I'm fairly certain that nobody has read amd64
instances through since about 2005.

It's not that hard to untangle, though.  First of all, the initial_sum part is
pointless - there is only one caller that ever passes something other than
zero and that one is easy to massage so that it, too, would pass zero (it
sums and copies the fragments attached to skb, then does the same to the
main part; doing the main part first gets rid of the problem).
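
Why the reordering is harmless can be checked in a small userspace model.
This is only a sketch: model_csum_add() and model_csum_fold() follow the
generic C csum_add()/csum_fold(), model_csum_partial() is a simplified
big-endian stand-in for the real primitive, and the two *_order() helpers
are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit add with end-around carry, as in the generic csum_add(). */
uint32_t model_csum_add(uint32_t csum, uint32_t addend)
{
	uint32_t res = csum + addend;

	return res + (res < addend);
}

/* Fold to 16 bits and complement, as in the generic csum_fold(). */
uint16_t model_csum_fold(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Simplified stand-in: add the 16-bit big-endian words of buf to sum. */
uint32_t model_csum_partial(const uint8_t *buf, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum = model_csum_add(sum, (uint32_t)buf[i] << 8 | buf[i + 1]);
	if (len & 1)
		sum = model_csum_add(sum, (uint32_t)buf[len - 1] << 8);
	return sum;
}

/* Old order: sum the fragments, then pass that in as the initial sum. */
uint16_t frags_first_order(const uint8_t *head, size_t len,
			   const uint32_t *frag_sums, size_t n)
{
	uint32_t csum = 0;
	size_t i;

	for (i = 0; i < n; i++)
		csum = model_csum_add(csum, frag_sums[i]);
	return model_csum_fold(model_csum_partial(head, len, csum));
}

/* New order: main part with a zero initial sum, then add the fragments. */
uint16_t head_first_order(const uint8_t *head, size_t len,
			  const uint32_t *frag_sums, size_t n)
{
	uint32_t csum = model_csum_partial(head, len, 0);
	size_t i;

	for (i = 0; i < n; i++)
		csum = model_csum_add(csum, frag_sums[i]);
	return model_csum_fold(csum);
}
```

Both orders fold to the same 16-bit value because addition with end-around
carry is commutative and associative modulo 65535.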

Furthermore, using 0xffffffff instead of 0 will yield a non-zero value
comparable mod 0xffff with the original one, due to the way csum_add() works;
the same goes for its open-coded asm equivalents, as well as various "folding"
primitives.
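
The claim about 0xffffffff can be checked in userspace as well.  This sketch
assumes the generic C csum_add() behaviour (end-around carry);
model_csum_add(), model_csum_partial() and seed_is_safe() are illustrative
names, and the big-endian word summing is a simplification of what the real
primitives do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit add with end-around carry, as in the generic csum_add(). */
uint32_t model_csum_add(uint32_t csum, uint32_t addend)
{
	uint32_t res = csum + addend;

	return res + (res < addend);
}

/* Simplified stand-in: add the 16-bit big-endian words of buf to sum. */
uint32_t model_csum_partial(const uint8_t *buf, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum = model_csum_add(sum, (uint32_t)buf[i] << 8 | buf[i + 1]);
	if (len & 1)
		sum = model_csum_add(sum, (uint32_t)buf[len - 1] << 8);
	return sum;
}

/*
 * Returns 1 if seeding with 0xffffffff gives a non-zero sum that is
 * comparable mod 0xffff with the zero-seeded one.
 */
int seed_is_safe(const uint8_t *buf, size_t len)
{
	uint32_t from_zero = model_csum_partial(buf, len, 0);
	uint32_t from_ones = model_csum_partial(buf, len, 0xffffffffu);

	return from_ones != 0 && from_ones % 0xffff == from_zero % 0xffff;
}
```

In this model an all-zero buffer sums to 0 when seeded with 0 but stays at
0xffffffff when seeded with 0xffffffff, so the value 0 never shows up as a
legitimate result.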

That allows a simpler method of reporting an error - simply return
zero.  I.e.
__wsum csum_partial_copy_nocheck(src, dst, len)
	copy len bytes of data from src to dst, return something that is
comparable mod 65535 with checksum of that data.

__wsum csum_and_copy_from_user(src, dst, len)
	copy len bytes of data from src (in userspace) to dst.  In case
we can't copy the entire thing, return 0.  Otherwise return something non-zero
that is comparable mod 65535 with the checksum of that data.

__wsum csum_and_copy_to_user(src, dst, len)
	copy len bytes of data from src to dst (in userspace).  In case
we can't copy the entire thing, return 0.  Otherwise return something non-zero
that is comparable mod 65535 with the checksum of that data.
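
A caller under the proposed conventions might look roughly like this.  It is
a userspace model, not the lib/iov_iter.c code: model_csum_and_copy_from_user()
and copy_and_sum() are hypothetical, a NULL source stands in for a faulting
user access, and the checksum is a simplified big-endian word sum.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit add with end-around carry, as in the generic csum_add(). */
uint32_t model_csum_add(uint32_t csum, uint32_t addend)
{
	uint32_t res = csum + addend;

	return res + (res < addend);
}

/* Hypothetical new-style primitive: 0 means "could not copy everything". */
uint32_t model_csum_and_copy_from_user(const uint8_t *src, uint8_t *dst,
				       size_t len)
{
	uint32_t sum = 0xffffffffu;	/* non-zero seed, so the result is never 0 */
	size_t i;

	if (!src)			/* stand-in for a fault on the user access */
		return 0;
	memcpy(dst, src, len);
	for (i = 0; i + 1 < len; i += 2)
		sum = model_csum_add(sum, (uint32_t)dst[i] << 8 | dst[i + 1]);
	if (len & 1)
		sum = model_csum_add(sum, (uint32_t)dst[len - 1] << 8);
	return sum;
}

/* Caller side: no errp to thread through, just test the return value. */
int copy_and_sum(const uint8_t *src, uint8_t *dst, size_t len, uint32_t *sum)
{
	uint32_t next = model_csum_and_copy_from_user(src, dst, len);

	if (!next)
		return -EFAULT;		/* *sum is left untouched on failure */
	*sum = model_csum_add(*sum, next);
	return 0;
}
```

The point of the sketch is that the error path needs nothing beyond the
return value, which is what lets the exception handlers shrink.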

Exception handlers become trivial that way, of course, and quite a few gross
hacks in them go away.

The branch is based at 5.8-rc1 and can be found in
git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git work.csum_and_copy
Individual patches will go in followups.

First we get rid of passing the initial sum for csum_partial_copy_nocheck():
      skb_copy_and_csum_bits(): don't bother with the last argument
      icmp_push_reply(): reorder adding the checksum up
      csum_partial_copy_nocheck(): drop the last argument
Next comes the minimal conversion of csum_and_copy_..._user() to new calling
conventions.  Asm parts are left unchanged at that point, which makes for
a reasonably small patch.  We could split that into per-architecture parts,
but IMO it's better done that way - no need for temporary "has this architecture
already switched" config symbols, etc.
      csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
      saner calling conventions for csum_and_copy_..._user()
Finally, we propagate the change of calling conventions down into the asm
helpers.  That's done on a per-architecture basis (and only for the architectures
that are not using the default instances, of course).
      alpha: propagate the calling convention changes down to csum_partial_copy.c helpers
      arm: propagate the calling convention changes down to csum_partial_copy_from_user()
      m68k: get rid of zeroing destination on error in csum_and_copy_from_user()
      sh: propage the calling conventions change down to csum_partial_copy_generic()
      i386: propagate the calling conventions change down to csum_partial_copy_generic()
      sparc32: propagate the calling conventions change down to __csum_partial_copy_sparc_generic()
      mips: csum_and_copy_{to,from}_user() are never called under KERNEL_DS
      mips: __csum_partial_copy_kernel() has no users left
      mips: propagate the calling convention change down into __csum_partial_copy_..._user()
      xtensa: propagate the calling conventions change down into csum_partial_copy_generic()
      sparc64: propagate the calling convention changes down to __csum_partial_copy_...()
      amd64: switch csum_partial_copy_generic() to new calling conventions
      ppc: propagate the calling conventions change down to csum_partial_copy_generic()

Only lightly tested; a lot of asm shite is killed off, so I would really like the
architecture maintainers to look through that thing.  Please, review.

Diffstat:
 arch/alpha/include/asm/checksum.h         |   4 +-
 arch/alpha/lib/csum_partial_copy.c        | 164 ++++++++-----------
 arch/arm/include/asm/checksum.h           |  16 +-
 arch/arm/lib/csumpartialcopy.S            |   4 +-
 arch/arm/lib/csumpartialcopygeneric.S     |   1 +
 arch/arm/lib/csumpartialcopyuser.S        |  26 +--
 arch/hexagon/include/asm/checksum.h       |   3 +-
 arch/hexagon/lib/checksum.c               |   4 +-
 arch/ia64/include/asm/checksum.h          |   3 +-
 arch/ia64/lib/csum_partial_copy.c         |   4 +-
 arch/m68k/include/asm/checksum.h          |   6 +-
 arch/m68k/lib/checksum.c                  |  88 +++-------
 arch/mips/include/asm/checksum.h          |  66 ++------
 arch/mips/lib/csum_partial.S              | 261 ++++++++++--------------------
 arch/nios2/include/asm/checksum.h         |   4 +-
 arch/parisc/include/asm/checksum.h        |  22 +--
 arch/parisc/lib/checksum.c                |   7 +-
 arch/powerpc/include/asm/checksum.h       |  12 +-
 arch/powerpc/lib/checksum_32.S            |  74 ++++-----
 arch/powerpc/lib/checksum_64.S            |  37 ++---
 arch/powerpc/lib/checksum_wrappers.c      |  74 ++-------
 arch/s390/include/asm/checksum.h          |   4 +-
 arch/sh/include/asm/checksum_32.h         |  35 ++--
 arch/sh/lib/checksum.S                    | 119 ++++----------
 arch/sparc/include/asm/checksum.h         |   1 +
 arch/sparc/include/asm/checksum_32.h      |  70 ++------
 arch/sparc/include/asm/checksum_64.h      |  39 +----
 arch/sparc/lib/checksum_32.S              | 202 +++++------------------
 arch/sparc/lib/csum_copy.S                |   3 +-
 arch/sparc/lib/csum_copy_from_user.S      |   4 +-
 arch/sparc/lib/csum_copy_to_user.S        |   4 +-
 arch/sparc/mm/fault_32.c                  |   6 +-
 arch/x86/include/asm/checksum_32.h        |  40 ++---
 arch/x86/include/asm/checksum_64.h        |  14 +-
 arch/x86/lib/checksum_32.S                | 117 +++++---------
 arch/x86/lib/csum-copy_64.S               | 140 +++++++++-------
 arch/x86/lib/csum-wrappers_64.c           |  86 ++--------
 arch/x86/um/asm/checksum.h                |   5 +-
 arch/x86/um/asm/checksum_32.h             |  23 ---
 arch/xtensa/include/asm/checksum.h        |  33 ++--
 arch/xtensa/lib/checksum.S                |  67 ++------
 drivers/net/ethernet/3com/typhoon.c       |   3 +-
 drivers/net/ethernet/sun/sunvnet_common.c |   2 +-
 include/asm-generic/checksum.h            |   4 +-
 include/linux/skbuff.h                    |   2 +-
 include/net/checksum.h                    |  15 +-
 lib/iov_iter.c                            |  21 ++-
 net/core/skbuff.c                         |  13 +-
 net/ipv4/icmp.c                           |  10 +-
 net/ipv4/ip_output.c                      |   6 +-
 net/ipv4/raw.c                            |   2 +-
 net/ipv6/icmp.c                           |   4 +-
 net/ipv6/ip6_output.c                     |   2 +-
 net/ipv6/raw.c                            |   2 +-
 net/sunrpc/socklib.c                      |   2 +-
 55 files changed, 609 insertions(+), 1371 deletions(-)


* [PATCH 01/18] skb_copy_and_csum_bits(): don't bother with the last argument
  2020-07-21 20:24 [RFC][CFT][PATCHSET] saner calling conventions for csum-and-copy primitives Al Viro
@ 2020-07-21 20:25 ` Al Viro
  2020-07-21 20:25   ` [PATCH 02/18] icmp_push_reply(): reorder adding the checksum up Al Viro
                     ` (16 more replies)
  2020-07-24  1:25 ` [RFC][CFT][PATCHSET v2] saner calling conventions for csum-and-copy primitives Al Viro
  1 sibling, 17 replies; 102+ messages in thread
From: Al Viro @ 2020-07-21 20:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

it's always 0

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 drivers/net/ethernet/sun/sunvnet_common.c |  2 +-
 include/linux/skbuff.h                    |  2 +-
 net/core/skbuff.c                         | 11 ++++++-----
 net/ipv4/icmp.c                           |  2 +-
 net/ipv4/ip_output.c                      |  4 ++--
 net/ipv6/icmp.c                           |  4 ++--
 net/ipv6/ip6_output.c                     |  2 +-
 net/sunrpc/socklib.c                      |  2 +-
 8 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/sun/sunvnet_common.c b/drivers/net/ethernet/sun/sunvnet_common.c
index 8dc6c9ff22e1..80fde5f06fce 100644
--- a/drivers/net/ethernet/sun/sunvnet_common.c
+++ b/drivers/net/ethernet/sun/sunvnet_common.c
@@ -1168,7 +1168,7 @@ static inline struct sk_buff *vnet_skb_shape(struct sk_buff *skb, int ncookies)
 			*(__sum16 *)(skb->data + offset) = 0;
 			csum = skb_copy_and_csum_bits(skb, start,
 						      nskb->data + start,
-						      skb->len - start, 0);
+						      skb->len - start);
 
 			/* add in the header checksums */
 			if (skb->protocol == htons(ETH_P_IP)) {
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 0c0377fc00c2..1dcd255c9a03 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3529,7 +3529,7 @@ int skb_kill_datagram(struct sock *sk, struct sk_buff *skb, unsigned int flags);
 int skb_copy_bits(const struct sk_buff *skb, int offset, void *to, int len);
 int skb_store_bits(struct sk_buff *skb, int offset, const void *from, int len);
 __wsum skb_copy_and_csum_bits(const struct sk_buff *skb, int offset, u8 *to,
-			      int len, __wsum csum);
+			      int len);
 int skb_splice_bits(struct sk_buff *skb, struct sock *sk, unsigned int offset,
 		    struct pipe_inode_info *pipe, unsigned int len,
 		    unsigned int flags);
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index b8afefe6f6b6..9c0918651445 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2723,19 +2723,20 @@ EXPORT_SYMBOL(skb_checksum);
 /* Both of above in one bottle. */
 
 __wsum skb_copy_and_csum_bits(const struct sk_buff *skb, int offset,
-				    u8 *to, int len, __wsum csum)
+				    u8 *to, int len)
 {
 	int start = skb_headlen(skb);
 	int i, copy = start - offset;
 	struct sk_buff *frag_iter;
 	int pos = 0;
+	__wsum csum = 0;
 
 	/* Copy header. */
 	if (copy > 0) {
 		if (copy > len)
 			copy = len;
 		csum = csum_partial_copy_nocheck(skb->data + offset, to,
-						 copy, csum);
+						 copy, 0);
 		if ((len -= copy) == 0)
 			return csum;
 		offset += copy;
@@ -2791,7 +2792,7 @@ __wsum skb_copy_and_csum_bits(const struct sk_buff *skb, int offset,
 				copy = len;
 			csum2 = skb_copy_and_csum_bits(frag_iter,
 						       offset - start,
-						       to, copy, 0);
+						       to, copy);
 			csum = csum_block_add(csum, csum2, pos);
 			if ((len -= copy) == 0)
 				return csum;
@@ -3011,7 +3012,7 @@ void skb_copy_and_csum_dev(const struct sk_buff *skb, u8 *to)
 	csum = 0;
 	if (csstart != skb->len)
 		csum = skb_copy_and_csum_bits(skb, csstart, to + csstart,
-					      skb->len - csstart, 0);
+					      skb->len - csstart);
 
 	if (skb->ip_summed == CHECKSUM_PARTIAL) {
 		long csstuff = csstart + skb->csum_offset;
@@ -3933,7 +3934,7 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
 					skb_copy_and_csum_bits(head_skb, offset,
 							       skb_put(nskb,
 								       len),
-							       len, 0);
+							       len);
 				SKB_GSO_CB(nskb)->csum_start =
 					skb_headroom(nskb) + doffset;
 			} else {
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index 956a806649f7..62d7a2bfc9a3 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -352,7 +352,7 @@ static int icmp_glue_bits(void *from, char *to, int offset, int len, int odd,
 
 	csum = skb_copy_and_csum_bits(icmp_param->skb,
 				      icmp_param->offset + offset,
-				      to, len, 0);
+				      to, len);
 
 	skb->csum = csum_block_add(skb->csum, csum, odd);
 	if (icmp_pointers[icmp_param->data.icmph.type].error)
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 090d3097ee15..7fd164754519 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1120,7 +1120,7 @@ static int __ip_append_data(struct sock *sk,
 			if (fraggap) {
 				skb->csum = skb_copy_and_csum_bits(
 					skb_prev, maxfraglen,
-					data + transhdrlen, fraggap, 0);
+					data + transhdrlen, fraggap);
 				skb_prev->csum = csum_sub(skb_prev->csum,
 							  skb->csum);
 				data += fraggap;
@@ -1405,7 +1405,7 @@ ssize_t	ip_append_page(struct sock *sk, struct flowi4 *fl4, struct page *page,
 				skb->csum = skb_copy_and_csum_bits(skb_prev,
 								   maxfraglen,
 						    skb_transport_header(skb),
-								   fraggap, 0);
+								   fraggap);
 				skb_prev->csum = csum_sub(skb_prev->csum,
 							  skb->csum);
 				pskb_trim_unique(skb_prev, maxfraglen);
diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c
index fc5000370030..2ae42b4e0c1a 100644
--- a/net/ipv6/icmp.c
+++ b/net/ipv6/icmp.c
@@ -314,10 +314,10 @@ static int icmpv6_getfrag(void *from, char *to, int offset, int len, int odd, st
 {
 	struct icmpv6_msg *msg = (struct icmpv6_msg *) from;
 	struct sk_buff *org_skb = msg->skb;
-	__wsum csum = 0;
+	__wsum csum;
 
 	csum = skb_copy_and_csum_bits(org_skb, msg->offset + offset,
-				      to, len, csum);
+				      to, len);
 	skb->csum = csum_block_add(skb->csum, csum, odd);
 	if (!(msg->type & ICMPV6_INFOMSG_MASK))
 		nf_ct_attach(skb, org_skb);
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 8a8c2d0cfcc8..bf9367c2504b 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1613,7 +1613,7 @@ static int __ip6_append_data(struct sock *sk,
 			if (fraggap) {
 				skb->csum = skb_copy_and_csum_bits(
 					skb_prev, maxfraglen,
-					data + transhdrlen, fraggap, 0);
+					data + transhdrlen, fraggap);
 				skb_prev->csum = csum_sub(skb_prev->csum,
 							  skb->csum);
 				data += fraggap;
diff --git a/net/sunrpc/socklib.c b/net/sunrpc/socklib.c
index 3fc8af8bb961..d52313af82bc 100644
--- a/net/sunrpc/socklib.c
+++ b/net/sunrpc/socklib.c
@@ -70,7 +70,7 @@ static size_t xdr_skb_read_and_csum_bits(struct xdr_skb_reader *desc, void *to,
 	if (len > desc->count)
 		len = desc->count;
 	pos = desc->offset;
-	csum2 = skb_copy_and_csum_bits(desc->skb, pos, to, len, 0);
+	csum2 = skb_copy_and_csum_bits(desc->skb, pos, to, len);
 	desc->csum = csum_block_add(desc->csum, csum2, pos);
 	desc->count -= len;
 	desc->offset += len;
-- 
2.11.0


* [PATCH 02/18] icmp_push_reply(): reorder adding the checksum up
  2020-07-21 20:25 ` [PATCH 01/18] skb_copy_and_csum_bits(): don't bother with the last argument Al Viro
@ 2020-07-21 20:25   ` Al Viro
  2020-07-21 20:25   ` [PATCH 03/18] csum_partial_copy_nocheck(): drop the last argument Al Viro
                     ` (15 subsequent siblings)
  16 siblings, 0 replies; 102+ messages in thread
From: Al Viro @ 2020-07-21 20:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

do csum_partial_copy_nocheck() on the first fragment, then
add the rest to it.  Equivalent transformation.

That was the only caller of csum_partial_copy_nocheck() that
might pass it non-zero as the last argument.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/ipv4/icmp.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index 62d7a2bfc9a3..f93317157549 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -376,15 +376,15 @@ static void icmp_push_reply(struct icmp_bxm *icmp_param,
 		ip_flush_pending_frames(sk);
 	} else if ((skb = skb_peek(&sk->sk_write_queue)) != NULL) {
 		struct icmphdr *icmph = icmp_hdr(skb);
-		__wsum csum = 0;
+		__wsum csum;
 		struct sk_buff *skb1;
 
+		csum = csum_partial_copy_nocheck((void *)&icmp_param->data,
+						 (char *)icmph,
+						 icmp_param->head_len, 0);
 		skb_queue_walk(&sk->sk_write_queue, skb1) {
 			csum = csum_add(csum, skb1->csum);
 		}
-		csum = csum_partial_copy_nocheck((void *)&icmp_param->data,
-						 (char *)icmph,
-						 icmp_param->head_len, csum);
 		icmph->checksum = csum_fold(csum);
 		skb->ip_summed = CHECKSUM_NONE;
 		ip_push_pending_frames(sk, fl4);
-- 
2.11.0


* [PATCH 03/18] csum_partial_copy_nocheck(): drop the last argument
  2020-07-21 20:25 ` [PATCH 01/18] skb_copy_and_csum_bits(): don't bother with the last argument Al Viro
  2020-07-21 20:25   ` [PATCH 02/18] icmp_push_reply(): reorder adding the checksum up Al Viro
@ 2020-07-21 20:25   ` Al Viro
  2020-07-21 20:25     ` Al Viro
  2020-07-21 20:25   ` [PATCH 04/18] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum Al Viro
                     ` (14 subsequent siblings)
  16 siblings, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-21 20:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

It's always 0.  Note that we could use ~0U as well - result
will be the same modulo 0xffff; later we'll make use of that
whenever convenient.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/alpha/include/asm/checksum.h    | 2 +-
 arch/alpha/lib/csum_partial_copy.c   | 4 ++--
 arch/arm/include/asm/checksum.h      | 2 +-
 arch/arm/lib/csumpartialcopy.S       | 5 +++--
 arch/hexagon/include/asm/checksum.h  | 3 +--
 arch/hexagon/lib/checksum.c          | 4 ++--
 arch/ia64/include/asm/checksum.h     | 3 +--
 arch/ia64/lib/csum_partial_copy.c    | 4 ++--
 arch/m68k/include/asm/checksum.h     | 3 +--
 arch/m68k/lib/checksum.c             | 3 ++-
 arch/mips/include/asm/checksum.h     | 7 +++++--
 arch/mips/lib/csum_partial.S         | 4 ++--
 arch/nios2/include/asm/checksum.h    | 4 ++--
 arch/parisc/include/asm/checksum.h   | 2 +-
 arch/parisc/lib/checksum.c           | 7 ++-----
 arch/powerpc/include/asm/checksum.h  | 4 ++--
 arch/s390/include/asm/checksum.h     | 4 ++--
 arch/sh/include/asm/checksum_32.h    | 5 ++---
 arch/sparc/include/asm/checksum_32.h | 4 ++--
 arch/sparc/include/asm/checksum_64.h | 8 ++++++--
 arch/sparc/lib/csum_copy.S           | 2 +-
 arch/x86/include/asm/checksum_32.h   | 5 ++---
 arch/x86/include/asm/checksum_64.h   | 3 +--
 arch/x86/lib/csum-wrappers_64.c      | 4 ++--
 arch/x86/um/asm/checksum.h           | 5 ++---
 arch/xtensa/include/asm/checksum.h   | 5 ++---
 drivers/net/ethernet/3com/typhoon.c  | 3 +--
 include/asm-generic/checksum.h       | 4 ++--
 lib/iov_iter.c                       | 2 +-
 net/core/skbuff.c                    | 4 ++--
 net/ipv4/icmp.c                      | 2 +-
 net/ipv4/ip_output.c                 | 2 +-
 net/ipv4/raw.c                       | 2 +-
 net/ipv6/raw.c                       | 2 +-
 34 files changed, 62 insertions(+), 65 deletions(-)

diff --git a/arch/alpha/include/asm/checksum.h b/arch/alpha/include/asm/checksum.h
index 0eac81624d01..fdb301fd819b 100644
--- a/arch/alpha/include/asm/checksum.h
+++ b/arch/alpha/include/asm/checksum.h
@@ -44,7 +44,7 @@ extern __wsum csum_partial(const void *buff, int len, __wsum sum);
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 __wsum csum_and_copy_from_user(const void __user *src, void *dst, int len, __wsum sum, int *errp);
 
-__wsum csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum);
+__wsum csum_partial_copy_nocheck(const void *src, void *dst, int len);
 
 
 /*
diff --git a/arch/alpha/lib/csum_partial_copy.c b/arch/alpha/lib/csum_partial_copy.c
index af1dad74e933..f363dc89fcbe 100644
--- a/arch/alpha/lib/csum_partial_copy.c
+++ b/arch/alpha/lib/csum_partial_copy.c
@@ -372,13 +372,13 @@ csum_and_copy_from_user(const void __user *src, void *dst, int len,
 EXPORT_SYMBOL(csum_and_copy_from_user);
 
 __wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
+csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
 	__wsum checksum;
 	mm_segment_t oldfs = get_fs();
 	set_fs(KERNEL_DS);
 	checksum = csum_and_copy_from_user((__force const void __user *)src,
-						dst, len, sum, NULL);
+						dst, len, 0, NULL);
 	set_fs(oldfs);
 	return checksum;
 }
diff --git a/arch/arm/include/asm/checksum.h b/arch/arm/include/asm/checksum.h
index ed6073fee338..1156b9a9a43b 100644
--- a/arch/arm/include/asm/checksum.h
+++ b/arch/arm/include/asm/checksum.h
@@ -35,7 +35,7 @@ __wsum csum_partial(const void *buff, int len, __wsum sum);
  */
 
 __wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum);
+csum_partial_copy_nocheck(const void *src, void *dst, int len);
 
 __wsum
 csum_partial_copy_from_user(const void __user *src, void *dst, int len, __wsum sum, int *err_ptr);
diff --git a/arch/arm/lib/csumpartialcopy.S b/arch/arm/lib/csumpartialcopy.S
index 184d97254a7a..aab914fbc86b 100644
--- a/arch/arm/lib/csumpartialcopy.S
+++ b/arch/arm/lib/csumpartialcopy.S
@@ -9,13 +9,14 @@
 
 		.text
 
-/* Function: __u32 csum_partial_copy_nocheck(const char *src, char *dst, int len, __u32 sum)
- * Params  : r0 = src, r1 = dst, r2 = len, r3 = checksum
+/* Function: __u32 csum_partial_copy_nocheck(const char *src, char *dst, int len)
+ * Params  : r0 = src, r1 = dst, r2 = len
  * Returns : r0 = new checksum
  */
 
 		.macro	save_regs
 		stmfd	sp!, {r1, r4 - r8, lr}
+		mov	r3, #0
 		.endm
 
 		.macro	load_regs
diff --git a/arch/hexagon/include/asm/checksum.h b/arch/hexagon/include/asm/checksum.h
index a5c42f4614c1..282e82010b9a 100644
--- a/arch/hexagon/include/asm/checksum.h
+++ b/arch/hexagon/include/asm/checksum.h
@@ -17,8 +17,7 @@ unsigned int do_csum(const void *voidptr, int len);
  * better 64-bit) boundary
  */
 #define csum_partial_copy_nocheck csum_partial_copy_nocheck
-__wsum csum_partial_copy_nocheck(const void *src, void *dst,
-					int len, __wsum sum);
+__wsum csum_partial_copy_nocheck(const void *src, void *dst, int len);
 
 /*
  * computes the checksum of the TCP/UDP pseudo-header
diff --git a/arch/hexagon/lib/checksum.c b/arch/hexagon/lib/checksum.c
index c4a6b72d97de..4d2628bbc987 100644
--- a/arch/hexagon/lib/checksum.c
+++ b/arch/hexagon/lib/checksum.c
@@ -181,9 +181,9 @@ unsigned int do_csum(const void *voidptr, int len)
  * copy from ds while checksumming, otherwise like csum_partial
  */
 __wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
+csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
 	memcpy(dst, src, len);
-	return csum_partial(dst, len, sum);
+	return csum_partial(dst, len, 0);
 }
 EXPORT_SYMBOL(csum_partial_copy_nocheck);
diff --git a/arch/ia64/include/asm/checksum.h b/arch/ia64/include/asm/checksum.h
index 2a1c64629cdc..7e526bde617a 100644
--- a/arch/ia64/include/asm/checksum.h
+++ b/arch/ia64/include/asm/checksum.h
@@ -37,8 +37,7 @@ extern __wsum csum_tcpudp_nofold(__be32 saddr, __be32 daddr,
  */
 extern __wsum csum_partial(const void *buff, int len, __wsum sum);
 
-extern __wsum csum_partial_copy_nocheck(const void *src, void *dst,
-					       int len, __wsum sum);
+extern __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len);
 
 /*
  * This routine is used for miscellaneous IP-like checksums, mainly in
diff --git a/arch/ia64/lib/csum_partial_copy.c b/arch/ia64/lib/csum_partial_copy.c
index 6e82e0be8040..87c39ef108e1 100644
--- a/arch/ia64/lib/csum_partial_copy.c
+++ b/arch/ia64/lib/csum_partial_copy.c
@@ -104,10 +104,10 @@ unsigned long do_csum_c(const unsigned char * buff, int len, unsigned int psum)
  * But it's very tricky to get right even in C.
  */
 __wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
+csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
 	memcpy(dst, src, len);
-	return csum_partial(dst, len, sum);
+	return csum_partial(dst, len, 0);
 }
 
 EXPORT_SYMBOL(csum_partial_copy_nocheck);
diff --git a/arch/m68k/include/asm/checksum.h b/arch/m68k/include/asm/checksum.h
index 3f2c15d6f18c..77c61473ee0f 100644
--- a/arch/m68k/include/asm/checksum.h
+++ b/arch/m68k/include/asm/checksum.h
@@ -37,8 +37,7 @@ extern __wsum csum_and_copy_from_user(const void __user *src,
 						int *csum_err);
 
 extern __wsum csum_partial_copy_nocheck(const void *src,
-					      void *dst, int len,
-					      __wsum sum);
+					      void *dst, int len);
 
 /*
  *	This is a version of ip_fast_csum() optimized for IP headers,
diff --git a/arch/m68k/lib/checksum.c b/arch/m68k/lib/checksum.c
index 31797be9a3dc..86ddd2ee187d 100644
--- a/arch/m68k/lib/checksum.c
+++ b/arch/m68k/lib/checksum.c
@@ -324,9 +324,10 @@ EXPORT_SYMBOL(csum_and_copy_from_user);
  */
 
 __wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
+csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
 	unsigned long tmp1, tmp2;
+	__wsum sum = 0;
 	__asm__("movel %2,%4\n\t"
 		"btst #1,%4\n\t"	/* Check alignment */
 		"jeq 2f\n\t"
diff --git a/arch/mips/include/asm/checksum.h b/arch/mips/include/asm/checksum.h
index dcebaaf8c862..1dcdd7755793 100644
--- a/arch/mips/include/asm/checksum.h
+++ b/arch/mips/include/asm/checksum.h
@@ -101,8 +101,11 @@ __wsum csum_and_copy_to_user(const void *src, void __user *dst, int len,
  * the same as csum_partial, but copies from user space (but on MIPS
  * we have just one address space, so this is identical to the above)
  */
-__wsum csum_partial_copy_nocheck(const void *src, void *dst,
-				       int len, __wsum sum);
+__wsum __csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum);
+static inline __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
+{
+	return __csum_partial_copy_nocheck(src, dst, len, 0);
+}
 #define csum_partial_copy_nocheck csum_partial_copy_nocheck
 
 /*
diff --git a/arch/mips/lib/csum_partial.S b/arch/mips/lib/csum_partial.S
index 87fda0713b84..8d70855b0914 100644
--- a/arch/mips/lib/csum_partial.S
+++ b/arch/mips/lib/csum_partial.S
@@ -462,8 +462,8 @@ EXPORT_SYMBOL(csum_partial)
 	lw	errptr, 16(sp)
 #endif
 	.if \__nocheck == 1
-	FEXPORT(csum_partial_copy_nocheck)
-	EXPORT_SYMBOL(csum_partial_copy_nocheck)
+	FEXPORT(__csum_partial_copy_nocheck)
+	EXPORT_SYMBOL(__csum_partial_copy_nocheck)
 	.endif
 	move	sum, zero
 	move	odd, zero
diff --git a/arch/nios2/include/asm/checksum.h b/arch/nios2/include/asm/checksum.h
index ec39698d3bea..d3641a70844b 100644
--- a/arch/nios2/include/asm/checksum.h
+++ b/arch/nios2/include/asm/checksum.h
@@ -14,8 +14,8 @@
 extern __wsum csum_partial(const void *buff, int len, __wsum sum);
 extern __wsum csum_partial_copy(const void *src, void *dst, int len,
 				__wsum sum);
-#define csum_partial_copy_nocheck(src, dst, len, sum)	\
-	csum_partial_copy((src), (dst), (len), (sum))
+#define csum_partial_copy_nocheck(src, dst, len)	\
+	csum_partial_copy((src), (dst), (len), 0)
 
 extern __sum16 ip_fast_csum(const void *iph, unsigned int ihl);
 extern __sum16 ip_compute_csum(const void *buff, int len);
diff --git a/arch/parisc/include/asm/checksum.h b/arch/parisc/include/asm/checksum.h
index fe8c63b2d2c3..b412e3a1bd14 100644
--- a/arch/parisc/include/asm/checksum.h
+++ b/arch/parisc/include/asm/checksum.h
@@ -24,7 +24,7 @@ extern __wsum csum_partial(const void *, int, __wsum);
  * Here even more important to align src and dst on a 32-bit (or even
  * better 64-bit) boundary
  */
-extern __wsum csum_partial_copy_nocheck(const void *, void *, int, __wsum);
+extern __wsum csum_partial_copy_nocheck(const void *, void *, int);
 
 /*
  *	Optimized for IP headers, which always checksum on 4 octet boundaries.
diff --git a/arch/parisc/lib/checksum.c b/arch/parisc/lib/checksum.c
index c6f161583549..de1c6942b493 100644
--- a/arch/parisc/lib/checksum.c
+++ b/arch/parisc/lib/checksum.c
@@ -110,16 +110,13 @@ EXPORT_SYMBOL(csum_partial);
 /*
  * copy while checksumming, otherwise like csum_partial
  */
-__wsum csum_partial_copy_nocheck(const void *src, void *dst,
-				       int len, __wsum sum)
+__wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
 	/*
 	 * It's 2:30 am and I don't feel like doing it real ...
 	 * This is lots slower than the real thing (tm)
 	 */
-	sum = csum_partial(src, len, sum);
 	memcpy(dst, src, len);
-
-	return sum;
+	return csum_partial(src, len, 0);
 }
 EXPORT_SYMBOL(csum_partial_copy_nocheck);
diff --git a/arch/powerpc/include/asm/checksum.h b/arch/powerpc/include/asm/checksum.h
index 9cce06194dcc..40540b7242a3 100644
--- a/arch/powerpc/include/asm/checksum.h
+++ b/arch/powerpc/include/asm/checksum.h
@@ -29,8 +29,8 @@ extern __wsum csum_and_copy_from_user(const void __user *src, void *dst,
 extern __wsum csum_and_copy_to_user(const void *src, void __user *dst,
 				    int len, __wsum sum, int *err_ptr);
 
-#define csum_partial_copy_nocheck(src, dst, len, sum)   \
-        csum_partial_copy_generic((src), (dst), (len), (sum), NULL, NULL)
+#define csum_partial_copy_nocheck(src, dst, len)   \
+        csum_partial_copy_generic((src), (dst), (len), 0, NULL, NULL)
 
 
 /*
diff --git a/arch/s390/include/asm/checksum.h b/arch/s390/include/asm/checksum.h
index 6d01c96aeb5c..b3a6ae3af9e9 100644
--- a/arch/s390/include/asm/checksum.h
+++ b/arch/s390/include/asm/checksum.h
@@ -40,10 +40,10 @@ csum_partial(const void *buff, int len, __wsum sum)
 }
 
 static inline __wsum
-csum_partial_copy_nocheck (const void *src, void *dst, int len, __wsum sum)
+csum_partial_copy_nocheck (const void *src, void *dst, int len)
 {
         memcpy(dst,src,len);
-	return csum_partial(dst, len, sum);
+	return csum_partial(dst, len, 0);
 }
 
 /*
diff --git a/arch/sh/include/asm/checksum_32.h b/arch/sh/include/asm/checksum_32.h
index 91571a42e44e..682f88ebb7de 100644
--- a/arch/sh/include/asm/checksum_32.h
+++ b/arch/sh/include/asm/checksum_32.h
@@ -42,10 +42,9 @@ asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst,
  *	access_ok().
  */
 static inline
-__wsum csum_partial_copy_nocheck(const void *src, void *dst,
-				 int len, __wsum sum)
+__wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	return csum_partial_copy_generic(src, dst, len, sum, NULL, NULL);
+	return csum_partial_copy_generic(src, dst, len, 0, NULL, NULL);
 }
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
diff --git a/arch/sparc/include/asm/checksum_32.h b/arch/sparc/include/asm/checksum_32.h
index 479a0b812af5..d21d114436ba 100644
--- a/arch/sparc/include/asm/checksum_32.h
+++ b/arch/sparc/include/asm/checksum_32.h
@@ -42,7 +42,7 @@ __wsum csum_partial(const void *buff, int len, __wsum sum);
 unsigned int __csum_partial_copy_sparc_generic (const unsigned char *, unsigned char *);
 
 static inline __wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
+csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
 	register unsigned int ret asm("o0") = (unsigned int)src;
 	register char *d asm("o1") = dst;
@@ -52,7 +52,7 @@ csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
 		"call __csum_partial_copy_sparc_generic\n\t"
 		" mov %6, %%g7\n"
 	: "=&r" (ret), "=&r" (d), "=&r" (l)
-	: "0" (ret), "1" (d), "2" (l), "r" (sum)
+	: "0" (ret), "1" (d), "2" (l), "r" (0)
 	: "o2", "o3", "o4", "o5", "o7",
 	  "g2", "g3", "g4", "g5", "g7",
 	  "memory", "cc");
diff --git a/arch/sparc/include/asm/checksum_64.h b/arch/sparc/include/asm/checksum_64.h
index 0fa4433f5662..7aebdbe3ac96 100644
--- a/arch/sparc/include/asm/checksum_64.h
+++ b/arch/sparc/include/asm/checksum_64.h
@@ -38,8 +38,12 @@ __wsum csum_partial(const void * buff, int len, __wsum sum);
  * here even more important to align src and dst on a 32-bit (or even
  * better 64-bit) boundary
  */
-__wsum csum_partial_copy_nocheck(const void *src, void *dst,
-				 int len, __wsum sum);
+__wsum __csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum);
+
+static inline __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
+{
+	return __csum_partial_copy_nocheck(src, dst, len, 0);
+}
 
 long __csum_partial_copy_from_user(const void __user *src,
 				   void *dst, int len,
diff --git a/arch/sparc/lib/csum_copy.S b/arch/sparc/lib/csum_copy.S
index 26c644ba3ecb..72c900d21b12 100644
--- a/arch/sparc/lib/csum_copy.S
+++ b/arch/sparc/lib/csum_copy.S
@@ -33,7 +33,7 @@
 #endif
 
 #ifndef FUNC_NAME
-#define FUNC_NAME	csum_partial_copy_nocheck
+#define FUNC_NAME	__csum_partial_copy_nocheck
 #endif
 
 	.register	%g2, #scratch
diff --git a/arch/x86/include/asm/checksum_32.h b/arch/x86/include/asm/checksum_32.h
index 11624c8a9d8d..137a3033edcc 100644
--- a/arch/x86/include/asm/checksum_32.h
+++ b/arch/x86/include/asm/checksum_32.h
@@ -38,10 +38,9 @@ asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst,
  *	If you use these functions directly please don't forget the
  *	access_ok().
  */
-static inline __wsum csum_partial_copy_nocheck(const void *src, void *dst,
-					       int len, __wsum sum)
+static inline __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	return csum_partial_copy_generic(src, dst, len, sum, NULL, NULL);
+	return csum_partial_copy_generic(src, dst, len, 0, NULL, NULL);
 }
 
 static inline __wsum csum_and_copy_from_user(const void __user *src,
diff --git a/arch/x86/include/asm/checksum_64.h b/arch/x86/include/asm/checksum_64.h
index 0a289b87e872..5339f5dfc776 100644
--- a/arch/x86/include/asm/checksum_64.h
+++ b/arch/x86/include/asm/checksum_64.h
@@ -139,8 +139,7 @@ extern __wsum csum_and_copy_from_user(const void __user *src, void *dst,
 					  int len, __wsum isum, int *errp);
 extern __wsum csum_and_copy_to_user(const void *src, void __user *dst,
 					int len, __wsum isum, int *errp);
-extern __wsum csum_partial_copy_nocheck(const void *src, void *dst,
-					int len, __wsum sum);
+extern __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len);
 
 /**
  * ip_compute_csum - Compute an 16bit IP checksum.
diff --git a/arch/x86/lib/csum-wrappers_64.c b/arch/x86/lib/csum-wrappers_64.c
index ee63d7576fd2..245f929a1c2c 100644
--- a/arch/x86/lib/csum-wrappers_64.c
+++ b/arch/x86/lib/csum-wrappers_64.c
@@ -129,9 +129,9 @@ EXPORT_SYMBOL(csum_and_copy_to_user);
  * Returns an 32bit unfolded checksum of the buffer.
  */
 __wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
+csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	return csum_partial_copy_generic(src, dst, len, sum, NULL, NULL);
+	return csum_partial_copy_generic(src, dst, len, 0, NULL, NULL);
 }
 EXPORT_SYMBOL(csum_partial_copy_nocheck);
 
diff --git a/arch/x86/um/asm/checksum.h b/arch/x86/um/asm/checksum.h
index ff6bba2c8ab6..452e4442ec3f 100644
--- a/arch/x86/um/asm/checksum.h
+++ b/arch/x86/um/asm/checksum.h
@@ -29,11 +29,10 @@ extern __wsum csum_partial(const void *buff, int len, __wsum sum);
  */
 
 static __inline__
-__wsum csum_partial_copy_nocheck(const void *src, void *dst,
-				       int len, __wsum sum)
+__wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
 	memcpy(dst, src, len);
-	return csum_partial(dst, len, sum);
+	return csum_partial(dst, len, 0);
 }
 
 /**
diff --git a/arch/xtensa/include/asm/checksum.h b/arch/xtensa/include/asm/checksum.h
index d8292cc9ebdf..84e6a36fee6d 100644
--- a/arch/xtensa/include/asm/checksum.h
+++ b/arch/xtensa/include/asm/checksum.h
@@ -46,10 +46,9 @@ asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst,
  *	passed in an incorrect kernel address to one of these functions.
  */
 static inline
-__wsum csum_partial_copy_nocheck(const void *src, void *dst,
-					int len, __wsum sum)
+__wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	return csum_partial_copy_generic(src, dst, len, sum, NULL, NULL);
+	return csum_partial_copy_generic(src, dst, len, 0, NULL, NULL);
 }
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
diff --git a/drivers/net/ethernet/3com/typhoon.c b/drivers/net/ethernet/3com/typhoon.c
index 5ed33c2c4742..00c2e7143555 100644
--- a/drivers/net/ethernet/3com/typhoon.c
+++ b/drivers/net/ethernet/3com/typhoon.c
@@ -1419,8 +1419,7 @@ typhoon_download_firmware(struct typhoon *tp)
 			 * the checksum, we can do this once, at the end.
 			 */
 			csum = csum_fold(csum_partial_copy_nocheck(image_data,
-								   dpage, len,
-								   0));
+								   dpage, len));
 
 			iowrite32(len, ioaddr + TYPHOON_REG_BOOT_LENGTH);
 			iowrite32(le16_to_cpu((__force __le16)csum),
diff --git a/include/asm-generic/checksum.h b/include/asm-generic/checksum.h
index 5a80f8e54300..85e5abc1c79c 100644
--- a/include/asm-generic/checksum.h
+++ b/include/asm-generic/checksum.h
@@ -26,8 +26,8 @@ extern __wsum csum_partial(const void *buff, int len, __wsum sum);
 extern __wsum csum_partial_copy(const void *src, void *dst, int len, __wsum sum);
 
 #ifndef csum_partial_copy_nocheck
-#define csum_partial_copy_nocheck(src, dst, len, sum)	\
-	csum_partial_copy((src), (dst), (len), (sum))
+#define csum_partial_copy_nocheck(src, dst, len)	\
+	csum_partial_copy((src), (dst), (len), 0)
 #endif
 
 #ifndef ip_fast_csum
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index bf538c2bec77..7405922caaec 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -580,7 +580,7 @@ static size_t copy_pipe_to_iter(const void *addr, size_t bytes,
 static __wsum csum_and_memcpy(void *to, const void *from, size_t len,
 			      __wsum sum, size_t off)
 {
-	__wsum next = csum_partial_copy_nocheck(from, to, len, 0);
+	__wsum next = csum_partial_copy_nocheck(from, to, len);
 	return csum_block_add(sum, next, off);
 }
 
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 9c0918651445..6d51fb4312cd 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2736,7 +2736,7 @@ __wsum skb_copy_and_csum_bits(const struct sk_buff *skb, int offset,
 		if (copy > len)
 			copy = len;
 		csum = csum_partial_copy_nocheck(skb->data + offset, to,
-						 copy, 0);
+						 copy);
 		if ((len -= copy) == 0)
 			return csum;
 		offset += copy;
@@ -2766,7 +2766,7 @@ __wsum skb_copy_and_csum_bits(const struct sk_buff *skb, int offset,
 				vaddr = kmap_atomic(p);
 				csum2 = csum_partial_copy_nocheck(vaddr + p_off,
 								  to + copied,
-								  p_len, 0);
+								  p_len);
 				kunmap_atomic(vaddr);
 				csum = csum_block_add(csum, csum2, pos);
 				pos += p_len;
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index f93317157549..47a46279ae4c 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -381,7 +381,7 @@ static void icmp_push_reply(struct icmp_bxm *icmp_param,
 
 		csum = csum_partial_copy_nocheck((void *)&icmp_param->data,
 						 (char *)icmph,
-						 icmp_param->head_len, 0);
+						 icmp_param->head_len);
 		skb_queue_walk(&sk->sk_write_queue, skb1) {
 			csum = csum_add(csum, skb1->csum);
 		}
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 7fd164754519..f835136b8727 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1642,7 +1642,7 @@ static int ip_reply_glue_bits(void *dptr, char *to, int offset,
 {
 	__wsum csum;
 
-	csum = csum_partial_copy_nocheck(dptr+offset, to, len, 0);
+	csum = csum_partial_copy_nocheck(dptr+offset, to, len);
 	skb->csum = csum_block_add(skb->csum, csum, odd);
 	return 0;
 }
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 47665919048f..112f983f85fa 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -478,7 +478,7 @@ static int raw_getfrag(void *from, char *to, int offset, int len, int odd,
 			skb->csum = csum_block_add(
 				skb->csum,
 				csum_partial_copy_nocheck(rfv->hdr.c + offset,
-							  to, copy, 0),
+							  to, copy),
 				odd);
 
 		odd = 0;
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 8ef5a7b30524..b1df7e5fb0a8 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -746,7 +746,7 @@ static int raw6_getfrag(void *from, char *to, int offset, int len, int odd,
 			skb->csum = csum_block_add(
 				skb->csum,
 				csum_partial_copy_nocheck(rfv->c + offset,
-							  to, copy, 0),
+							  to, copy),
 				odd);
 
 		odd = 0;
-- 
2.11.0


* [PATCH 03/18] csum_partial_copy_nocheck(): drop the last argument
  2020-07-21 20:25   ` [PATCH 03/18] csum_partial_copy_nocheck(): drop the last argument Al Viro
@ 2020-07-21 20:25     ` Al Viro
  0 siblings, 0 replies; 102+ messages in thread
From: Al Viro @ 2020-07-21 20:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

The last argument (the initial sum) is 0 at every call site, so drop it.
Note that we could use ~0U as well - the result will be the same modulo
0xffff; later we'll make use of that whenever convenient.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/alpha/include/asm/checksum.h    | 2 +-
 arch/alpha/lib/csum_partial_copy.c   | 4 ++--
 arch/arm/include/asm/checksum.h      | 2 +-
 arch/arm/lib/csumpartialcopy.S       | 5 +++--
 arch/hexagon/include/asm/checksum.h  | 3 +--
 arch/hexagon/lib/checksum.c          | 4 ++--
 arch/ia64/include/asm/checksum.h     | 3 +--
 arch/ia64/lib/csum_partial_copy.c    | 4 ++--
 arch/m68k/include/asm/checksum.h     | 3 +--
 arch/m68k/lib/checksum.c             | 3 ++-
 arch/mips/include/asm/checksum.h     | 7 +++++--
 arch/mips/lib/csum_partial.S         | 4 ++--
 arch/nios2/include/asm/checksum.h    | 4 ++--
 arch/parisc/include/asm/checksum.h   | 2 +-
 arch/parisc/lib/checksum.c           | 7 ++-----
 arch/powerpc/include/asm/checksum.h  | 4 ++--
 arch/s390/include/asm/checksum.h     | 4 ++--
 arch/sh/include/asm/checksum_32.h    | 5 ++---
 arch/sparc/include/asm/checksum_32.h | 4 ++--
 arch/sparc/include/asm/checksum_64.h | 8 ++++++--
 arch/sparc/lib/csum_copy.S           | 2 +-
 arch/x86/include/asm/checksum_32.h   | 5 ++---
 arch/x86/include/asm/checksum_64.h   | 3 +--
 arch/x86/lib/csum-wrappers_64.c      | 4 ++--
 arch/x86/um/asm/checksum.h           | 5 ++---
 arch/xtensa/include/asm/checksum.h   | 5 ++---
 drivers/net/ethernet/3com/typhoon.c  | 3 +--
 include/asm-generic/checksum.h       | 4 ++--
 lib/iov_iter.c                       | 2 +-
 net/core/skbuff.c                    | 4 ++--
 net/ipv4/icmp.c                      | 2 +-
 net/ipv4/ip_output.c                 | 2 +-
 net/ipv4/raw.c                       | 2 +-
 net/ipv6/raw.c                       | 2 +-
 34 files changed, 62 insertions(+), 65 deletions(-)

diff --git a/arch/alpha/include/asm/checksum.h b/arch/alpha/include/asm/checksum.h
index 0eac81624d01..fdb301fd819b 100644
--- a/arch/alpha/include/asm/checksum.h
+++ b/arch/alpha/include/asm/checksum.h
@@ -44,7 +44,7 @@ extern __wsum csum_partial(const void *buff, int len, __wsum sum);
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 __wsum csum_and_copy_from_user(const void __user *src, void *dst, int len, __wsum sum, int *errp);
 
-__wsum csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum);
+__wsum csum_partial_copy_nocheck(const void *src, void *dst, int len);
 
 
 /*
diff --git a/arch/alpha/lib/csum_partial_copy.c b/arch/alpha/lib/csum_partial_copy.c
index af1dad74e933..f363dc89fcbe 100644
--- a/arch/alpha/lib/csum_partial_copy.c
+++ b/arch/alpha/lib/csum_partial_copy.c
@@ -372,13 +372,13 @@ csum_and_copy_from_user(const void __user *src, void *dst, int len,
 EXPORT_SYMBOL(csum_and_copy_from_user);
 
 __wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
+csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
 	__wsum checksum;
 	mm_segment_t oldfs = get_fs();
 	set_fs(KERNEL_DS);
 	checksum = csum_and_copy_from_user((__force const void __user *)src,
-						dst, len, sum, NULL);
+						dst, len, 0, NULL);
 	set_fs(oldfs);
 	return checksum;
 }
diff --git a/arch/arm/include/asm/checksum.h b/arch/arm/include/asm/checksum.h
index ed6073fee338..1156b9a9a43b 100644
--- a/arch/arm/include/asm/checksum.h
+++ b/arch/arm/include/asm/checksum.h
@@ -35,7 +35,7 @@ __wsum csum_partial(const void *buff, int len, __wsum sum);
  */
 
 __wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum);
+csum_partial_copy_nocheck(const void *src, void *dst, int len);
 
 __wsum
 csum_partial_copy_from_user(const void __user *src, void *dst, int len, __wsum sum, int *err_ptr);
diff --git a/arch/arm/lib/csumpartialcopy.S b/arch/arm/lib/csumpartialcopy.S
index 184d97254a7a..aab914fbc86b 100644
--- a/arch/arm/lib/csumpartialcopy.S
+++ b/arch/arm/lib/csumpartialcopy.S
@@ -9,13 +9,14 @@
 
 		.text
 
-/* Function: __u32 csum_partial_copy_nocheck(const char *src, char *dst, int len, __u32 sum)
- * Params  : r0 = src, r1 = dst, r2 = len, r3 = checksum
+/* Function: __u32 csum_partial_copy_nocheck(const char *src, char *dst, int len)
+ * Params  : r0 = src, r1 = dst, r2 = len
  * Returns : r0 = new checksum
  */
 
 		.macro	save_regs
 		stmfd	sp!, {r1, r4 - r8, lr}
+		mov	r3, #0
 		.endm
 
 		.macro	load_regs
diff --git a/arch/hexagon/include/asm/checksum.h b/arch/hexagon/include/asm/checksum.h
index a5c42f4614c1..282e82010b9a 100644
--- a/arch/hexagon/include/asm/checksum.h
+++ b/arch/hexagon/include/asm/checksum.h
@@ -17,8 +17,7 @@ unsigned int do_csum(const void *voidptr, int len);
  * better 64-bit) boundary
  */
 #define csum_partial_copy_nocheck csum_partial_copy_nocheck
-__wsum csum_partial_copy_nocheck(const void *src, void *dst,
-					int len, __wsum sum);
+__wsum csum_partial_copy_nocheck(const void *src, void *dst, int len);
 
 /*
  * computes the checksum of the TCP/UDP pseudo-header
diff --git a/arch/hexagon/lib/checksum.c b/arch/hexagon/lib/checksum.c
index c4a6b72d97de..4d2628bbc987 100644
--- a/arch/hexagon/lib/checksum.c
+++ b/arch/hexagon/lib/checksum.c
@@ -181,9 +181,9 @@ unsigned int do_csum(const void *voidptr, int len)
  * copy from ds while checksumming, otherwise like csum_partial
  */
 __wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
+csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
 	memcpy(dst, src, len);
-	return csum_partial(dst, len, sum);
+	return csum_partial(dst, len, 0);
 }
 EXPORT_SYMBOL(csum_partial_copy_nocheck);
diff --git a/arch/ia64/include/asm/checksum.h b/arch/ia64/include/asm/checksum.h
index 2a1c64629cdc..7e526bde617a 100644
--- a/arch/ia64/include/asm/checksum.h
+++ b/arch/ia64/include/asm/checksum.h
@@ -37,8 +37,7 @@ extern __wsum csum_tcpudp_nofold(__be32 saddr, __be32 daddr,
  */
 extern __wsum csum_partial(const void *buff, int len, __wsum sum);
 
-extern __wsum csum_partial_copy_nocheck(const void *src, void *dst,
-					       int len, __wsum sum);
+extern __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len);
 
 /*
  * This routine is used for miscellaneous IP-like checksums, mainly in
diff --git a/arch/ia64/lib/csum_partial_copy.c b/arch/ia64/lib/csum_partial_copy.c
index 6e82e0be8040..87c39ef108e1 100644
--- a/arch/ia64/lib/csum_partial_copy.c
+++ b/arch/ia64/lib/csum_partial_copy.c
@@ -104,10 +104,10 @@ unsigned long do_csum_c(const unsigned char * buff, int len, unsigned int psum)
  * But it's very tricky to get right even in C.
  */
 __wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
+csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
 	memcpy(dst, src, len);
-	return csum_partial(dst, len, sum);
+	return csum_partial(dst, len, 0);
 }
 
 EXPORT_SYMBOL(csum_partial_copy_nocheck);
diff --git a/arch/m68k/include/asm/checksum.h b/arch/m68k/include/asm/checksum.h
index 3f2c15d6f18c..77c61473ee0f 100644
--- a/arch/m68k/include/asm/checksum.h
+++ b/arch/m68k/include/asm/checksum.h
@@ -37,8 +37,7 @@ extern __wsum csum_and_copy_from_user(const void __user *src,
 						int *csum_err);
 
 extern __wsum csum_partial_copy_nocheck(const void *src,
-					      void *dst, int len,
-					      __wsum sum);
+					      void *dst, int len);
 
 /*
  *	This is a version of ip_fast_csum() optimized for IP headers,
diff --git a/arch/m68k/lib/checksum.c b/arch/m68k/lib/checksum.c
index 31797be9a3dc..86ddd2ee187d 100644
--- a/arch/m68k/lib/checksum.c
+++ b/arch/m68k/lib/checksum.c
@@ -324,9 +324,10 @@ EXPORT_SYMBOL(csum_and_copy_from_user);
  */
 
 __wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
+csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
 	unsigned long tmp1, tmp2;
+	__wsum sum = 0;
 	__asm__("movel %2,%4\n\t"
 		"btst #1,%4\n\t"	/* Check alignment */
 		"jeq 2f\n\t"
diff --git a/arch/mips/include/asm/checksum.h b/arch/mips/include/asm/checksum.h
index dcebaaf8c862..1dcdd7755793 100644
--- a/arch/mips/include/asm/checksum.h
+++ b/arch/mips/include/asm/checksum.h
@@ -101,8 +101,11 @@ __wsum csum_and_copy_to_user(const void *src, void __user *dst, int len,
  * the same as csum_partial, but copies from user space (but on MIPS
  * we have just one address space, so this is identical to the above)
  */
-__wsum csum_partial_copy_nocheck(const void *src, void *dst,
-				       int len, __wsum sum);
+__wsum __csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum);
+static inline __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
+{
+	return __csum_partial_copy_nocheck(src, dst, len, 0);
+}
 #define csum_partial_copy_nocheck csum_partial_copy_nocheck
 
 /*
diff --git a/arch/mips/lib/csum_partial.S b/arch/mips/lib/csum_partial.S
index 87fda0713b84..8d70855b0914 100644
--- a/arch/mips/lib/csum_partial.S
+++ b/arch/mips/lib/csum_partial.S
@@ -462,8 +462,8 @@ EXPORT_SYMBOL(csum_partial)
 	lw	errptr, 16(sp)
 #endif
 	.if \__nocheck == 1
-	FEXPORT(csum_partial_copy_nocheck)
-	EXPORT_SYMBOL(csum_partial_copy_nocheck)
+	FEXPORT(__csum_partial_copy_nocheck)
+	EXPORT_SYMBOL(__csum_partial_copy_nocheck)
 	.endif
 	move	sum, zero
 	move	odd, zero
diff --git a/arch/nios2/include/asm/checksum.h b/arch/nios2/include/asm/checksum.h
index ec39698d3bea..d3641a70844b 100644
--- a/arch/nios2/include/asm/checksum.h
+++ b/arch/nios2/include/asm/checksum.h
@@ -14,8 +14,8 @@
 extern __wsum csum_partial(const void *buff, int len, __wsum sum);
 extern __wsum csum_partial_copy(const void *src, void *dst, int len,
 				__wsum sum);
-#define csum_partial_copy_nocheck(src, dst, len, sum)	\
-	csum_partial_copy((src), (dst), (len), (sum))
+#define csum_partial_copy_nocheck(src, dst, len)	\
+	csum_partial_copy((src), (dst), (len), 0)
 
 extern __sum16 ip_fast_csum(const void *iph, unsigned int ihl);
 extern __sum16 ip_compute_csum(const void *buff, int len);
diff --git a/arch/parisc/include/asm/checksum.h b/arch/parisc/include/asm/checksum.h
index fe8c63b2d2c3..b412e3a1bd14 100644
--- a/arch/parisc/include/asm/checksum.h
+++ b/arch/parisc/include/asm/checksum.h
@@ -24,7 +24,7 @@ extern __wsum csum_partial(const void *, int, __wsum);
  * Here even more important to align src and dst on a 32-bit (or even
  * better 64-bit) boundary
  */
-extern __wsum csum_partial_copy_nocheck(const void *, void *, int, __wsum);
+extern __wsum csum_partial_copy_nocheck(const void *, void *, int);
 
 /*
  *	Optimized for IP headers, which always checksum on 4 octet boundaries.
diff --git a/arch/parisc/lib/checksum.c b/arch/parisc/lib/checksum.c
index c6f161583549..de1c6942b493 100644
--- a/arch/parisc/lib/checksum.c
+++ b/arch/parisc/lib/checksum.c
@@ -110,16 +110,13 @@ EXPORT_SYMBOL(csum_partial);
 /*
  * copy while checksumming, otherwise like csum_partial
  */
-__wsum csum_partial_copy_nocheck(const void *src, void *dst,
-				       int len, __wsum sum)
+__wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
 	/*
 	 * It's 2:30 am and I don't feel like doing it real ...
 	 * This is lots slower than the real thing (tm)
 	 */
-	sum = csum_partial(src, len, sum);
 	memcpy(dst, src, len);
-
-	return sum;
+	return csum_partial(src, len, 0);
 }
 EXPORT_SYMBOL(csum_partial_copy_nocheck);
diff --git a/arch/powerpc/include/asm/checksum.h b/arch/powerpc/include/asm/checksum.h
index 9cce06194dcc..40540b7242a3 100644
--- a/arch/powerpc/include/asm/checksum.h
+++ b/arch/powerpc/include/asm/checksum.h
@@ -29,8 +29,8 @@ extern __wsum csum_and_copy_from_user(const void __user *src, void *dst,
 extern __wsum csum_and_copy_to_user(const void *src, void __user *dst,
 				    int len, __wsum sum, int *err_ptr);
 
-#define csum_partial_copy_nocheck(src, dst, len, sum)   \
-        csum_partial_copy_generic((src), (dst), (len), (sum), NULL, NULL)
+#define csum_partial_copy_nocheck(src, dst, len)   \
+        csum_partial_copy_generic((src), (dst), (len), 0, NULL, NULL)
 
 
 /*
diff --git a/arch/s390/include/asm/checksum.h b/arch/s390/include/asm/checksum.h
index 6d01c96aeb5c..b3a6ae3af9e9 100644
--- a/arch/s390/include/asm/checksum.h
+++ b/arch/s390/include/asm/checksum.h
@@ -40,10 +40,10 @@ csum_partial(const void *buff, int len, __wsum sum)
 }
 
 static inline __wsum
-csum_partial_copy_nocheck (const void *src, void *dst, int len, __wsum sum)
+csum_partial_copy_nocheck (const void *src, void *dst, int len)
 {
         memcpy(dst,src,len);
-	return csum_partial(dst, len, sum);
+	return csum_partial(dst, len, 0);
 }
 
 /*
diff --git a/arch/sh/include/asm/checksum_32.h b/arch/sh/include/asm/checksum_32.h
index 91571a42e44e..682f88ebb7de 100644
--- a/arch/sh/include/asm/checksum_32.h
+++ b/arch/sh/include/asm/checksum_32.h
@@ -42,10 +42,9 @@ asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst,
  *	access_ok().
  */
 static inline
-__wsum csum_partial_copy_nocheck(const void *src, void *dst,
-				 int len, __wsum sum)
+__wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	return csum_partial_copy_generic(src, dst, len, sum, NULL, NULL);
+	return csum_partial_copy_generic(src, dst, len, 0, NULL, NULL);
 }
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
diff --git a/arch/sparc/include/asm/checksum_32.h b/arch/sparc/include/asm/checksum_32.h
index 479a0b812af5..d21d114436ba 100644
--- a/arch/sparc/include/asm/checksum_32.h
+++ b/arch/sparc/include/asm/checksum_32.h
@@ -42,7 +42,7 @@ __wsum csum_partial(const void *buff, int len, __wsum sum);
 unsigned int __csum_partial_copy_sparc_generic (const unsigned char *, unsigned char *);
 
 static inline __wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
+csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
 	register unsigned int ret asm("o0") = (unsigned int)src;
 	register char *d asm("o1") = dst;
@@ -52,7 +52,7 @@ csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
 		"call __csum_partial_copy_sparc_generic\n\t"
 		" mov %6, %%g7\n"
 	: "=&r" (ret), "=&r" (d), "=&r" (l)
-	: "0" (ret), "1" (d), "2" (l), "r" (sum)
+	: "0" (ret), "1" (d), "2" (l), "r" (0)
 	: "o2", "o3", "o4", "o5", "o7",
 	  "g2", "g3", "g4", "g5", "g7",
 	  "memory", "cc");
diff --git a/arch/sparc/include/asm/checksum_64.h b/arch/sparc/include/asm/checksum_64.h
index 0fa4433f5662..7aebdbe3ac96 100644
--- a/arch/sparc/include/asm/checksum_64.h
+++ b/arch/sparc/include/asm/checksum_64.h
@@ -38,8 +38,12 @@ __wsum csum_partial(const void * buff, int len, __wsum sum);
  * here even more important to align src and dst on a 32-bit (or even
  * better 64-bit) boundary
  */
-__wsum csum_partial_copy_nocheck(const void *src, void *dst,
-				 int len, __wsum sum);
+__wsum __csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum);
+
+static inline __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
+{
+	return __csum_partial_copy_nocheck(src, dst, len, 0);
+}
 
 long __csum_partial_copy_from_user(const void __user *src,
 				   void *dst, int len,
diff --git a/arch/sparc/lib/csum_copy.S b/arch/sparc/lib/csum_copy.S
index 26c644ba3ecb..72c900d21b12 100644
--- a/arch/sparc/lib/csum_copy.S
+++ b/arch/sparc/lib/csum_copy.S
@@ -33,7 +33,7 @@
 #endif
 
 #ifndef FUNC_NAME
-#define FUNC_NAME	csum_partial_copy_nocheck
+#define FUNC_NAME	__csum_partial_copy_nocheck
 #endif
 
 	.register	%g2, #scratch
diff --git a/arch/x86/include/asm/checksum_32.h b/arch/x86/include/asm/checksum_32.h
index 11624c8a9d8d..137a3033edcc 100644
--- a/arch/x86/include/asm/checksum_32.h
+++ b/arch/x86/include/asm/checksum_32.h
@@ -38,10 +38,9 @@ asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst,
  *	If you use these functions directly please don't forget the
  *	access_ok().
  */
-static inline __wsum csum_partial_copy_nocheck(const void *src, void *dst,
-					       int len, __wsum sum)
+static inline __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	return csum_partial_copy_generic(src, dst, len, sum, NULL, NULL);
+	return csum_partial_copy_generic(src, dst, len, 0, NULL, NULL);
 }
 
 static inline __wsum csum_and_copy_from_user(const void __user *src,
diff --git a/arch/x86/include/asm/checksum_64.h b/arch/x86/include/asm/checksum_64.h
index 0a289b87e872..5339f5dfc776 100644
--- a/arch/x86/include/asm/checksum_64.h
+++ b/arch/x86/include/asm/checksum_64.h
@@ -139,8 +139,7 @@ extern __wsum csum_and_copy_from_user(const void __user *src, void *dst,
 					  int len, __wsum isum, int *errp);
 extern __wsum csum_and_copy_to_user(const void *src, void __user *dst,
 					int len, __wsum isum, int *errp);
-extern __wsum csum_partial_copy_nocheck(const void *src, void *dst,
-					int len, __wsum sum);
+extern __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len);
 
 /**
  * ip_compute_csum - Compute an 16bit IP checksum.
diff --git a/arch/x86/lib/csum-wrappers_64.c b/arch/x86/lib/csum-wrappers_64.c
index ee63d7576fd2..245f929a1c2c 100644
--- a/arch/x86/lib/csum-wrappers_64.c
+++ b/arch/x86/lib/csum-wrappers_64.c
@@ -129,9 +129,9 @@ EXPORT_SYMBOL(csum_and_copy_to_user);
  * Returns an 32bit unfolded checksum of the buffer.
  */
 __wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
+csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	return csum_partial_copy_generic(src, dst, len, sum, NULL, NULL);
+	return csum_partial_copy_generic(src, dst, len, 0, NULL, NULL);
 }
 EXPORT_SYMBOL(csum_partial_copy_nocheck);
 
diff --git a/arch/x86/um/asm/checksum.h b/arch/x86/um/asm/checksum.h
index ff6bba2c8ab6..452e4442ec3f 100644
--- a/arch/x86/um/asm/checksum.h
+++ b/arch/x86/um/asm/checksum.h
@@ -29,11 +29,10 @@ extern __wsum csum_partial(const void *buff, int len, __wsum sum);
  */
 
 static __inline__
-__wsum csum_partial_copy_nocheck(const void *src, void *dst,
-				       int len, __wsum sum)
+__wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
 	memcpy(dst, src, len);
-	return csum_partial(dst, len, sum);
+	return csum_partial(dst, len, 0);
 }
 
 /**
diff --git a/arch/xtensa/include/asm/checksum.h b/arch/xtensa/include/asm/checksum.h
index d8292cc9ebdf..84e6a36fee6d 100644
--- a/arch/xtensa/include/asm/checksum.h
+++ b/arch/xtensa/include/asm/checksum.h
@@ -46,10 +46,9 @@ asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst,
  *	passed in an incorrect kernel address to one of these functions.
  */
 static inline
-__wsum csum_partial_copy_nocheck(const void *src, void *dst,
-					int len, __wsum sum)
+__wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	return csum_partial_copy_generic(src, dst, len, sum, NULL, NULL);
+	return csum_partial_copy_generic(src, dst, len, 0, NULL, NULL);
 }
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
diff --git a/drivers/net/ethernet/3com/typhoon.c b/drivers/net/ethernet/3com/typhoon.c
index 5ed33c2c4742..00c2e7143555 100644
--- a/drivers/net/ethernet/3com/typhoon.c
+++ b/drivers/net/ethernet/3com/typhoon.c
@@ -1419,8 +1419,7 @@ typhoon_download_firmware(struct typhoon *tp)
 			 * the checksum, we can do this once, at the end.
 			 */
 			csum = csum_fold(csum_partial_copy_nocheck(image_data,
-								   dpage, len,
-								   0));
+								   dpage, len));
 
 			iowrite32(len, ioaddr + TYPHOON_REG_BOOT_LENGTH);
 			iowrite32(le16_to_cpu((__force __le16)csum),
diff --git a/include/asm-generic/checksum.h b/include/asm-generic/checksum.h
index 5a80f8e54300..85e5abc1c79c 100644
--- a/include/asm-generic/checksum.h
+++ b/include/asm-generic/checksum.h
@@ -26,8 +26,8 @@ extern __wsum csum_partial(const void *buff, int len, __wsum sum);
 extern __wsum csum_partial_copy(const void *src, void *dst, int len, __wsum sum);
 
 #ifndef csum_partial_copy_nocheck
-#define csum_partial_copy_nocheck(src, dst, len, sum)	\
-	csum_partial_copy((src), (dst), (len), (sum))
+#define csum_partial_copy_nocheck(src, dst, len)	\
+	csum_partial_copy((src), (dst), (len), 0)
 #endif
 
 #ifndef ip_fast_csum
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index bf538c2bec77..7405922caaec 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -580,7 +580,7 @@ static size_t copy_pipe_to_iter(const void *addr, size_t bytes,
 static __wsum csum_and_memcpy(void *to, const void *from, size_t len,
 			      __wsum sum, size_t off)
 {
-	__wsum next = csum_partial_copy_nocheck(from, to, len, 0);
+	__wsum next = csum_partial_copy_nocheck(from, to, len);
 	return csum_block_add(sum, next, off);
 }
 
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 9c0918651445..6d51fb4312cd 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2736,7 +2736,7 @@ __wsum skb_copy_and_csum_bits(const struct sk_buff *skb, int offset,
 		if (copy > len)
 			copy = len;
 		csum = csum_partial_copy_nocheck(skb->data + offset, to,
-						 copy, 0);
+						 copy);
 		if ((len -= copy) == 0)
 			return csum;
 		offset += copy;
@@ -2766,7 +2766,7 @@ __wsum skb_copy_and_csum_bits(const struct sk_buff *skb, int offset,
 				vaddr = kmap_atomic(p);
 				csum2 = csum_partial_copy_nocheck(vaddr + p_off,
 								  to + copied,
-								  p_len, 0);
+								  p_len);
 				kunmap_atomic(vaddr);
 				csum = csum_block_add(csum, csum2, pos);
 				pos += p_len;
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index f93317157549..47a46279ae4c 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -381,7 +381,7 @@ static void icmp_push_reply(struct icmp_bxm *icmp_param,
 
 		csum = csum_partial_copy_nocheck((void *)&icmp_param->data,
 						 (char *)icmph,
-						 icmp_param->head_len, 0);
+						 icmp_param->head_len);
 		skb_queue_walk(&sk->sk_write_queue, skb1) {
 			csum = csum_add(csum, skb1->csum);
 		}
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 7fd164754519..f835136b8727 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1642,7 +1642,7 @@ static int ip_reply_glue_bits(void *dptr, char *to, int offset,
 {
 	__wsum csum;
 
-	csum = csum_partial_copy_nocheck(dptr+offset, to, len, 0);
+	csum = csum_partial_copy_nocheck(dptr+offset, to, len);
 	skb->csum = csum_block_add(skb->csum, csum, odd);
 	return 0;
 }
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 47665919048f..112f983f85fa 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -478,7 +478,7 @@ static int raw_getfrag(void *from, char *to, int offset, int len, int odd,
 			skb->csum = csum_block_add(
 				skb->csum,
 				csum_partial_copy_nocheck(rfv->hdr.c + offset,
-							  to, copy, 0),
+							  to, copy),
 				odd);
 
 		odd = 0;
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 8ef5a7b30524..b1df7e5fb0a8 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -746,7 +746,7 @@ static int raw6_getfrag(void *from, char *to, int offset, int len, int odd,
 			skb->csum = csum_block_add(
 				skb->csum,
 				csum_partial_copy_nocheck(rfv->c + offset,
-							  to, copy, 0),
+							  to, copy),
 				odd);
 
 		odd = 0;
-- 
2.11.0
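
The last argument can go because the call sites above all passed 0, and several of them add the result into a running sum afterwards with csum_add() or csum_block_add().  A minimal user-space sketch of that equivalence, assuming a simplified big-endian 16-bit word sum; model_csum_add(), fold16(), model_copy_csum_old() and model_copy_csum_new() are illustrative stand-ins, not the kernel implementations:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit add with end-around carry, in the spirit of the kernel's csum_add(). */
uint32_t model_csum_add(uint32_t a, uint32_t b)
{
	uint32_t res = a + b;

	return res + (res < b);
}

/* Fold 32 bits down to 16 with end-around carry; no final inversion. */
uint16_t fold16(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Old shape (hypothetical model): copy, then sum on top of a caller-supplied value. */
uint32_t model_copy_csum_old(const uint8_t *src, uint8_t *dst, size_t len,
			     uint32_t sum)
{
	size_t i;

	memcpy(dst, src, len);
	for (i = 0; i + 1 < len; i += 2)
		sum = model_csum_add(sum, ((uint32_t)src[i] << 8) | src[i + 1]);
	if (i < len)	/* odd trailing byte is padded with zero */
		sum = model_csum_add(sum, (uint32_t)src[i] << 8);
	return sum;
}

/* New shape: no initial sum; the caller combines results itself. */
uint32_t model_copy_csum_new(const uint8_t *src, uint8_t *dst, size_t len)
{
	return model_copy_csum_old(src, dst, len, 0);
}
```

With a running sum `running`, model_csum_add(running, model_copy_csum_new(src, dst, len)) folds to the same 16-bit value as the old four-argument call.  The kernel's csum_block_add() additionally handles a block that starts at an odd offset, which this sketch ignores.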

* [PATCH 04/18] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
  2020-07-21 20:25 ` [PATCH 01/18] skb_copy_and_csum_bits(): don't bother with the last argument Al Viro
  2020-07-21 20:25   ` [PATCH 02/18] icmp_push_reply(): reorder adding the checksum up Al Viro
  2020-07-21 20:25   ` [PATCH 03/18] csum_partial_copy_nocheck(): drop the last argument Al Viro
@ 2020-07-21 20:25   ` Al Viro
  2020-07-21 20:25     ` Al Viro
                       ` (2 more replies)
  2020-07-21 20:25   ` [PATCH 05/18] saner calling conventions for csum_and_copy_..._user() Al Viro
                     ` (13 subsequent siblings)
  16 siblings, 3 replies; 102+ messages in thread
From: Al Viro @ 2020-07-21 20:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

Preparation for the change of calling conventions; right now all
callers pass 0 as initial sum.  Passing 0xffffffff instead yields
values comparable mod 0xffff and guarantees that 0 will not
be returned on success.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 lib/iov_iter.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
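
Why ~0U works as the initial sum: in ones' complement arithmetic 0xffffffff is another representation of zero, so it leaves the result unchanged modulo 0xffff, while an end-around-carry add that starts from a non-zero value can never produce 0.  A small user-space sketch, assuming a simplified byte-wise sum; model_csum() and fold16() are illustrative names, not the per-architecture code:

```c
#include <stddef.h>
#include <stdint.h>

/* Sum buf as big-endian 16-bit words on top of init, with end-around carry. */
uint32_t model_csum(const uint8_t *buf, size_t len, uint32_t init)
{
	uint32_t sum = init;
	size_t i;

	for (i = 0; i < len; i++) {
		/* bytes at even offsets supply the high half of a 16-bit word */
		uint32_t word = (i & 1) ? buf[i] : (uint32_t)buf[i] << 8;
		uint32_t res = sum + word;

		sum = res + (res < word);	/* end-around carry */
	}
	return sum;
}

/* Fold 32 bits down to 16 with end-around carry; no final inversion. */
uint16_t fold16(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}
```

For all-zero data the model returns 0 with an initial sum of 0 and 0xffffffff with an initial sum of ~0U.  These fold to 0 and 0xffff respectively: different bit patterns, but the same value mod 0xffff, which is what "comparable" means here.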

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 7405922caaec..d5b7e204fea6 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1451,7 +1451,7 @@ size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
 		int err = 0;
 		next = csum_and_copy_from_user(v.iov_base,
 					       (to += v.iov_len) - v.iov_len,
-					       v.iov_len, 0, &err);
+					       v.iov_len, ~0U, &err);
 		if (!err) {
 			sum = csum_block_add(sum, next, off);
 			off += v.iov_len;
@@ -1493,7 +1493,7 @@ bool csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum,
 		int err = 0;
 		next = csum_and_copy_from_user(v.iov_base,
 					       (to += v.iov_len) - v.iov_len,
-					       v.iov_len, 0, &err);
+					       v.iov_len, ~0U, &err);
 		if (err)
 			return false;
 		sum = csum_block_add(sum, next, off);
@@ -1539,7 +1539,7 @@ size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *csump,
 		int err = 0;
 		next = csum_and_copy_to_user((from += v.iov_len) - v.iov_len,
 					     v.iov_base,
-					     v.iov_len, 0, &err);
+					     v.iov_len, ~0U, &err);
 		if (!err) {
 			sum = csum_block_add(sum, next, off);
 			off += v.iov_len;
-- 
2.11.0

* [PATCH 05/18] saner calling conventions for csum_and_copy_..._user()
  2020-07-21 20:25 ` [PATCH 01/18] skb_copy_and_csum_bits(): don't bother with the last argument Al Viro
                     ` (2 preceding siblings ...)
  2020-07-21 20:25   ` [PATCH 04/18] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum Al Viro
@ 2020-07-21 20:25   ` Al Viro
  2020-07-21 20:25     ` Al Viro
  2020-07-21 20:25   ` [PATCH 06/18] alpha: propagate the calling convention changes down to csum_partial_copy.c helpers Al Viro
                     ` (12 subsequent siblings)
  16 siblings, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-21 20:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

All callers of these primitives will
	* discard anything we might've copied in case of error
	* ignore the csum value in case of error
	* always pass 0xffffffff as the initial sum, so the
resulting csum value (in case of success, that is) will never be 0.

That suggests the following calling conventions:
	* don't pass err_ptr - just return 0 on error.
	* don't bother with zeroing destination, etc. in case of error
	* don't pass the initial sum - just use 0xffffffff.

This commit does the minimal conversion in the instances of csum_and_copy_...();
the changes of actual asm code behind them are done later in the series.
Note that this asm code is often shared with csum_partial_copy_nocheck();
the difference is that csum_partial_copy_nocheck() passes 0 for initial
sum while csum_and_copy_..._user() pass 0xffffffff.  Fortunately, we are
free to pass 0xffffffff in all cases and subsequent patches will use that
freedom without any special comments.

A part that could be split off: parisc and uml/i386 claimed to have
csum_and_copy_to_user() instances of their own, but those were identical
to the generic one, so we simply drop them.  Not sure if it's worth
a separate commit...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/alpha/include/asm/checksum.h    |  2 +-
 arch/alpha/lib/csum_partial_copy.c   | 25 ++++++-------
 arch/arm/include/asm/checksum.h      | 13 ++++---
 arch/m68k/include/asm/checksum.h     |  3 +-
 arch/m68k/lib/checksum.c             |  8 ++---
 arch/mips/include/asm/checksum.h     | 46 ++++++++++++------------
 arch/parisc/include/asm/checksum.h   | 20 -----------
 arch/powerpc/include/asm/checksum.h  |  4 +--
 arch/powerpc/lib/checksum_wrappers.c | 68 +++++++++++-------------------------
 arch/sh/include/asm/checksum_32.h    | 36 +++++++++----------
 arch/sparc/include/asm/checksum_32.h | 65 ++++++++++++++++------------------
 arch/sparc/include/asm/checksum_64.h | 14 ++++----
 arch/x86/include/asm/checksum_32.h   | 35 ++++++++-----------
 arch/x86/include/asm/checksum_64.h   |  6 ++--
 arch/x86/lib/csum-wrappers_64.c      | 38 +++++++++-----------
 arch/x86/um/asm/checksum_32.h        | 23 ------------
 arch/xtensa/include/asm/checksum.h   | 30 ++++++++--------
 include/net/checksum.h               | 15 ++++----
 lib/iov_iter.c                       | 19 +++++-----
 19 files changed, 183 insertions(+), 287 deletions(-)
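
The wrapper pattern used throughout this patch can be sketched in user space: an old-style helper that takes an initial sum and an error pointer is adapted to the new convention by seeding it with ~0U and mapping any error to a return value of 0.  struct model_ubuf, model_old_csum_copy() and model_csum_and_copy_from_user() are hypothetical stand-ins, and a fault is simulated by a length limit rather than a real uaccess failure:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical "user" buffer: reading past 'valid' bytes counts as a fault. */
struct model_ubuf {
	const uint8_t *data;
	size_t valid;
};

/* Old-style helper: initial sum passed in, fault reported through *errp. */
uint32_t model_old_csum_copy(const struct model_ubuf *src, uint8_t *dst,
			     size_t len, uint32_t sum, int *errp)
{
	size_t i;

	if (len > src->valid) {
		*errp = -EFAULT;
		return sum;
	}
	memcpy(dst, src->data, len);
	for (i = 0; i < len; i++) {
		uint32_t byte = src->data[i];
		/* bytes at even offsets supply the high half of a 16-bit word */
		uint32_t word = (i & 1) ? byte : byte << 8;
		uint32_t res = sum + word;

		sum = res + (res < word);	/* end-around carry */
	}
	return sum;
}

/* New convention: no initial sum, no error pointer; 0 means "failed". */
uint32_t model_csum_and_copy_from_user(const struct model_ubuf *src,
				       uint8_t *dst, size_t len)
{
	int err = 0;
	uint32_t sum = model_old_csum_copy(src, dst, len, ~0U, &err);

	return err ? 0 : sum;
}
```

A caller can then test the return value directly instead of checking a separate error variable.  Because the sum starts from ~0U, even a successful zero-length copy returns a non-zero value in this model, so 0 stays unambiguous as the failure indication.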

diff --git a/arch/alpha/include/asm/checksum.h b/arch/alpha/include/asm/checksum.h
index fdb301fd819b..f8659ae21134 100644
--- a/arch/alpha/include/asm/checksum.h
+++ b/arch/alpha/include/asm/checksum.h
@@ -42,7 +42,7 @@ extern __wsum csum_partial(const void *buff, int len, __wsum sum);
  * better 64-bit) boundary
  */
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
-__wsum csum_and_copy_from_user(const void __user *src, void *dst, int len, __wsum sum, int *errp);
+__wsum csum_and_copy_from_user(const void __user *src, void *dst, int len);
 
 __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len);
 
diff --git a/arch/alpha/lib/csum_partial_copy.c b/arch/alpha/lib/csum_partial_copy.c
index f363dc89fcbe..3c0e89c39ddb 100644
--- a/arch/alpha/lib/csum_partial_copy.c
+++ b/arch/alpha/lib/csum_partial_copy.c
@@ -325,30 +325,27 @@ csum_partial_cfu_unaligned(const unsigned long __user * src,
 }
 
 __wsum
-csum_and_copy_from_user(const void __user *src, void *dst, int len,
-			       __wsum sum, int *errp)
+csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
-	unsigned long checksum = (__force u32) sum;
+	unsigned long checksum = ~0U;
 	unsigned long soff = 7 & (unsigned long) src;
 	unsigned long doff = 7 & (unsigned long) dst;
+	int err = 0;
 
 	if (len) {
-		if (!access_ok(src, len)) {
-			if (errp) *errp = -EFAULT;
-			memset(dst, 0, len);
-			return sum;
-		}
+		if (!access_ok(src, len))
+			return 0;
 		if (!doff) {
 			if (!soff)
 				checksum = csum_partial_cfu_aligned(
 					(const unsigned long __user *) src,
 					(unsigned long *) dst,
-					len-8, checksum, errp);
+					len-8, checksum, &err);
 			else
 				checksum = csum_partial_cfu_dest_aligned(
 					(const unsigned long __user *) src,
 					(unsigned long *) dst,
-					soff, len-8, checksum, errp);
+					soff, len-8, checksum, &err);
 		} else {
 			unsigned long partial_dest;
 			ldq_u(partial_dest, dst);
@@ -357,15 +354,15 @@ csum_and_copy_from_user(const void __user *src, void *dst, int len,
 					(const unsigned long __user *) src,
 					(unsigned long *) dst,
 					doff, len-8, checksum,
-					partial_dest, errp);
+					partial_dest, &err);
 			else
 				checksum = csum_partial_cfu_unaligned(
 					(const unsigned long __user *) src,
 					(unsigned long *) dst,
 					soff, doff, len-8, checksum,
-					partial_dest, errp);
+					partial_dest, &err);
 		}
-		checksum = from64to16 (checksum);
+		checksum = err ? 0 : from64to16 (checksum);
 	}
 	return (__force __wsum)checksum;
 }
@@ -378,7 +375,7 @@ csum_partial_copy_nocheck(const void *src, void *dst, int len)
 	mm_segment_t oldfs = get_fs();
 	set_fs(KERNEL_DS);
 	checksum = csum_and_copy_from_user((__force const void __user *)src,
-						dst, len, 0, NULL);
+						dst, len);
 	set_fs(oldfs);
 	return checksum;
 }
diff --git a/arch/arm/include/asm/checksum.h b/arch/arm/include/asm/checksum.h
index 1156b9a9a43b..737db6c3c482 100644
--- a/arch/arm/include/asm/checksum.h
+++ b/arch/arm/include/asm/checksum.h
@@ -42,16 +42,16 @@ csum_partial_copy_from_user(const void __user *src, void *dst, int len, __wsum s
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 static inline
-__wsum csum_and_copy_from_user (const void __user *src, void *dst,
-				      int len, __wsum sum, int *err_ptr)
+__wsum csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
-	if (access_ok(src, len))
-		return csum_partial_copy_from_user(src, dst, len, sum, err_ptr);
+	int err = 0;
+	__wsum sum;
 
-	if (len)
-		*err_ptr = -EFAULT;
+	if (!access_ok(src, len))
+		return 0;
 
-	return sum;
+	sum = csum_partial_copy_from_user(src, dst, len, ~0U, &err);
+	return err ? 0 : sum;
 }
 
 /*
diff --git a/arch/m68k/include/asm/checksum.h b/arch/m68k/include/asm/checksum.h
index 77c61473ee0f..4e13ad046291 100644
--- a/arch/m68k/include/asm/checksum.h
+++ b/arch/m68k/include/asm/checksum.h
@@ -33,8 +33,7 @@ __wsum csum_partial(const void *buff, int len, __wsum sum);
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 extern __wsum csum_and_copy_from_user(const void __user *src,
 						void *dst,
-						int len, __wsum sum,
-						int *csum_err);
+						int len);
 
 extern __wsum csum_partial_copy_nocheck(const void *src,
 					      void *dst, int len);
diff --git a/arch/m68k/lib/checksum.c b/arch/m68k/lib/checksum.c
index 86ddd2ee187d..3aeca261f622 100644
--- a/arch/m68k/lib/checksum.c
+++ b/arch/m68k/lib/checksum.c
@@ -129,8 +129,7 @@ EXPORT_SYMBOL(csum_partial);
  */
 
 __wsum
-csum_and_copy_from_user(const void __user *src, void *dst,
-			    int len, __wsum sum, int *csum_err)
+csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
 	/*
 	 * GCC doesn't like more than 10 operands for the asm
@@ -138,6 +137,7 @@ csum_and_copy_from_user(const void __user *src, void *dst,
 	 * code.
 	 */
 	unsigned long tmp1, tmp2;
+	__wsum sum = ~0U;
 
 	__asm__("movel %2,%4\n\t"
 		"btst #1,%4\n\t"	/* Check alignment */
@@ -311,9 +311,7 @@ csum_and_copy_from_user(const void __user *src, void *dst,
 		: "0" (sum), "1" (len), "2" (src), "3" (dst)
 	    );
 
-	*csum_err = tmp2;
-
-	return(sum);
+	return tmp2 ? 0 : sum;
 }
 
 EXPORT_SYMBOL(csum_and_copy_from_user);
diff --git a/arch/mips/include/asm/checksum.h b/arch/mips/include/asm/checksum.h
index 1dcdd7755793..1e5558f90126 100644
--- a/arch/mips/include/asm/checksum.h
+++ b/arch/mips/include/asm/checksum.h
@@ -60,16 +60,15 @@ __wsum csum_partial_copy_from_user(const void __user *src, void *dst, int len,
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 static inline
-__wsum csum_and_copy_from_user(const void __user *src, void *dst,
-			       int len, __wsum sum, int *err_ptr)
+__wsum csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
-	if (access_ok(src, len))
-		return csum_partial_copy_from_user(src, dst, len, sum,
-						   err_ptr);
-	if (len)
-		*err_ptr = -EFAULT;
+	__wsum sum = ~0U;
+	int err = 0;
 
-	return sum;
+	if (!access_ok(src, len))
+		return 0;
+	sum = csum_partial_copy_from_user(src, dst, len, sum, &err);
+	return err ? 0 : sum;
 }
 
 /*
@@ -77,24 +76,23 @@ __wsum csum_and_copy_from_user(const void __user *src, void *dst,
  */
 #define HAVE_CSUM_COPY_USER
 static inline
-__wsum csum_and_copy_to_user(const void *src, void __user *dst, int len,
-			     __wsum sum, int *err_ptr)
+__wsum csum_and_copy_to_user(const void *src, void __user *dst, int len)
 {
+	int err = 0;
+	__wsum sum = ~0U;
+
 	might_fault();
-	if (access_ok(dst, len)) {
-		if (uaccess_kernel())
-			return __csum_partial_copy_kernel(src,
-							  (__force void *)dst,
-							  len, sum, err_ptr);
-		else
-			return __csum_partial_copy_to_user(src,
-							   (__force void *)dst,
-							   len, sum, err_ptr);
-	}
-	if (len)
-		*err_ptr = -EFAULT;
-
-	return (__force __wsum)-1; /* invalid checksum */
+	if (!access_ok(dst, len))
+		return 0;
+	if (uaccess_kernel())
+		sum = __csum_partial_copy_kernel(src,
+						  (__force void *)dst,
+						  len, sum, &err);
+	else
+		sum = __csum_partial_copy_to_user(src,
+						   (__force void *)dst,
+						   len, sum, &err);
+	return err ? 0 : sum;
 }
 
 /*
diff --git a/arch/parisc/include/asm/checksum.h b/arch/parisc/include/asm/checksum.h
index b412e3a1bd14..44c1b4836fb5 100644
--- a/arch/parisc/include/asm/checksum.h
+++ b/arch/parisc/include/asm/checksum.h
@@ -181,25 +181,5 @@ static __inline__ __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
 	return csum_fold(sum);
 }
 
-/* 
- *	Copy and checksum to user
- */
-#define HAVE_CSUM_COPY_USER
-static __inline__ __wsum csum_and_copy_to_user(const void *src,
-						      void __user *dst,
-						      int len, __wsum sum,
-						      int *err_ptr)
-{
-	/* code stolen from include/asm-mips64 */
-	sum = csum_partial(src, len, sum);
-	 
-	if (copy_to_user(dst, src, len)) {
-		*err_ptr = -EFAULT;
-		return (__force __wsum)-1;
-	}
-
-	return sum;
-}
-
 #endif
 
diff --git a/arch/powerpc/include/asm/checksum.h b/arch/powerpc/include/asm/checksum.h
index 40540b7242a3..97343e1a7d1c 100644
--- a/arch/powerpc/include/asm/checksum.h
+++ b/arch/powerpc/include/asm/checksum.h
@@ -24,10 +24,10 @@ extern __wsum csum_partial_copy_generic(const void *src, void *dst,
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 extern __wsum csum_and_copy_from_user(const void __user *src, void *dst,
-				      int len, __wsum sum, int *err_ptr);
+				      int len);
 #define HAVE_CSUM_COPY_USER
 extern __wsum csum_and_copy_to_user(const void *src, void __user *dst,
-				    int len, __wsum sum, int *err_ptr);
+				    int len);
 
 #define csum_partial_copy_nocheck(src, dst, len)   \
         csum_partial_copy_generic((src), (dst), (len), 0, NULL, NULL)
diff --git a/arch/powerpc/lib/checksum_wrappers.c b/arch/powerpc/lib/checksum_wrappers.c
index fabe4db28726..b1faa82dd8af 100644
--- a/arch/powerpc/lib/checksum_wrappers.c
+++ b/arch/powerpc/lib/checksum_wrappers.c
@@ -12,82 +12,56 @@
 #include <linux/uaccess.h>
 
 __wsum csum_and_copy_from_user(const void __user *src, void *dst,
-			       int len, __wsum sum, int *err_ptr)
+			       int len)
 {
 	unsigned int csum;
+	int err = 0;
 
 	might_sleep();
-	allow_read_from_user(src, len);
-
-	*err_ptr = 0;
 
-	if (!len) {
-		csum = 0;
-		goto out;
-	}
+	if (unlikely(!access_ok(src, len)))
+		return 0;
 
-	if (unlikely((len < 0) || !access_ok(src, len))) {
-		*err_ptr = -EFAULT;
-		csum = (__force unsigned int)sum;
-		goto out;
-	}
+	allow_read_from_user(src, len);
 
 	csum = csum_partial_copy_generic((void __force *)src, dst,
-					 len, sum, err_ptr, NULL);
+					 len, ~0U, &err, NULL);
 
-	if (unlikely(*err_ptr)) {
+	if (unlikely(err)) {
 		int missing = __copy_from_user(dst, src, len);
 
-		if (missing) {
-			memset(dst + len - missing, 0, missing);
-			*err_ptr = -EFAULT;
-		} else {
-			*err_ptr = 0;
-		}
-
-		csum = csum_partial(dst, len, sum);
+		if (missing)
+			csum = 0;
+		else
+			csum = csum_partial(dst, len, ~0U);
 	}
 
-out:
 	prevent_read_from_user(src, len);
 	return (__force __wsum)csum;
 }
 EXPORT_SYMBOL(csum_and_copy_from_user);
 
-__wsum csum_and_copy_to_user(const void *src, void __user *dst, int len,
-			     __wsum sum, int *err_ptr)
+__wsum csum_and_copy_to_user(const void *src, void __user *dst, int len)
 {
 	unsigned int csum;
+	int err = 0;
 
 	might_sleep();
-	allow_write_to_user(dst, len);
-
-	*err_ptr = 0;
-
-	if (!len) {
-		csum = 0;
-		goto out;
-	}
+	if (unlikely(!access_ok(dst, len)))
+		return 0;
 
-	if (unlikely((len < 0) || !access_ok(dst, len))) {
-		*err_ptr = -EFAULT;
-		csum = -1; /* invalid checksum */
-		goto out;
-	}
+	allow_write_to_user(dst, len);
 
 	csum = csum_partial_copy_generic(src, (void __force *)dst,
-					 len, sum, NULL, err_ptr);
+					 len, ~0U, NULL, &err);
 
-	if (unlikely(*err_ptr)) {
-		csum = csum_partial(src, len, sum);
+	if (unlikely(err)) {
+		csum = csum_partial(src, len, ~0U);
 
-		if (copy_to_user(dst, src, len)) {
-			*err_ptr = -EFAULT;
-			csum = -1; /* invalid checksum */
-		}
+		if (copy_to_user(dst, src, len))
+			csum = 0;
 	}
 
-out:
 	prevent_write_to_user(dst, len);
 	return (__force __wsum)csum;
 }
diff --git a/arch/sh/include/asm/checksum_32.h b/arch/sh/include/asm/checksum_32.h
index 682f88ebb7de..97950bdf62e5 100644
--- a/arch/sh/include/asm/checksum_32.h
+++ b/arch/sh/include/asm/checksum_32.h
@@ -49,15 +49,16 @@ __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 static inline
-__wsum csum_and_copy_from_user(const void __user *src, void *dst,
-				   int len, __wsum sum, int *err_ptr)
+__wsum csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
-	if (access_ok(src, len))
-		return csum_partial_copy_generic((__force const void *)src, dst,
-					len, sum, err_ptr, NULL);
-	if (len)
-		*err_ptr = -EFAULT;
-	return sum;
+	int err = 0;
+	__wsum sum = ~0U;
+
+	if (!access_ok(src, len))
+		return 0;
+	sum = csum_partial_copy_generic((__force const void *)src, dst,
+					len, sum, &err, NULL);
+	return err ? 0 : sum;
 }
 
 /*
@@ -198,16 +199,15 @@ static inline __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
 #define HAVE_CSUM_COPY_USER
 static inline __wsum csum_and_copy_to_user(const void *src,
 					   void __user *dst,
-					   int len, __wsum sum,
-					   int *err_ptr)
+					   int len)
 {
-	if (access_ok(dst, len))
-		return csum_partial_copy_generic((__force const void *)src,
-						dst, len, sum, NULL, err_ptr);
-
-	if (len)
-		*err_ptr = -EFAULT;
-
-	return (__force __wsum)-1; /* invalid checksum */
+	int err = 0;
+	__wsum sum = ~0U;
+
+	if (!access_ok(dst, len))
+		return 0;
+	sum = csum_partial_copy_generic((__force const void *)src,
+						dst, len, sum, NULL, &err);
+	return err ? 0 : sum;
 }
 #endif /* __ASM_SH_CHECKSUM_H */
diff --git a/arch/sparc/include/asm/checksum_32.h b/arch/sparc/include/asm/checksum_32.h
index d21d114436ba..b5873b7b7bf0 100644
--- a/arch/sparc/include/asm/checksum_32.h
+++ b/arch/sparc/include/asm/checksum_32.h
@@ -60,19 +60,16 @@ csum_partial_copy_nocheck(const void *src, void *dst, int len)
 }
 
 static inline __wsum
-csum_and_copy_from_user(const void __user *src, void *dst, int len,
-			    __wsum sum, int *err)
+csum_and_copy_from_user(const void __user *src, void *dst, int len)
   {
 	register unsigned long ret asm("o0") = (unsigned long)src;
 	register char *d asm("o1") = dst;
 	register int l asm("g1") = len;
-	register __wsum s asm("g7") = sum;
+	register __wsum s asm("g7") = ~0U;
+	int err = 0;
 
-	if (unlikely(!access_ok(src, len))) {
-		if (len)
-			*err = -EFAULT;
-		return sum;
-	}
+	if (unlikely(!access_ok(src, len)))
+		return 0;
 
 	__asm__ __volatile__ (
 	".section __ex_table,#alloc\n\t"
@@ -83,42 +80,40 @@ csum_and_copy_from_user(const void __user *src, void *dst, int len,
 	"call __csum_partial_copy_sparc_generic\n\t"
 	" st %8, [%%sp + 64]\n"
 	: "=&r" (ret), "=&r" (d), "=&r" (l), "=&r" (s)
-	: "0" (ret), "1" (d), "2" (l), "3" (s), "r" (err)
+	: "0" (ret), "1" (d), "2" (l), "3" (s), "r" (&err)
 	: "o2", "o3", "o4", "o5", "o7", "g2", "g3", "g4", "g5",
 	  "cc", "memory");
-	return (__force __wsum)ret;
+	return err ? 0 : (__force __wsum)ret;
 }
 
 #define HAVE_CSUM_COPY_USER
 
 static inline __wsum
-csum_and_copy_to_user(const void *src, void __user *dst, int len,
-			  __wsum sum, int *err)
+csum_and_copy_to_user(const void *src, void __user *dst, int len)
 {
-	if (!access_ok(dst, len)) {
-		*err = -EFAULT;
-		return sum;
-	} else {
-		register unsigned long ret asm("o0") = (unsigned long)src;
-		register char __user *d asm("o1") = dst;
-		register int l asm("g1") = len;
-		register __wsum s asm("g7") = sum;
+	register unsigned long ret asm("o0") = (unsigned long)src;
+	register char __user *d asm("o1") = dst;
+	register int l asm("g1") = len;
+	register __wsum s asm("g7") = ~0U;
+	int err = 0;
 
-		__asm__ __volatile__ (
-		".section __ex_table,#alloc\n\t"
-		".align 4\n\t"
-		".word 1f,1\n\t"
-		".previous\n"
-		"1:\n\t"
-		"call __csum_partial_copy_sparc_generic\n\t"
-		" st %8, [%%sp + 64]\n"
-		: "=&r" (ret), "=&r" (d), "=&r" (l), "=&r" (s)
-		: "0" (ret), "1" (d), "2" (l), "3" (s), "r" (err)
-		: "o2", "o3", "o4", "o5", "o7",
-		  "g2", "g3", "g4", "g5",
-		  "cc", "memory");
-		return (__force __wsum)ret;
-	}
+	if (!access_ok(dst, len))
+		return 0;
+
+	__asm__ __volatile__ (
+	".section __ex_table,#alloc\n\t"
+	".align 4\n\t"
+	".word 1f,1\n\t"
+	".previous\n"
+	"1:\n\t"
+	"call __csum_partial_copy_sparc_generic\n\t"
+	" st %8, [%%sp + 64]\n"
+	: "=&r" (ret), "=&r" (d), "=&r" (l), "=&r" (s)
+	: "0" (ret), "1" (d), "2" (l), "3" (s), "r" (&err)
+	: "o2", "o3", "o4", "o5", "o7",
+	  "g2", "g3", "g4", "g5",
+	  "cc", "memory");
+	return err ? 0 : (__force __wsum)ret;
 }
 
 /* ihl is always 5 or greater, almost always is 5, and iph is word aligned
diff --git a/arch/sparc/include/asm/checksum_64.h b/arch/sparc/include/asm/checksum_64.h
index 7aebdbe3ac96..4d0bbff43e62 100644
--- a/arch/sparc/include/asm/checksum_64.h
+++ b/arch/sparc/include/asm/checksum_64.h
@@ -51,12 +51,11 @@ long __csum_partial_copy_from_user(const void __user *src,
 
 static inline __wsum
 csum_and_copy_from_user(const void __user *src,
-			    void *dst, int len,
-			    __wsum sum, int *err)
+			    void *dst, int len)
 {
-	long ret = __csum_partial_copy_from_user(src, dst, len, sum);
+	long ret = __csum_partial_copy_from_user(src, dst, len, ~0U);
 	if (ret < 0)
-		*err = -EFAULT;
+		return 0;
 	return (__force __wsum) ret;
 }
 
@@ -70,12 +69,11 @@ long __csum_partial_copy_to_user(const void *src,
 
 static inline __wsum
 csum_and_copy_to_user(const void *src,
-		      void __user *dst, int len,
-		      __wsum sum, int *err)
+		      void __user *dst, int len)
 {
-	long ret = __csum_partial_copy_to_user(src, dst, len, sum);
+	long ret = __csum_partial_copy_to_user(src, dst, len, ~0U);
 	if (ret < 0)
-		*err = -EFAULT;
+		return 0;
 	return (__force __wsum) ret;
 }
 
diff --git a/arch/x86/include/asm/checksum_32.h b/arch/x86/include/asm/checksum_32.h
index 137a3033edcc..5948cde9e4ad 100644
--- a/arch/x86/include/asm/checksum_32.h
+++ b/arch/x86/include/asm/checksum_32.h
@@ -44,22 +44,19 @@ static inline __wsum csum_partial_copy_nocheck(const void *src, void *dst, int l
 }
 
 static inline __wsum csum_and_copy_from_user(const void __user *src,
-					     void *dst, int len,
-					     __wsum sum, int *err_ptr)
+					     void *dst, int len)
 {
 	__wsum ret;
+	int err = 0;
 
 	might_sleep();
-	if (!user_access_begin(src, len)) {
-		if (len)
-			*err_ptr = -EFAULT;
-		return sum;
-	}
+	if (!user_access_begin(src, len))
+		return 0;
 	ret = csum_partial_copy_generic((__force void *)src, dst,
-					len, sum, err_ptr, NULL);
+					len, ~0U, &err, NULL);
 	user_access_end();
 
-	return ret;
+	return err ? 0 : ret;
 }
 
 /*
@@ -177,23 +174,19 @@ static inline __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
  */
 static inline __wsum csum_and_copy_to_user(const void *src,
 					   void __user *dst,
-					   int len, __wsum sum,
-					   int *err_ptr)
+					   int len)
 {
 	__wsum ret;
+	int err = 0;
 
 	might_sleep();
-	if (user_access_begin(dst, len)) {
-		ret = csum_partial_copy_generic(src, (__force void *)dst,
-						len, sum, NULL, err_ptr);
-		user_access_end();
-		return ret;
-	}
+	if (!user_access_begin(dst, len))
+		return 0;
 
-	if (len)
-		*err_ptr = -EFAULT;
-
-	return (__force __wsum)-1; /* invalid checksum */
+	ret = csum_partial_copy_generic(src, (__force void *)dst,
+					len, ~0U, NULL, &err);
+	user_access_end();
+	return err ? 0 : ret;
 }
 
 #endif /* _ASM_X86_CHECKSUM_32_H */
diff --git a/arch/x86/include/asm/checksum_64.h b/arch/x86/include/asm/checksum_64.h
index 5339f5dfc776..9af3aed54c6b 100644
--- a/arch/x86/include/asm/checksum_64.h
+++ b/arch/x86/include/asm/checksum_64.h
@@ -135,10 +135,8 @@ extern __visible __wsum csum_partial_copy_generic(const void *src, const void *d
 					int *src_err_ptr, int *dst_err_ptr);
 
 
-extern __wsum csum_and_copy_from_user(const void __user *src, void *dst,
-					  int len, __wsum isum, int *errp);
-extern __wsum csum_and_copy_to_user(const void *src, void __user *dst,
-					int len, __wsum isum, int *errp);
+extern __wsum csum_and_copy_from_user(const void __user *src, void *dst, int len);
+extern __wsum csum_and_copy_to_user(const void *src, void __user *dst, int len);
 extern __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len);
 
 /**
diff --git a/arch/x86/lib/csum-wrappers_64.c b/arch/x86/lib/csum-wrappers_64.c
index 245f929a1c2c..ae2fb87e2274 100644
--- a/arch/x86/lib/csum-wrappers_64.c
+++ b/arch/x86/lib/csum-wrappers_64.c
@@ -22,13 +22,15 @@
  */
 __wsum
 csum_and_copy_from_user(const void __user *src, void *dst,
-			    int len, __wsum isum, int *errp)
+			    int len)
 {
+	int err = 0;
+	__wsum isum = ~0U;
+
 	might_sleep();
-	*errp = 0;
 
 	if (!user_access_begin(src, len))
-		goto out_err;
+		return 0;
 
 	/*
 	 * Why 6, not 7? To handle odd addresses aligned we
@@ -53,20 +55,15 @@ csum_and_copy_from_user(const void __user *src, void *dst,
 		}
 	}
 	isum = csum_partial_copy_generic((__force const void *)src,
-				dst, len, isum, errp, NULL);
+				dst, len, isum, &err, NULL);
 	user_access_end();
-	if (unlikely(*errp))
-		goto out_err;
-
+	if (unlikely(err))
+		isum = 0;
 	return isum;
 
 out:
 	user_access_end();
-out_err:
-	*errp = -EFAULT;
-	memset(dst, 0, len);
-
-	return isum;
+	return 0;
 }
 EXPORT_SYMBOL(csum_and_copy_from_user);
 
@@ -83,16 +80,15 @@ EXPORT_SYMBOL(csum_and_copy_from_user);
  */
 __wsum
 csum_and_copy_to_user(const void *src, void __user *dst,
-			  int len, __wsum isum, int *errp)
+			  int len)
 {
-	__wsum ret;
+	__wsum ret, isum = ~0U;
+	int err = 0;
 
 	might_sleep();
 
-	if (!user_access_begin(dst, len)) {
-		*errp = -EFAULT;
+	if (!user_access_begin(dst, len))
 		return 0;
-	}
 
 	if (unlikely((unsigned long)dst & 6)) {
 		while (((unsigned long)dst & 6) && len >= 2) {
@@ -107,15 +103,13 @@ csum_and_copy_to_user(const void *src, void __user *dst,
 		}
 	}
 
-	*errp = 0;
 	ret = csum_partial_copy_generic(src, (void __force *)dst,
-					len, isum, NULL, errp);
+					len, isum, NULL, &err);
 	user_access_end();
-	return ret;
+	return err ? 0 : ret;
 out:
 	user_access_end();
-	*errp = -EFAULT;
-	return isum;
+	return 0;
 }
 EXPORT_SYMBOL(csum_and_copy_to_user);
 
diff --git a/arch/x86/um/asm/checksum_32.h b/arch/x86/um/asm/checksum_32.h
index b9ac7c9eb72c..0b13c2947ad1 100644
--- a/arch/x86/um/asm/checksum_32.h
+++ b/arch/x86/um/asm/checksum_32.h
@@ -35,27 +35,4 @@ static __inline__ __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
 	return csum_fold(sum);
 }
 
-/*
- *	Copy and checksum to user
- */
-#define HAVE_CSUM_COPY_USER
-static __inline__ __wsum csum_and_copy_to_user(const void *src,
-						     void __user *dst,
-						     int len, __wsum sum, int *err_ptr)
-{
-	if (access_ok(dst, len)) {
-		if (copy_to_user(dst, src, len)) {
-			*err_ptr = -EFAULT;
-			return (__force __wsum)-1;
-		}
-
-		return csum_partial(src, len, sum);
-	}
-
-	if (len)
-		*err_ptr = -EFAULT;
-
-	return (__force __wsum)-1; /* invalid checksum */
-}
-
 #endif
diff --git a/arch/xtensa/include/asm/checksum.h b/arch/xtensa/include/asm/checksum.h
index 84e6a36fee6d..7958b18a5804 100644
--- a/arch/xtensa/include/asm/checksum.h
+++ b/arch/xtensa/include/asm/checksum.h
@@ -54,14 +54,16 @@ __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 static inline
 __wsum csum_and_copy_from_user(const void __user *src, void *dst,
-				   int len, __wsum sum, int *err_ptr)
+				   int len)
 {
-	if (access_ok(dst, len))
-		return csum_partial_copy_generic((__force const void *)src, dst,
-					len, sum, err_ptr, NULL);
-	if (len)
-		*err_ptr = -EFAULT;
-	return sum;
+	__wsum sum;
+	int err = 0;
+
+	if (!access_ok(src, len))
+		return 0;
+	sum = csum_partial_copy_generic((__force const void *)src, dst,
+					len, ~0U, &err, NULL);
+	return err ? 0 : sum;
 }
 
 /*
@@ -242,15 +244,15 @@ static __inline__ __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
  */
 #define HAVE_CSUM_COPY_USER
 static __inline__ __wsum csum_and_copy_to_user(const void *src,
-					       void __user *dst, int len,
-					       __wsum sum, int *err_ptr)
+					       void __user *dst, int len)
 {
-	if (access_ok(dst, len))
-		return csum_partial_copy_generic(src,dst,len,sum,NULL,err_ptr);
+	int err = 0;
+	__wsum sum = ~0U;
 
-	if (len)
-		*err_ptr = -EFAULT;
+	if (!access_ok(dst, len))
+		return 0;
 
-	return (__force __wsum)-1; /* invalid checksum */
+	sum = csum_partial_copy_generic(src,dst,len,sum,NULL,&err);
+	return err ? 0 : sum;
 }
 #endif
diff --git a/include/net/checksum.h b/include/net/checksum.h
index 46754ba9d7b7..5b6664881a1e 100644
--- a/include/net/checksum.h
+++ b/include/net/checksum.h
@@ -24,26 +24,23 @@
 #ifndef _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 static inline
 __wsum csum_and_copy_from_user (const void __user *src, void *dst,
-				      int len, __wsum sum, int *err_ptr)
+				      int len)
 {
 	if (copy_from_user(dst, src, len))
-		*err_ptr = -EFAULT;
-	return csum_partial(dst, len, sum);
+		return 0;
+	return csum_partial(dst, len, ~0U);
 }
 #endif
 
 #ifndef HAVE_CSUM_COPY_USER
 static __inline__ __wsum csum_and_copy_to_user
-(const void *src, void __user *dst, int len, __wsum sum, int *err_ptr)
+(const void *src, void __user *dst, int len)
 {
-	sum = csum_partial(src, len, sum);
+	__wsum sum = csum_partial(src, len, ~0U);
 
 	if (copy_to_user(dst, src, len) == 0)
 		return sum;
-	if (len)
-		*err_ptr = -EFAULT;
-
-	return (__force __wsum)-1; /* invalid checksum */
+	return 0;
 }
 #endif
 
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index d5b7e204fea6..eccb0fe5a498 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1448,15 +1448,14 @@ size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
 		return 0;
 	}
 	iterate_and_advance(i, bytes, v, ({
-		int err = 0;
 		next = csum_and_copy_from_user(v.iov_base,
 					       (to += v.iov_len) - v.iov_len,
-					       v.iov_len, ~0U, &err);
-		if (!err) {
+					       v.iov_len);
+		if (next) {
 			sum = csum_block_add(sum, next, off);
 			off += v.iov_len;
 		}
-		err ? v.iov_len : 0;
+		next ? 0 : v.iov_len;
 	}), ({
 		char *p = kmap_atomic(v.bv_page);
 		sum = csum_and_memcpy((to += v.bv_len) - v.bv_len,
@@ -1490,11 +1489,10 @@ bool csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum,
 	if (unlikely(i->count < bytes))
 		return false;
 	iterate_all_kinds(i, bytes, v, ({
-		int err = 0;
 		next = csum_and_copy_from_user(v.iov_base,
 					       (to += v.iov_len) - v.iov_len,
-					       v.iov_len, ~0U, &err);
-		if (err)
+					       v.iov_len);
+		if (!next)
 			return false;
 		sum = csum_block_add(sum, next, off);
 		off += v.iov_len;
@@ -1536,15 +1534,14 @@ size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *csump,
 		return 0;
 	}
 	iterate_and_advance(i, bytes, v, ({
-		int err = 0;
 		next = csum_and_copy_to_user((from += v.iov_len) - v.iov_len,
 					     v.iov_base,
-					     v.iov_len, ~0U, &err);
-		if (!err) {
+					     v.iov_len);
+		if (next) {
 			sum = csum_block_add(sum, next, off);
 			off += v.iov_len;
 		}
-		err ? v.iov_len : 0;
+		next ? 0 : v.iov_len;
 	}), ({
 		char *p = kmap_atomic(v.bv_page);
 		sum = csum_and_memcpy(p + v.bv_offset,
-- 
2.11.0


* [PATCH 05/18] saner calling conventions for csum_and_copy_..._user()
  2020-07-21 20:25   ` [PATCH 05/18] saner calling conventions for csum_and_copy_..._user() Al Viro
@ 2020-07-21 20:25     ` Al Viro
  0 siblings, 0 replies; 102+ messages in thread
From: Al Viro @ 2020-07-21 20:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

All callers of these primitives will
	* discard anything we might've copied in case of error
	* ignore the csum value in case of error
	* always pass 0xffffffff as the initial sum, so the
resulting csum value (in case of success, that is) will never be 0.
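
The "never 0" claim can be checked outside the kernel.  What follows is
a minimal userspace sketch, not kernel code: add32_with_carry() and
csum_sketch() are hypothetical names, and the 16-bit little-endian word
summation is an assumption standing in for whatever a given
architecture's csum_partial() really does.  The point is only that an
end-around-carry sum seeded with a non-zero value cannot wrap to 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit add with end-around carry (ones' complement addition). */
static uint32_t add32_with_carry(uint32_t a, uint32_t b)
{
	uint64_t s = (uint64_t)a + b;

	/* If the add carried out, the low word is at most 0xfffffffe. */
	return (uint32_t)s + (uint32_t)(s >> 32);
}

/* Hypothetical stand-in for csum_partial(); sums 16-bit LE words. */
static uint32_t csum_sketch(const unsigned char *buf, size_t len,
			    uint32_t sum)
{
	size_t i;

	assert(buf != NULL || len == 0);
	for (i = 0; i + 1 < len; i += 2)
		sum = add32_with_carry(sum,
				       buf[i] | ((uint32_t)buf[i + 1] << 8));
	if (len & 1)	/* trailing odd byte */
		sum = add32_with_carry(sum, buf[len - 1]);
	return sum;
}
```

With a non-zero seed each step either does not carry (the result is at
least the old sum) or carries (the result is at least 1), so 0 is free
to serve as the error value.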

That suggests the following calling conventions:
	* don't pass err_ptr - just return 0 on error.
	* don't bother with zeroing destination, etc. in case of error
	* don't pass the initial sum - just use 0xffffffff.
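
As an illustration of the resulting contract, here is a userspace
sketch under stated assumptions.  copy_may_fault(), its fault_at
argument, csum_and_copy_sketch() and copy_chunk() are all hypothetical;
fault_at exists only to simulate a faulting user copy, and the real
callers in lib/iov_iter.c combine sums with csum_block_add() rather
than the plain add used here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit add with end-around carry. */
static uint32_t add32_with_carry(uint32_t a, uint32_t b)
{
	uint64_t s = (uint64_t)a + b;

	return (uint32_t)s + (uint32_t)(s >> 32);
}

/*
 * Hypothetical stand-in for copy_from_user(): copies at most fault_at
 * bytes and returns the number of bytes left uncopied.
 */
static size_t copy_may_fault(void *dst, const void *src, size_t len,
			     size_t fault_at)
{
	size_t ok = len < fault_at ? len : fault_at;

	memcpy(dst, src, ok);
	return len - ok;
}

/*
 * New-style convention: no initial sum, no error pointer.
 * Returns 0 on failure, otherwise a sum seeded with 0xffffffff.
 */
static uint32_t csum_and_copy_sketch(void *dst, const void *src, size_t len,
				     size_t fault_at)
{
	const unsigned char *p = dst;
	uint32_t sum = ~0U;
	size_t i;

	assert(dst && src);
	if (copy_may_fault(dst, src, len, fault_at))
		return 0;	/* destination is left as it is, not zeroed */
	for (i = 0; i + 1 < len; i += 2)
		sum = add32_with_carry(sum, p[i] | ((uint32_t)p[i + 1] << 8));
	if (len & 1)
		sum = add32_with_carry(sum, p[len - 1]);
	return sum;
}

/* Caller side: a zero return is the only error signal to check. */
static bool copy_chunk(void *dst, const void *src, size_t len,
		       size_t fault_at, uint32_t *total)
{
	uint32_t next = csum_and_copy_sketch(dst, src, len, fault_at);

	if (!next)
		return false;
	*total = add32_with_carry(*total, next);
	return true;
}
```

The caller no longer needs a local err variable; testing the returned
sum against 0 is enough, which is the shape the lib/iov_iter.c hunks in
this patch take.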

This commit does the minimal conversion of the csum_and_copy_...() instances;
the changes to the actual asm code behind them are done later in the series.
Note that this asm code is often shared with csum_partial_copy_nocheck();
the difference is that csum_partial_copy_nocheck() passes 0 for initial
sum while csum_and_copy_..._user() pass 0xffffffff.  Fortunately, we are
free to pass 0xffffffff in all cases and subsequent patches will use that
freedom without any special comments.
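
The reason either seed is acceptable is that 0xffffffff is a multiple
of 65535 (0xffffffff = 65535 * 65537), so it is congruent to 0 modulo
65535, and __wsum values are only defined modulo 65535.  The following
userspace sketch shows that; csum_sketch(), fold16() and
same_mod_65535() are hypothetical helpers, and fold16() leaves out the
final complement that the kernel's csum_fold() applies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit add with end-around carry. */
static uint32_t add32_with_carry(uint32_t a, uint32_t b)
{
	uint64_t s = (uint64_t)a + b;

	return (uint32_t)s + (uint32_t)(s >> 32);
}

/* Hypothetical stand-in for csum_partial(); sums 16-bit LE words. */
static uint32_t csum_sketch(const unsigned char *buf, size_t len,
			    uint32_t sum)
{
	size_t i;

	assert(buf != NULL || len == 0);
	for (i = 0; i + 1 < len; i += 2)
		sum = add32_with_carry(sum,
				       buf[i] | ((uint32_t)buf[i + 1] << 8));
	if (len & 1)
		sum = add32_with_carry(sum, buf[len - 1]);
	return sum;
}

/* Fold a 32-bit sum to 16 bits, without the final complement. */
static uint16_t fold16(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* 0 and 0xffff both represent zero, hence the reduction mod 65535. */
static bool same_mod_65535(uint32_t a, uint32_t b)
{
	return fold16(a) % 65535 == fold16(b) % 65535;
}
```

For all-zero data the two seeds fold to 0 and 0xffff respectively, the
two representations of zero; for other data they fold to the same
residue modulo 65535.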

A part that could be split off: parisc and uml/i386 claimed to have
csum_and_copy_to_user() instances of their own, but those were identical
to the generic one, so we simply drop them.  Not sure if it's worth
a separate commit...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/alpha/include/asm/checksum.h    |  2 +-
 arch/alpha/lib/csum_partial_copy.c   | 25 ++++++-------
 arch/arm/include/asm/checksum.h      | 13 ++++---
 arch/m68k/include/asm/checksum.h     |  3 +-
 arch/m68k/lib/checksum.c             |  8 ++---
 arch/mips/include/asm/checksum.h     | 46 ++++++++++++------------
 arch/parisc/include/asm/checksum.h   | 20 -----------
 arch/powerpc/include/asm/checksum.h  |  4 +--
 arch/powerpc/lib/checksum_wrappers.c | 68 +++++++++++-------------------------
 arch/sh/include/asm/checksum_32.h    | 36 +++++++++----------
 arch/sparc/include/asm/checksum_32.h | 65 ++++++++++++++++------------------
 arch/sparc/include/asm/checksum_64.h | 14 ++++----
 arch/x86/include/asm/checksum_32.h   | 35 ++++++++-----------
 arch/x86/include/asm/checksum_64.h   |  6 ++--
 arch/x86/lib/csum-wrappers_64.c      | 38 +++++++++-----------
 arch/x86/um/asm/checksum_32.h        | 23 ------------
 arch/xtensa/include/asm/checksum.h   | 30 ++++++++--------
 include/net/checksum.h               | 15 ++++----
 lib/iov_iter.c                       | 19 +++++-----
 19 files changed, 183 insertions(+), 287 deletions(-)

diff --git a/arch/alpha/include/asm/checksum.h b/arch/alpha/include/asm/checksum.h
index fdb301fd819b..f8659ae21134 100644
--- a/arch/alpha/include/asm/checksum.h
+++ b/arch/alpha/include/asm/checksum.h
@@ -42,7 +42,7 @@ extern __wsum csum_partial(const void *buff, int len, __wsum sum);
  * better 64-bit) boundary
  */
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
-__wsum csum_and_copy_from_user(const void __user *src, void *dst, int len, __wsum sum, int *errp);
+__wsum csum_and_copy_from_user(const void __user *src, void *dst, int len);
 
 __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len);
 
diff --git a/arch/alpha/lib/csum_partial_copy.c b/arch/alpha/lib/csum_partial_copy.c
index f363dc89fcbe..3c0e89c39ddb 100644
--- a/arch/alpha/lib/csum_partial_copy.c
+++ b/arch/alpha/lib/csum_partial_copy.c
@@ -325,30 +325,27 @@ csum_partial_cfu_unaligned(const unsigned long __user * src,
 }
 
 __wsum
-csum_and_copy_from_user(const void __user *src, void *dst, int len,
-			       __wsum sum, int *errp)
+csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
-	unsigned long checksum = (__force u32) sum;
+	unsigned long checksum = ~0U;
 	unsigned long soff = 7 & (unsigned long) src;
 	unsigned long doff = 7 & (unsigned long) dst;
+	int err = 0;
 
 	if (len) {
-		if (!access_ok(src, len)) {
-			if (errp) *errp = -EFAULT;
-			memset(dst, 0, len);
-			return sum;
-		}
+		if (!access_ok(src, len))
+			return 0;
 		if (!doff) {
 			if (!soff)
 				checksum = csum_partial_cfu_aligned(
 					(const unsigned long __user *) src,
 					(unsigned long *) dst,
-					len-8, checksum, errp);
+					len-8, checksum, &err);
 			else
 				checksum = csum_partial_cfu_dest_aligned(
 					(const unsigned long __user *) src,
 					(unsigned long *) dst,
-					soff, len-8, checksum, errp);
+					soff, len-8, checksum, &err);
 		} else {
 			unsigned long partial_dest;
 			ldq_u(partial_dest, dst);
@@ -357,15 +354,15 @@ csum_and_copy_from_user(const void __user *src, void *dst, int len,
 					(const unsigned long __user *) src,
 					(unsigned long *) dst,
 					doff, len-8, checksum,
-					partial_dest, errp);
+					partial_dest, &err);
 			else
 				checksum = csum_partial_cfu_unaligned(
 					(const unsigned long __user *) src,
 					(unsigned long *) dst,
 					soff, doff, len-8, checksum,
-					partial_dest, errp);
+					partial_dest, &err);
 		}
-		checksum = from64to16 (checksum);
+		checksum = err ? 0 : from64to16 (checksum);
 	}
 	return (__force __wsum)checksum;
 }
@@ -378,7 +375,7 @@ csum_partial_copy_nocheck(const void *src, void *dst, int len)
 	mm_segment_t oldfs = get_fs();
 	set_fs(KERNEL_DS);
 	checksum = csum_and_copy_from_user((__force const void __user *)src,
-						dst, len, 0, NULL);
+						dst, len);
 	set_fs(oldfs);
 	return checksum;
 }
diff --git a/arch/arm/include/asm/checksum.h b/arch/arm/include/asm/checksum.h
index 1156b9a9a43b..737db6c3c482 100644
--- a/arch/arm/include/asm/checksum.h
+++ b/arch/arm/include/asm/checksum.h
@@ -42,16 +42,15 @@ csum_partial_copy_from_user(const void __user *src, void *dst, int len, __wsum s
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 static inline
-__wsum csum_and_copy_from_user (const void __user *src, void *dst,
-				      int len, __wsum sum, int *err_ptr)
+__wsum csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
-	if (access_ok(src, len))
-		return csum_partial_copy_from_user(src, dst, len, sum, err_ptr);
+	int err = 0;
 
-	if (len)
-		*err_ptr = -EFAULT;
+	if (!access_ok(src, len))
+		return 0;
 
-	return sum;
+	sum = csum_partial_copy_from_user(src, dst, len, ~0U, &err);
+	return err ? 0 : sum;
 }
 
 /*
diff --git a/arch/m68k/include/asm/checksum.h b/arch/m68k/include/asm/checksum.h
index 77c61473ee0f..4e13ad046291 100644
--- a/arch/m68k/include/asm/checksum.h
+++ b/arch/m68k/include/asm/checksum.h
@@ -33,8 +33,7 @@ __wsum csum_partial(const void *buff, int len, __wsum sum);
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 extern __wsum csum_and_copy_from_user(const void __user *src,
 						void *dst,
-						int len, __wsum sum,
-						int *csum_err);
+						int len);
 
 extern __wsum csum_partial_copy_nocheck(const void *src,
 					      void *dst, int len);
diff --git a/arch/m68k/lib/checksum.c b/arch/m68k/lib/checksum.c
index 86ddd2ee187d..3aeca261f622 100644
--- a/arch/m68k/lib/checksum.c
+++ b/arch/m68k/lib/checksum.c
@@ -129,8 +129,7 @@ EXPORT_SYMBOL(csum_partial);
  */
 
 __wsum
-csum_and_copy_from_user(const void __user *src, void *dst,
-			    int len, __wsum sum, int *csum_err)
+csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
 	/*
 	 * GCC doesn't like more than 10 operands for the asm
@@ -138,6 +137,7 @@ csum_and_copy_from_user(const void __user *src, void *dst,
 	 * code.
 	 */
 	unsigned long tmp1, tmp2;
+	__wsum sum = ~0U;
 
 	__asm__("movel %2,%4\n\t"
 		"btst #1,%4\n\t"	/* Check alignment */
@@ -311,9 +311,7 @@ csum_and_copy_from_user(const void __user *src, void *dst,
 		: "0" (sum), "1" (len), "2" (src), "3" (dst)
 	    );
 
-	*csum_err = tmp2;
-
-	return(sum);
+	return tmp2 ? 0 : sum;
 }
 
 EXPORT_SYMBOL(csum_and_copy_from_user);
diff --git a/arch/mips/include/asm/checksum.h b/arch/mips/include/asm/checksum.h
index 1dcdd7755793..1e5558f90126 100644
--- a/arch/mips/include/asm/checksum.h
+++ b/arch/mips/include/asm/checksum.h
@@ -60,16 +60,15 @@ __wsum csum_partial_copy_from_user(const void __user *src, void *dst, int len,
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 static inline
-__wsum csum_and_copy_from_user(const void __user *src, void *dst,
-			       int len, __wsum sum, int *err_ptr)
+__wsum csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
-	if (access_ok(src, len))
-		return csum_partial_copy_from_user(src, dst, len, sum,
-						   err_ptr);
-	if (len)
-		*err_ptr = -EFAULT;
+	__wsum sum = ~0U;
+	int err = 0;
 
-	return sum;
+	if (!access_ok(src, len))
+		return 0;
+	sum = csum_partial_copy_from_user(src, dst, len, sum, &err);
+	return err ? 0 : sum;
 }
 
 /*
@@ -77,24 +76,23 @@ __wsum csum_and_copy_from_user(const void __user *src, void *dst,
  */
 #define HAVE_CSUM_COPY_USER
 static inline
-__wsum csum_and_copy_to_user(const void *src, void __user *dst, int len,
-			     __wsum sum, int *err_ptr)
+__wsum csum_and_copy_to_user(const void *src, void __user *dst, int len)
 {
+	int err = 0;
+	__wsum sum = ~0U;
+
 	might_fault();
-	if (access_ok(dst, len)) {
-		if (uaccess_kernel())
-			return __csum_partial_copy_kernel(src,
-							  (__force void *)dst,
-							  len, sum, err_ptr);
-		else
-			return __csum_partial_copy_to_user(src,
-							   (__force void *)dst,
-							   len, sum, err_ptr);
-	}
-	if (len)
-		*err_ptr = -EFAULT;
-
-	return (__force __wsum)-1; /* invalid checksum */
+	if (!access_ok(dst, len))
+		return 0;
+	if (uaccess_kernel())
+		sum = __csum_partial_copy_kernel(src,
+						  (__force void *)dst,
+						  len, sum, &err);
+	else
+		sum = __csum_partial_copy_to_user(src,
+						   (__force void *)dst,
+						   len, sum, &err);
+	return err ? 0 : sum;
 }
 
 /*
diff --git a/arch/parisc/include/asm/checksum.h b/arch/parisc/include/asm/checksum.h
index b412e3a1bd14..44c1b4836fb5 100644
--- a/arch/parisc/include/asm/checksum.h
+++ b/arch/parisc/include/asm/checksum.h
@@ -181,25 +181,5 @@ static __inline__ __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
 	return csum_fold(sum);
 }
 
-/* 
- *	Copy and checksum to user
- */
-#define HAVE_CSUM_COPY_USER
-static __inline__ __wsum csum_and_copy_to_user(const void *src,
-						      void __user *dst,
-						      int len, __wsum sum,
-						      int *err_ptr)
-{
-	/* code stolen from include/asm-mips64 */
-	sum = csum_partial(src, len, sum);
-	 
-	if (copy_to_user(dst, src, len)) {
-		*err_ptr = -EFAULT;
-		return (__force __wsum)-1;
-	}
-
-	return sum;
-}
-
 #endif
 
diff --git a/arch/powerpc/include/asm/checksum.h b/arch/powerpc/include/asm/checksum.h
index 40540b7242a3..97343e1a7d1c 100644
--- a/arch/powerpc/include/asm/checksum.h
+++ b/arch/powerpc/include/asm/checksum.h
@@ -24,10 +24,10 @@ extern __wsum csum_partial_copy_generic(const void *src, void *dst,
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 extern __wsum csum_and_copy_from_user(const void __user *src, void *dst,
-				      int len, __wsum sum, int *err_ptr);
+				      int len);
 #define HAVE_CSUM_COPY_USER
 extern __wsum csum_and_copy_to_user(const void *src, void __user *dst,
-				    int len, __wsum sum, int *err_ptr);
+				    int len);
 
 #define csum_partial_copy_nocheck(src, dst, len)   \
         csum_partial_copy_generic((src), (dst), (len), 0, NULL, NULL)
diff --git a/arch/powerpc/lib/checksum_wrappers.c b/arch/powerpc/lib/checksum_wrappers.c
index fabe4db28726..b1faa82dd8af 100644
--- a/arch/powerpc/lib/checksum_wrappers.c
+++ b/arch/powerpc/lib/checksum_wrappers.c
@@ -12,82 +12,56 @@
 #include <linux/uaccess.h>
 
 __wsum csum_and_copy_from_user(const void __user *src, void *dst,
-			       int len, __wsum sum, int *err_ptr)
+			       int len)
 {
 	unsigned int csum;
+	int err = 0;
 
 	might_sleep();
-	allow_read_from_user(src, len);
-
-	*err_ptr = 0;
 
-	if (!len) {
-		csum = 0;
-		goto out;
-	}
+	if (unlikely(!access_ok(src, len)))
+		return 0;
 
-	if (unlikely((len < 0) || !access_ok(src, len))) {
-		*err_ptr = -EFAULT;
-		csum = (__force unsigned int)sum;
-		goto out;
-	}
+	allow_read_from_user(src, len);
 
 	csum = csum_partial_copy_generic((void __force *)src, dst,
-					 len, sum, err_ptr, NULL);
+					 len, ~0U, &err, NULL);
 
-	if (unlikely(*err_ptr)) {
+	if (unlikely(err)) {
 		int missing = __copy_from_user(dst, src, len);
 
-		if (missing) {
-			memset(dst + len - missing, 0, missing);
-			*err_ptr = -EFAULT;
-		} else {
-			*err_ptr = 0;
-		}
-
-		csum = csum_partial(dst, len, sum);
+		if (missing)
+			csum = 0;
+		else
+			csum = csum_partial(dst, len, ~0U);
 	}
 
-out:
 	prevent_read_from_user(src, len);
 	return (__force __wsum)csum;
 }
 EXPORT_SYMBOL(csum_and_copy_from_user);
 
-__wsum csum_and_copy_to_user(const void *src, void __user *dst, int len,
-			     __wsum sum, int *err_ptr)
+__wsum csum_and_copy_to_user(const void *src, void __user *dst, int len)
 {
 	unsigned int csum;
+	int err = 0;
 
 	might_sleep();
-	allow_write_to_user(dst, len);
-
-	*err_ptr = 0;
-
-	if (!len) {
-		csum = 0;
-		goto out;
-	}
+	if (unlikely(!access_ok(dst, len)))
+		return 0;
 
-	if (unlikely((len < 0) || !access_ok(dst, len))) {
-		*err_ptr = -EFAULT;
-		csum = -1; /* invalid checksum */
-		goto out;
-	}
+	allow_write_to_user(dst, len);
 
 	csum = csum_partial_copy_generic(src, (void __force *)dst,
-					 len, sum, NULL, err_ptr);
+					 len, ~0U, NULL, &err);
 
-	if (unlikely(*err_ptr)) {
-		csum = csum_partial(src, len, sum);
+	if (unlikely(err)) {
+		csum = csum_partial(src, len, ~0U);
 
-		if (copy_to_user(dst, src, len)) {
-			*err_ptr = -EFAULT;
-			csum = -1; /* invalid checksum */
-		}
+		if (copy_to_user(dst, src, len))
+			csum = 0;
 	}
 
-out:
 	prevent_write_to_user(dst, len);
 	return (__force __wsum)csum;
 }
diff --git a/arch/sh/include/asm/checksum_32.h b/arch/sh/include/asm/checksum_32.h
index 682f88ebb7de..97950bdf62e5 100644
--- a/arch/sh/include/asm/checksum_32.h
+++ b/arch/sh/include/asm/checksum_32.h
@@ -49,15 +49,16 @@ __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 static inline
-__wsum csum_and_copy_from_user(const void __user *src, void *dst,
-				   int len, __wsum sum, int *err_ptr)
+__wsum csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
-	if (access_ok(src, len))
-		return csum_partial_copy_generic((__force const void *)src, dst,
-					len, sum, err_ptr, NULL);
-	if (len)
-		*err_ptr = -EFAULT;
-	return sum;
+	int err = 0;
+	__wsum sum = ~0U;
+
+	if (!access_ok(src, len))
+		return 0;
+	sum = csum_partial_copy_generic((__force const void *)src, dst,
+					len, sum, &err, NULL);
+	return err ? 0 : sum;
 }
 
 /*
@@ -198,16 +199,15 @@ static inline __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
 #define HAVE_CSUM_COPY_USER
 static inline __wsum csum_and_copy_to_user(const void *src,
 					   void __user *dst,
-					   int len, __wsum sum,
-					   int *err_ptr)
+					   int len)
 {
-	if (access_ok(dst, len))
-		return csum_partial_copy_generic((__force const void *)src,
-						dst, len, sum, NULL, err_ptr);
-
-	if (len)
-		*err_ptr = -EFAULT;
-
-	return (__force __wsum)-1; /* invalid checksum */
+	int err = 0;
+	__wsum sum = ~0U;
+
+	if (!access_ok(dst, len))
+		return 0;
+	sum = csum_partial_copy_generic((__force const void *)src,
+						dst, len, sum, NULL, &err);
+	return err ? 0 : sum;
 }
 #endif /* __ASM_SH_CHECKSUM_H */
diff --git a/arch/sparc/include/asm/checksum_32.h b/arch/sparc/include/asm/checksum_32.h
index d21d114436ba..b5873b7b7bf0 100644
--- a/arch/sparc/include/asm/checksum_32.h
+++ b/arch/sparc/include/asm/checksum_32.h
@@ -60,19 +60,16 @@ csum_partial_copy_nocheck(const void *src, void *dst, int len)
 }
 
 static inline __wsum
-csum_and_copy_from_user(const void __user *src, void *dst, int len,
-			    __wsum sum, int *err)
+csum_and_copy_from_user(const void __user *src, void *dst, int len)
   {
 	register unsigned long ret asm("o0") = (unsigned long)src;
 	register char *d asm("o1") = dst;
 	register int l asm("g1") = len;
-	register __wsum s asm("g7") = sum;
+	register __wsum s asm("g7") = ~0U;
+	int err = 0;
 
-	if (unlikely(!access_ok(src, len))) {
-		if (len)
-			*err = -EFAULT;
-		return sum;
-	}
+	if (unlikely(!access_ok(src, len)))
+		return 0;
 
 	__asm__ __volatile__ (
 	".section __ex_table,#alloc\n\t"
@@ -83,42 +80,40 @@ csum_and_copy_from_user(const void __user *src, void *dst, int len,
 	"call __csum_partial_copy_sparc_generic\n\t"
 	" st %8, [%%sp + 64]\n"
 	: "=&r" (ret), "=&r" (d), "=&r" (l), "=&r" (s)
-	: "0" (ret), "1" (d), "2" (l), "3" (s), "r" (err)
+	: "0" (ret), "1" (d), "2" (l), "3" (s), "r" (&err)
 	: "o2", "o3", "o4", "o5", "o7", "g2", "g3", "g4", "g5",
 	  "cc", "memory");
-	return (__force __wsum)ret;
+	return err ? 0 : (__force __wsum)ret;
 }
 
 #define HAVE_CSUM_COPY_USER
 
 static inline __wsum
-csum_and_copy_to_user(const void *src, void __user *dst, int len,
-			  __wsum sum, int *err)
+csum_and_copy_to_user(const void *src, void __user *dst, int len)
 {
-	if (!access_ok(dst, len)) {
-		*err = -EFAULT;
-		return sum;
-	} else {
-		register unsigned long ret asm("o0") = (unsigned long)src;
-		register char __user *d asm("o1") = dst;
-		register int l asm("g1") = len;
-		register __wsum s asm("g7") = sum;
+	register unsigned long ret asm("o0") = (unsigned long)src;
+	register char __user *d asm("o1") = dst;
+	register int l asm("g1") = len;
+	register __wsum s asm("g7") = ~0U;
+	int err = 0;
 
-		__asm__ __volatile__ (
-		".section __ex_table,#alloc\n\t"
-		".align 4\n\t"
-		".word 1f,1\n\t"
-		".previous\n"
-		"1:\n\t"
-		"call __csum_partial_copy_sparc_generic\n\t"
-		" st %8, [%%sp + 64]\n"
-		: "=&r" (ret), "=&r" (d), "=&r" (l), "=&r" (s)
-		: "0" (ret), "1" (d), "2" (l), "3" (s), "r" (err)
-		: "o2", "o3", "o4", "o5", "o7",
-		  "g2", "g3", "g4", "g5",
-		  "cc", "memory");
-		return (__force __wsum)ret;
-	}
+	if (!access_ok(dst, len))
+		return 0;
+
+	__asm__ __volatile__ (
+	".section __ex_table,#alloc\n\t"
+	".align 4\n\t"
+	".word 1f,1\n\t"
+	".previous\n"
+	"1:\n\t"
+	"call __csum_partial_copy_sparc_generic\n\t"
+	" st %8, [%%sp + 64]\n"
+	: "=&r" (ret), "=&r" (d), "=&r" (l), "=&r" (s)
+	: "0" (ret), "1" (d), "2" (l), "3" (s), "r" (&err)
+	: "o2", "o3", "o4", "o5", "o7",
+	  "g2", "g3", "g4", "g5",
+	  "cc", "memory");
+	return err ? 0 : (__force __wsum)ret;
 }
 
 /* ihl is always 5 or greater, almost always is 5, and iph is word aligned
diff --git a/arch/sparc/include/asm/checksum_64.h b/arch/sparc/include/asm/checksum_64.h
index 7aebdbe3ac96..4d0bbff43e62 100644
--- a/arch/sparc/include/asm/checksum_64.h
+++ b/arch/sparc/include/asm/checksum_64.h
@@ -51,12 +51,11 @@ long __csum_partial_copy_from_user(const void __user *src,
 
 static inline __wsum
 csum_and_copy_from_user(const void __user *src,
-			    void *dst, int len,
-			    __wsum sum, int *err)
+			    void *dst, int len)
 {
-	long ret = __csum_partial_copy_from_user(src, dst, len, sum);
+	long ret = __csum_partial_copy_from_user(src, dst, len, ~0U);
 	if (ret < 0)
-		*err = -EFAULT;
+		return 0;
 	return (__force __wsum) ret;
 }
 
@@ -70,12 +69,11 @@ long __csum_partial_copy_to_user(const void *src,
 
 static inline __wsum
 csum_and_copy_to_user(const void *src,
-		      void __user *dst, int len,
-		      __wsum sum, int *err)
+		      void __user *dst, int len)
 {
-	long ret = __csum_partial_copy_to_user(src, dst, len, sum);
+	long ret = __csum_partial_copy_to_user(src, dst, len, ~0U);
 	if (ret < 0)
-		*err = -EFAULT;
+		return 0;
 	return (__force __wsum) ret;
 }
 
diff --git a/arch/x86/include/asm/checksum_32.h b/arch/x86/include/asm/checksum_32.h
index 137a3033edcc..5948cde9e4ad 100644
--- a/arch/x86/include/asm/checksum_32.h
+++ b/arch/x86/include/asm/checksum_32.h
@@ -44,22 +44,19 @@ static inline __wsum csum_partial_copy_nocheck(const void *src, void *dst, int l
 }
 
 static inline __wsum csum_and_copy_from_user(const void __user *src,
-					     void *dst, int len,
-					     __wsum sum, int *err_ptr)
+					     void *dst, int len)
 {
 	__wsum ret;
+	int err = 0;
 
 	might_sleep();
-	if (!user_access_begin(src, len)) {
-		if (len)
-			*err_ptr = -EFAULT;
-		return sum;
-	}
+	if (!user_access_begin(src, len))
+		return 0;
 	ret = csum_partial_copy_generic((__force void *)src, dst,
-					len, sum, err_ptr, NULL);
+					len, ~0U, &err, NULL);
 	user_access_end();
 
-	return ret;
+	return err ? 0 : ret;
 }
 
 /*
@@ -177,23 +174,19 @@ static inline __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
  */
 static inline __wsum csum_and_copy_to_user(const void *src,
 					   void __user *dst,
-					   int len, __wsum sum,
-					   int *err_ptr)
+					   int len)
 {
 	__wsum ret;
+	int err = 0;
 
 	might_sleep();
-	if (user_access_begin(dst, len)) {
-		ret = csum_partial_copy_generic(src, (__force void *)dst,
-						len, sum, NULL, err_ptr);
-		user_access_end();
-		return ret;
-	}
+	if (!user_access_begin(dst, len))
+		return 0;
 
-	if (len)
-		*err_ptr = -EFAULT;
-
-	return (__force __wsum)-1; /* invalid checksum */
+	ret = csum_partial_copy_generic(src, (__force void *)dst,
+					len, ~0U, NULL, &err);
+	user_access_end();
+	return err ? 0 : ret;
 }
 
 #endif /* _ASM_X86_CHECKSUM_32_H */
diff --git a/arch/x86/include/asm/checksum_64.h b/arch/x86/include/asm/checksum_64.h
index 5339f5dfc776..9af3aed54c6b 100644
--- a/arch/x86/include/asm/checksum_64.h
+++ b/arch/x86/include/asm/checksum_64.h
@@ -135,10 +135,8 @@ extern __visible __wsum csum_partial_copy_generic(const void *src, const void *d
 					int *src_err_ptr, int *dst_err_ptr);
 
 
-extern __wsum csum_and_copy_from_user(const void __user *src, void *dst,
-					  int len, __wsum isum, int *errp);
-extern __wsum csum_and_copy_to_user(const void *src, void __user *dst,
-					int len, __wsum isum, int *errp);
+extern __wsum csum_and_copy_from_user(const void __user *src, void *dst, int len);
+extern __wsum csum_and_copy_to_user(const void *src, void __user *dst, int len);
 extern __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len);
 
 /**
diff --git a/arch/x86/lib/csum-wrappers_64.c b/arch/x86/lib/csum-wrappers_64.c
index 245f929a1c2c..ae2fb87e2274 100644
--- a/arch/x86/lib/csum-wrappers_64.c
+++ b/arch/x86/lib/csum-wrappers_64.c
@@ -22,13 +22,15 @@
  */
 __wsum
 csum_and_copy_from_user(const void __user *src, void *dst,
-			    int len, __wsum isum, int *errp)
+			    int len)
 {
+	int err = 0;
+	__wsum isum = ~0U;
+
 	might_sleep();
-	*errp = 0;
 
 	if (!user_access_begin(src, len))
-		goto out_err;
+		return 0;
 
 	/*
 	 * Why 6, not 7? To handle odd addresses aligned we
@@ -53,20 +55,15 @@ csum_and_copy_from_user(const void __user *src, void *dst,
 		}
 	}
 	isum = csum_partial_copy_generic((__force const void *)src,
-				dst, len, isum, errp, NULL);
+				dst, len, isum, &err, NULL);
 	user_access_end();
-	if (unlikely(*errp))
-		goto out_err;
-
+	if (unlikely(err))
+		isum = 0;
 	return isum;
 
 out:
 	user_access_end();
-out_err:
-	*errp = -EFAULT;
-	memset(dst, 0, len);
-
-	return isum;
+	return 0;
 }
 EXPORT_SYMBOL(csum_and_copy_from_user);
 
@@ -83,16 +80,15 @@ EXPORT_SYMBOL(csum_and_copy_from_user);
  */
 __wsum
 csum_and_copy_to_user(const void *src, void __user *dst,
-			  int len, __wsum isum, int *errp)
+			  int len)
 {
-	__wsum ret;
+	__wsum ret, isum = ~0U;
+	int err = 0;
 
 	might_sleep();
 
-	if (!user_access_begin(dst, len)) {
-		*errp = -EFAULT;
+	if (!user_access_begin(dst, len))
 		return 0;
-	}
 
 	if (unlikely((unsigned long)dst & 6)) {
 		while (((unsigned long)dst & 6) && len >= 2) {
@@ -107,15 +103,13 @@ csum_and_copy_to_user(const void *src, void __user *dst,
 		}
 	}
 
-	*errp = 0;
 	ret = csum_partial_copy_generic(src, (void __force *)dst,
-					len, isum, NULL, errp);
+					len, isum, NULL, &err);
 	user_access_end();
-	return ret;
+	return err ? 0 : ret;
 out:
 	user_access_end();
-	*errp = -EFAULT;
-	return isum;
+	return 0;
 }
 EXPORT_SYMBOL(csum_and_copy_to_user);
 
diff --git a/arch/x86/um/asm/checksum_32.h b/arch/x86/um/asm/checksum_32.h
index b9ac7c9eb72c..0b13c2947ad1 100644
--- a/arch/x86/um/asm/checksum_32.h
+++ b/arch/x86/um/asm/checksum_32.h
@@ -35,27 +35,4 @@ static __inline__ __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
 	return csum_fold(sum);
 }
 
-/*
- *	Copy and checksum to user
- */
-#define HAVE_CSUM_COPY_USER
-static __inline__ __wsum csum_and_copy_to_user(const void *src,
-						     void __user *dst,
-						     int len, __wsum sum, int *err_ptr)
-{
-	if (access_ok(dst, len)) {
-		if (copy_to_user(dst, src, len)) {
-			*err_ptr = -EFAULT;
-			return (__force __wsum)-1;
-		}
-
-		return csum_partial(src, len, sum);
-	}
-
-	if (len)
-		*err_ptr = -EFAULT;
-
-	return (__force __wsum)-1; /* invalid checksum */
-}
-
 #endif
diff --git a/arch/xtensa/include/asm/checksum.h b/arch/xtensa/include/asm/checksum.h
index 84e6a36fee6d..7958b18a5804 100644
--- a/arch/xtensa/include/asm/checksum.h
+++ b/arch/xtensa/include/asm/checksum.h
@@ -54,14 +54,16 @@ __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 static inline
 __wsum csum_and_copy_from_user(const void __user *src, void *dst,
-				   int len, __wsum sum, int *err_ptr)
+				   int len)
 {
-	if (access_ok(dst, len))
-		return csum_partial_copy_generic((__force const void *)src, dst,
-					len, sum, err_ptr, NULL);
-	if (len)
-		*err_ptr = -EFAULT;
-	return sum;
+	__wsum sum;
+	int err = 0;
+
+	if (!access_ok(src, len))
+		return 0;
+	sum = csum_partial_copy_generic((__force const void *)src, dst,
+					len, ~0U, &err, NULL);
+	return err ? 0 : sum;
 }
 
 /*
@@ -242,15 +244,15 @@ static __inline__ __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
  */
 #define HAVE_CSUM_COPY_USER
 static __inline__ __wsum csum_and_copy_to_user(const void *src,
-					       void __user *dst, int len,
-					       __wsum sum, int *err_ptr)
+					       void __user *dst, int len)
 {
-	if (access_ok(dst, len))
-		return csum_partial_copy_generic(src,dst,len,sum,NULL,err_ptr);
+	int err = 0;
+	__wsum sum = ~0U;
 
-	if (len)
-		*err_ptr = -EFAULT;
+	if (!access_ok(dst, len))
+		return 0;
 
-	return (__force __wsum)-1; /* invalid checksum */
+	sum = csum_partial_copy_generic(src,dst,len,sum,NULL,&err);
+	return err ? 0 : sum;
 }
 #endif
diff --git a/include/net/checksum.h b/include/net/checksum.h
index 46754ba9d7b7..5b6664881a1e 100644
--- a/include/net/checksum.h
+++ b/include/net/checksum.h
@@ -24,26 +24,23 @@
 #ifndef _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 static inline
 __wsum csum_and_copy_from_user (const void __user *src, void *dst,
-				      int len, __wsum sum, int *err_ptr)
+				      int len)
 {
 	if (copy_from_user(dst, src, len))
-		*err_ptr = -EFAULT;
-	return csum_partial(dst, len, sum);
+		return 0;
+	return csum_partial(dst, len, ~0U);
 }
 #endif
 
 #ifndef HAVE_CSUM_COPY_USER
 static __inline__ __wsum csum_and_copy_to_user
-(const void *src, void __user *dst, int len, __wsum sum, int *err_ptr)
+(const void *src, void __user *dst, int len)
 {
-	sum = csum_partial(src, len, sum);
+	__wsum sum = csum_partial(src, len, ~0U);
 
 	if (copy_to_user(dst, src, len) == 0)
 		return sum;
-	if (len)
-		*err_ptr = -EFAULT;
-
-	return (__force __wsum)-1; /* invalid checksum */
+	return 0;
 }
 #endif
 
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index d5b7e204fea6..eccb0fe5a498 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1448,15 +1448,14 @@ size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
 		return 0;
 	}
 	iterate_and_advance(i, bytes, v, ({
-		int err = 0;
 		next = csum_and_copy_from_user(v.iov_base,
 					       (to += v.iov_len) - v.iov_len,
-					       v.iov_len, ~0U, &err);
-		if (!err) {
+					       v.iov_len);
+		if (next) {
 			sum = csum_block_add(sum, next, off);
 			off += v.iov_len;
 		}
-		err ? v.iov_len : 0;
+		next ? 0 : v.iov_len;
 	}), ({
 		char *p = kmap_atomic(v.bv_page);
 		sum = csum_and_memcpy((to += v.bv_len) - v.bv_len,
@@ -1490,11 +1489,10 @@ bool csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum,
 	if (unlikely(i->count < bytes))
 		return false;
 	iterate_all_kinds(i, bytes, v, ({
-		int err = 0;
 		next = csum_and_copy_from_user(v.iov_base,
 					       (to += v.iov_len) - v.iov_len,
-					       v.iov_len, ~0U, &err);
-		if (err)
+					       v.iov_len);
+		if (!next)
 			return false;
 		sum = csum_block_add(sum, next, off);
 		off += v.iov_len;
@@ -1536,15 +1534,14 @@ size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *csump,
 		return 0;
 	}
 	iterate_and_advance(i, bytes, v, ({
-		int err = 0;
 		next = csum_and_copy_to_user((from += v.iov_len) - v.iov_len,
 					     v.iov_base,
-					     v.iov_len, ~0U, &err);
-		if (!err) {
+					     v.iov_len);
+		if (next) {
 			sum = csum_block_add(sum, next, off);
 			off += v.iov_len;
 		}
-		err ? v.iov_len : 0;
+		next ? 0 : v.iov_len;
 	}), ({
 		char *p = kmap_atomic(v.bv_page);
 		sum = csum_and_memcpy(p + v.bv_offset,
-- 
2.11.0

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH 06/18] alpha: propagate the calling convention changes down to csum_partial_copy.c helpers
  2020-07-21 20:25 ` [PATCH 01/18] skb_copy_and_csum_bits(): don't bother with the last argument Al Viro
                     ` (3 preceding siblings ...)
  2020-07-21 20:25   ` [PATCH 05/18] saner calling conventions for csum_and_copy_..._user() Al Viro
@ 2020-07-21 20:25   ` Al Viro
  2020-07-21 20:25     ` Al Viro
  2020-07-21 20:25   ` [PATCH 07/18] arm: propagate the calling convention changes down to csum_partial_copy_from_user() Al Viro
                     ` (11 subsequent siblings)
  16 siblings, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-21 20:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

While we are at it, get rid of set_fs() in csum_partial_copy_nocheck():
move the part of csum_and_copy_from_user() that follows the access_ok()
check into a helper function and have csum_partial_copy_nocheck() call
that helper directly.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
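Note (not part of the patch): the split described in the commit message
can be pictured with a small userspace sketch.  Every name below
(sketch_access_ok(), sketch_csum_and_copy(), sketch_from_user(),
sketch_nocheck()) is hypothetical, and the byte-wise sum is only a
stand-in for the alpha word-at-a-time loops.  The point is the shape:
one unchecked helper, a "from user" wrapper that does the range check,
and a kernel-to-kernel wrapper that calls the helper directly.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for access_ok(): here only NULL is "out of range". */
static int sketch_access_ok(const void *p, size_t len)
{
	(void)len;
	return p != NULL;
}

/* 32-bit add with end-around carry; never yields 0 when a is non-zero. */
static uint32_t sketch_add(uint32_t a, uint32_t b)
{
	uint32_t s = a + b;

	return s + (s < b);
}

/* Common part: copy and checksum, with no range check of its own. */
static uint32_t sketch_csum_and_copy(const void *src, void *dst, size_t len)
{
	const unsigned char *p = src;
	uint32_t sum = ~0U;	/* same seed value as in the patch */
	size_t i;

	memcpy(dst, src, len);
	for (i = 0; i < len; i++)
		sum = sketch_add(sum, p[i]);
	return sum;
}

/* "From user" flavour: range check first, 0 reports failure. */
static uint32_t sketch_from_user(const void *src, void *dst, size_t len)
{
	if (!sketch_access_ok(src, len))
		return 0;
	return sketch_csum_and_copy(src, dst, len);
}

/* Kernel-to-kernel flavour: same helper, no check, no address-limit games. */
static uint32_t sketch_nocheck(const void *src, void *dst, size_t len)
{
	return sketch_csum_and_copy(src, dst, len);
}
```

With this layout both entry points return the same value for the same
data, and only the "from user" one can fail on a bad range.
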
 arch/alpha/lib/csum_partial_copy.c | 157 ++++++++++++++++---------------------
 1 file changed, 69 insertions(+), 88 deletions(-)

diff --git a/arch/alpha/lib/csum_partial_copy.c b/arch/alpha/lib/csum_partial_copy.c
index 3c0e89c39ddb..dc68efbe9367 100644
--- a/arch/alpha/lib/csum_partial_copy.c
+++ b/arch/alpha/lib/csum_partial_copy.c
@@ -39,12 +39,11 @@ __asm__ __volatile__("insql %1,%2,%0":"=r" (z):"r" (x),"r" (y))
 #define insqh(x,y,z) \
 __asm__ __volatile__("insqh %1,%2,%0":"=r" (z):"r" (x),"r" (y))
 
-
-#define __get_user_u(x,ptr)				\
+#define __get_word(insn,x,ptr)				\
 ({							\
 	long __guu_err;					\
 	__asm__ __volatile__(				\
-	"1:	ldq_u %0,%2\n"				\
+	"1:	"#insn" %0,%2\n"			\
 	"2:\n"						\
 	EXC(1b,2b,%0,%1)				\
 		: "=r"(x), "=r"(__guu_err)		\
@@ -52,19 +51,6 @@ __asm__ __volatile__("insqh %1,%2,%0":"=r" (z):"r" (x),"r" (y))
 	__guu_err;					\
 })
 
-#define __put_user_u(x,ptr)				\
-({							\
-	long __puu_err;					\
-	__asm__ __volatile__(				\
-	"1:	stq_u %2,%1\n"				\
-	"2:\n"						\
-	EXC(1b,2b,$31,%0)				\
-		: "=r"(__puu_err)			\
-		: "m"(__m(addr)), "rJ"(x), "0"(0));	\
-	__puu_err;					\
-})
-
-
 static inline unsigned short from64to16(unsigned long x)
 {
 	/* Using extract instructions is a bit more efficient
@@ -95,15 +81,15 @@ static inline unsigned short from64to16(unsigned long x)
  */
 static inline unsigned long
 csum_partial_cfu_aligned(const unsigned long __user *src, unsigned long *dst,
-			 long len, unsigned long checksum,
-			 int *errp)
+			 long len)
 {
+	unsigned long checksum = ~0U;
 	unsigned long carry = 0;
-	int err = 0;
 
 	while (len >= 0) {
 		unsigned long word;
-		err |= __get_user(word, src);
+		if (__get_word(ldq, word, src))
+			return 0;
 		checksum += carry;
 		src++;
 		checksum += word;
@@ -116,7 +102,8 @@ csum_partial_cfu_aligned(const unsigned long __user *src, unsigned long *dst,
 	checksum += carry;
 	if (len) {
 		unsigned long word, tmp;
-		err |= __get_user(word, src);
+		if (__get_word(ldq, word, src))
+			return 0;
 		tmp = *dst;
 		mskql(word, len, word);
 		checksum += word;
@@ -125,7 +112,6 @@ csum_partial_cfu_aligned(const unsigned long __user *src, unsigned long *dst,
 		*dst = word | tmp;
 		checksum += carry;
 	}
-	if (err && errp) *errp = err;
 	return checksum;
 }
 
@@ -137,20 +123,21 @@ static inline unsigned long
 csum_partial_cfu_dest_aligned(const unsigned long __user *src,
 			      unsigned long *dst,
 			      unsigned long soff,
-			      long len, unsigned long checksum,
-			      int *errp)
+			      long len)
 {
 	unsigned long first;
 	unsigned long word, carry;
 	unsigned long lastsrc = 7+len+(unsigned long)src;
-	int err = 0;
+	unsigned long checksum = ~0U;
 
-	err |= __get_user_u(first,src);
+	if (__get_word(ldq_u, first,src))
+		return 0;
 	carry = 0;
 	while (len >= 0) {
 		unsigned long second;
 
-		err |= __get_user_u(second, src+1);
+		if (__get_word(ldq_u, second, src+1))
+			return 0;
 		extql(first, soff, word);
 		len -= 8;
 		src++;
@@ -168,7 +155,8 @@ csum_partial_cfu_dest_aligned(const unsigned long __user *src,
 	if (len) {
 		unsigned long tmp;
 		unsigned long second;
-		err |= __get_user_u(second, lastsrc);
+		if (__get_word(ldq_u, second, lastsrc))
+			return 0;
 		tmp = *dst;
 		extql(first, soff, word);
 		extqh(second, soff, first);
@@ -180,7 +168,6 @@ csum_partial_cfu_dest_aligned(const unsigned long __user *src,
 		*dst = word | tmp;
 		checksum += carry;
 	}
-	if (err && errp) *errp = err;
 	return checksum;
 }
 
@@ -191,18 +178,18 @@ static inline unsigned long
 csum_partial_cfu_src_aligned(const unsigned long __user *src,
 			     unsigned long *dst,
 			     unsigned long doff,
-			     long len, unsigned long checksum,
-			     unsigned long partial_dest,
-			     int *errp)
+			     long len,
+			     unsigned long partial_dest)
 {
 	unsigned long carry = 0;
 	unsigned long word;
 	unsigned long second_dest;
-	int err = 0;
+	unsigned long checksum = ~0U;
 
 	mskql(partial_dest, doff, partial_dest);
 	while (len >= 0) {
-		err |= __get_user(word, src);
+		if (__get_word(ldq, word, src))
+			return 0;
 		len -= 8;
 		insql(word, doff, second_dest);
 		checksum += carry;
@@ -216,7 +203,8 @@ csum_partial_cfu_src_aligned(const unsigned long __user *src,
 	len += 8;
 	if (len) {
 		checksum += carry;
-		err |= __get_user(word, src);
+		if (__get_word(ldq, word, src))
+			return 0;
 		mskql(word, len, word);
 		len -= 8;
 		checksum += word;
@@ -237,7 +225,6 @@ csum_partial_cfu_src_aligned(const unsigned long __user *src,
 	stq_u(partial_dest | second_dest, dst);
 out:
 	checksum += carry;
-	if (err && errp) *errp = err;
 	return checksum;
 }
 
@@ -249,23 +236,23 @@ static inline unsigned long
 csum_partial_cfu_unaligned(const unsigned long __user * src,
 			   unsigned long * dst,
 			   unsigned long soff, unsigned long doff,
-			   long len, unsigned long checksum,
-			   unsigned long partial_dest,
-			   int *errp)
+			   long len, unsigned long partial_dest)
 {
 	unsigned long carry = 0;
 	unsigned long first;
 	unsigned long lastsrc;
-	int err = 0;
+	unsigned long checksum = ~0U;
 
-	err |= __get_user_u(first, src);
+	if (__get_word(ldq_u, first, src))
+		return 0;
 	lastsrc = 7+len+(unsigned long)src;
 	mskql(partial_dest, doff, partial_dest);
 	while (len >= 0) {
 		unsigned long second, word;
 		unsigned long second_dest;
 
-		err |= __get_user_u(second, src+1);
+		if (__get_word(ldq_u, second, src+1))
+			return 0;
 		extql(first, soff, word);
 		checksum += carry;
 		len -= 8;
@@ -286,7 +273,8 @@ csum_partial_cfu_unaligned(const unsigned long __user * src,
 		unsigned long second, word;
 		unsigned long second_dest;
 
-		err |= __get_user_u(second, lastsrc);
+		if (__get_word(ldq_u, second, lastsrc))
+			return 0;
 		extql(first, soff, word);
 		extqh(second, soff, first);
 		word |= first;
@@ -307,7 +295,8 @@ csum_partial_cfu_unaligned(const unsigned long __user * src,
 		unsigned long second, word;
 		unsigned long second_dest;
 
-		err |= __get_user_u(second, lastsrc);
+		if (__get_word(ldq_u, second, lastsrc))
+			return 0;
 		extql(first, soff, word);
 		extqh(second, soff, first);
 		word |= first;
@@ -320,63 +309,55 @@ csum_partial_cfu_unaligned(const unsigned long __user * src,
 		stq_u(partial_dest | word | second_dest, dst);
 		checksum += carry;
 	}
-	if (err && errp) *errp = err;
 	return checksum;
 }
 
-__wsum
-csum_and_copy_from_user(const void __user *src, void *dst, int len)
+static __wsum __csum_and_copy(const void __user *src, void *dst, int len)
 {
-	unsigned long checksum = ~0U;
 	unsigned long soff = 7 & (unsigned long) src;
 	unsigned long doff = 7 & (unsigned long) dst;
-	int err = 0;
-
-	if (len) {
-		if (!access_ok(src, len))
-			return 0;
-		if (!doff) {
-			if (!soff)
-				checksum = csum_partial_cfu_aligned(
-					(const unsigned long __user *) src,
-					(unsigned long *) dst,
-					len-8, checksum, &err);
-			else
-				checksum = csum_partial_cfu_dest_aligned(
-					(const unsigned long __user *) src,
-					(unsigned long *) dst,
-					soff, len-8, checksum, &err);
-		} else {
-			unsigned long partial_dest;
-			ldq_u(partial_dest, dst);
-			if (!soff)
-				checksum = csum_partial_cfu_src_aligned(
-					(const unsigned long __user *) src,
-					(unsigned long *) dst,
-					doff, len-8, checksum,
-					partial_dest, &err);
-			else
-				checksum = csum_partial_cfu_unaligned(
-					(const unsigned long __user *) src,
-					(unsigned long *) dst,
-					soff, doff, len-8, checksum,
-					partial_dest, &err);
-		}
-		checksum = err ? 0 : from64to16 (checksum);
+	unsigned long checksum;
+
+	if (!doff) {
+		if (!soff)
+			checksum = csum_partial_cfu_aligned(
+				(const unsigned long __user *) src,
+				(unsigned long *) dst, len-8);
+		else
+			checksum = csum_partial_cfu_dest_aligned(
+				(const unsigned long __user *) src,
+				(unsigned long *) dst,
+				soff, len-8);
+	} else {
+		unsigned long partial_dest;
+		ldq_u(partial_dest, dst);
+		if (!soff)
+			checksum = csum_partial_cfu_src_aligned(
+				(const unsigned long __user *) src,
+				(unsigned long *) dst,
+				doff, len-8, partial_dest);
+		else
+			checksum = csum_partial_cfu_unaligned(
+				(const unsigned long __user *) src,
+				(unsigned long *) dst,
+				soff, doff, len-8, partial_dest);
 	}
-	return (__force __wsum)checksum;
+	return (__force __wsum)from64to16 (checksum);
+}
+
+__wsum
+csum_and_copy_from_user(const void __user *src, void *dst, int len)
+{
+	if (!access_ok(src, len))
+		return 0;
+	return __csum_and_copy(src, dst, len);
 }
 EXPORT_SYMBOL(csum_and_copy_from_user);
 
 __wsum
 csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	__wsum checksum;
-	mm_segment_t oldfs = get_fs();
-	set_fs(KERNEL_DS);
-	checksum = csum_and_copy_from_user((__force const void __user *)src,
+	return __csum_and_copy((__force const void __user *)src,
 						dst, len);
-	set_fs(oldfs);
-	return checksum;
 }
 EXPORT_SYMBOL(csum_partial_copy_nocheck);
-- 
2.11.0

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH 07/18] arm: propagate the calling convention changes down to csum_partial_copy_from_user()
  2020-07-21 20:25 ` [PATCH 01/18] skb_copy_and_csum_bits(): don't bother with the last argument Al Viro
                     ` (4 preceding siblings ...)
  2020-07-21 20:25   ` [PATCH 06/18] alpha: propagate the calling convention changes down to csum_partial_copy.c helpers Al Viro
@ 2020-07-21 20:25   ` Al Viro
  2020-07-21 20:25   ` [PATCH 08/18] m68k: get rid of zeroing destination on error in csum_and_copy_from_user() Al Viro
                     ` (10 subsequent siblings)
  16 siblings, 0 replies; 102+ messages in thread
From: Al Viro @ 2020-07-21 20:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

... and get rid of the "clean the destination on error" crap.
Simplifies the fault handlers and the function itself...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
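Note (not part of the patch): the fixup comment added to
csumpartialcopyuser.S relies on 0 being impossible as a successful
result once the sum is seeded with 0xffffffff.  A minimal userspace
sketch of why that holds for an end-around-carry accumulation follows.
The names sketch_add32() and sketch_csum() are hypothetical, and the
little-endian 16-bit word loop is an assumption made for illustration;
it is not the ARM assembly.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * 32-bit add with end-around carry.  If a is non-zero the result is
 * non-zero: without a carry the result is a + b >= a, and with a carry
 * the wrapped sum is at most 0xfffffffe, so adding 1 gives at least 1.
 */
static uint32_t sketch_add32(uint32_t a, uint32_t b)
{
	uint32_t s = a + b;

	return s + (s < b);	/* fold the carry back in */
}

/* Accumulate 16-bit little-endian words of buf into seed. */
static uint32_t sketch_csum(const void *buf, size_t len, uint32_t seed)
{
	const unsigned char *p = buf;
	uint32_t sum = seed;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum = sketch_add32(sum, (uint32_t)p[i] | ((uint32_t)p[i + 1] << 8));
	if (i < len)		/* trailing odd byte */
		sum = sketch_add32(sum, p[i]);
	return sum;
}
```

Seeded with 0, an all-zero buffer sums to 0 and would be
indistinguishable from a fault; seeded with 0xffffffff the result stays
non-zero, which is what frees 0 to act as the error value.
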
 arch/arm/include/asm/checksum.h       |  7 ++-----
 arch/arm/lib/csumpartialcopy.S        |  1 -
 arch/arm/lib/csumpartialcopygeneric.S |  1 +
 arch/arm/lib/csumpartialcopyuser.S    | 26 ++++++--------------------
 4 files changed, 9 insertions(+), 26 deletions(-)

diff --git a/arch/arm/include/asm/checksum.h b/arch/arm/include/asm/checksum.h
index 737db6c3c482..d434986e81b2 100644
--- a/arch/arm/include/asm/checksum.h
+++ b/arch/arm/include/asm/checksum.h
@@ -38,19 +38,16 @@ __wsum
 csum_partial_copy_nocheck(const void *src, void *dst, int len);
 
 __wsum
-csum_partial_copy_from_user(const void __user *src, void *dst, int len, __wsum sum, int *err_ptr);
+csum_partial_copy_from_user(const void __user *src, void *dst, int len);
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 static inline
 __wsum csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
-	int err = 0;
-
 	if (!access_ok(src, len))
 		return 0;
 
-	sum = csum_partial_copy_from_user(src, dst, len, ~0U, &err);
-	return err ? 0 : sum;
+	return csum_partial_copy_from_user(src, dst, len);
 }
 
 /*
diff --git a/arch/arm/lib/csumpartialcopy.S b/arch/arm/lib/csumpartialcopy.S
index aab914fbc86b..1ca6aadd649c 100644
--- a/arch/arm/lib/csumpartialcopy.S
+++ b/arch/arm/lib/csumpartialcopy.S
@@ -16,7 +16,6 @@
 
 		.macro	save_regs
 		stmfd	sp!, {r1, r4 - r8, lr}
-		mov	r3, #0
 		.endm
 
 		.macro	load_regs
diff --git a/arch/arm/lib/csumpartialcopygeneric.S b/arch/arm/lib/csumpartialcopygeneric.S
index 0b706a39a677..0fd5c10e90a7 100644
--- a/arch/arm/lib/csumpartialcopygeneric.S
+++ b/arch/arm/lib/csumpartialcopygeneric.S
@@ -86,6 +86,7 @@ sum	.req	r3
 
 FN_ENTRY
 		save_regs
+		mov	sum, #-1
 
 		cmp	len, #8			@ Ensure that we have at least
 		blo	.Lless8			@ 8 bytes to copy.
diff --git a/arch/arm/lib/csumpartialcopyuser.S b/arch/arm/lib/csumpartialcopyuser.S
index 6bd3a93eaa3c..6928781e6bee 100644
--- a/arch/arm/lib/csumpartialcopyuser.S
+++ b/arch/arm/lib/csumpartialcopyuser.S
@@ -62,9 +62,9 @@
 
 /*
  * unsigned int
- * csum_partial_copy_from_user(const char *src, char *dst, int len, int sum, int *err_ptr)
- *  r0 = src, r1 = dst, r2 = len, r3 = sum, [sp] = *err_ptr
- *  Returns : r0 = checksum, [[sp, #0], #0] = 0 or -EFAULT
+ * csum_partial_copy_from_user(const char *src, char *dst, int len)
+ *  r0 = src, r1 = dst, r2 = len
+ *  Returns : r0 = checksum or 0
  */
 
 #define FN_ENTRY	ENTRY(csum_partial_copy_from_user)
@@ -73,25 +73,11 @@
 #include "csumpartialcopygeneric.S"
 
 /*
- * FIXME: minor buglet here
- * We don't return the checksum for the data present in the buffer.  To do
- * so properly, we would have to add in whatever registers were loaded before
- * the fault, which, with the current asm above is not predictable.
+ * We report fault by returning 0 csum - impossible in normal case, since
+ * we start with 0xffffffff for initial sum.
  */
 		.pushsection .text.fixup,"ax"
 		.align	4
-9001:		mov	r4, #-EFAULT
-#ifdef CONFIG_CPU_SW_DOMAIN_PAN
-		ldr	r5, [sp, #9*4]		@ *err_ptr
-#else
-		ldr	r5, [sp, #8*4]		@ *err_ptr
-#endif
-		str	r4, [r5]
-		ldmia	sp, {r1, r2}		@ retrieve dst, len
-		add	r2, r2, r1
-		mov	r0, #0			@ zero the buffer
-9002:		teq	r2, r1
-		strbne	r0, [r1], #1
-		bne	9002b
+9001:		mov	r0, #0
 		load_regs
 		.popsection
-- 
2.11.0

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH 08/18] m68k: get rid of zeroing destination on error in csum_and_copy_from_user()
  2020-07-21 20:25 ` [PATCH 01/18] skb_copy_and_csum_bits(): don't bother with the last argument Al Viro
                     ` (5 preceding siblings ...)
  2020-07-21 20:25   ` [PATCH 07/18] arm: propagate the calling convention changes down to csum_partial_copy_from_user() Al Viro
@ 2020-07-21 20:25   ` Al Viro
  2020-07-21 20:25     ` Al Viro
  2020-07-21 20:25   ` [PATCH 09/18] sh: propagate the calling conventions change down to csum_partial_copy_generic() Al Viro
                     ` (9 subsequent siblings)
  16 siblings, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-21 20:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/m68k/lib/checksum.c | 79 +++++++++---------------------------------------
 1 file changed, 15 insertions(+), 64 deletions(-)

diff --git a/arch/m68k/lib/checksum.c b/arch/m68k/lib/checksum.c
index 3aeca261f622..7e6afeae6217 100644
--- a/arch/m68k/lib/checksum.c
+++ b/arch/m68k/lib/checksum.c
@@ -236,82 +236,33 @@ csum_and_copy_from_user(const void __user *src, void *dst, int len)
 		"clrl %5\n\t"
 		"addxl %5,%0\n\t"	/* add X bit */
 	     "7:\t"
-		"clrl %5\n"		/* no error - clear return value */
-	     "8:\n"
 		".section .fixup,\"ax\"\n"
 		".even\n"
-		/* If any exception occurs zero out the rest.
-		   Similarities with the code above are intentional :-) */
+		/* If any exception occurs, return 0 */
 	     "90:\t"
-		"clrw %3@+\n\t"
-		"movel %1,%4\n\t"
-		"lsrl #5,%1\n\t"
-		"jeq 1f\n\t"
-		"subql #1,%1\n"
-	     "91:\t"
-		"clrl %3@+\n"
-	     "92:\t"
-		"clrl %3@+\n"
-	     "93:\t"
-		"clrl %3@+\n"
-	     "94:\t"
-		"clrl %3@+\n"
-	     "95:\t"
-		"clrl %3@+\n"
-	     "96:\t"
-		"clrl %3@+\n"
-	     "97:\t"
-		"clrl %3@+\n"
-	     "98:\t"
-		"clrl %3@+\n\t"
-		"dbra %1,91b\n\t"
-		"clrw %1\n\t"
-		"subql #1,%1\n\t"
-		"jcc 91b\n"
-	     "1:\t"
-		"movel %4,%1\n\t"
-		"andw #0x1c,%4\n\t"
-		"jeq 1f\n\t"
-		"lsrw #2,%4\n\t"
-		"subqw #1,%4\n"
-	     "99:\t"
-		"clrl %3@+\n\t"
-		"dbra %4,99b\n\t"
-	     "1:\t"
-		"andw #3,%1\n\t"
-		"jeq 9f\n"
-	     "100:\t"
-		"clrw %3@+\n\t"
-		"tstw %1\n\t"
-		"jeq 9f\n"
-	     "101:\t"
-		"clrb %3@+\n"
-	     "9:\t"
-#define STR(X) STR1(X)
-#define STR1(X) #X
-		"moveq #-" STR(EFAULT) ",%5\n\t"
-		"jra 8b\n"
+		"clrl %0\n"
+		"jra 7b\n"
 		".previous\n"
 		".section __ex_table,\"a\"\n"
 		".long 10b,90b\n"
-		".long 11b,91b\n"
-		".long 12b,92b\n"
-		".long 13b,93b\n"
-		".long 14b,94b\n"
-		".long 15b,95b\n"
-		".long 16b,96b\n"
-		".long 17b,97b\n"
-		".long 18b,98b\n"
-		".long 19b,99b\n"
-		".long 20b,100b\n"
-		".long 21b,101b\n"
+		".long 11b,90b\n"
+		".long 12b,90b\n"
+		".long 13b,90b\n"
+		".long 14b,90b\n"
+		".long 15b,90b\n"
+		".long 16b,90b\n"
+		".long 17b,90b\n"
+		".long 18b,90b\n"
+		".long 19b,90b\n"
+		".long 20b,90b\n"
+		".long 21b,90b\n"
 		".previous"
 		: "=d" (sum), "=d" (len), "=a" (src), "=a" (dst),
 		  "=&d" (tmp1), "=d" (tmp2)
 		: "0" (sum), "1" (len), "2" (src), "3" (dst)
 	    );
 
-	return tmp2 ? 0 : sum;
+	return sum;
 }
 
 EXPORT_SYMBOL(csum_and_copy_from_user);
-- 
2.11.0

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH 09/18] sh: propagate the calling conventions change down to csum_partial_copy_generic()
  2020-07-21 20:25 ` [PATCH 01/18] skb_copy_and_csum_bits(): don't bother with the last argument Al Viro
                     ` (6 preceding siblings ...)
  2020-07-21 20:25   ` [PATCH 08/18] m68k: get rid of zeroing destination on error in csum_and_copy_from_user() Al Viro
@ 2020-07-21 20:25   ` Al Viro
  2020-07-21 20:25   ` [PATCH 10/18] i386: propagate " Al Viro
                     ` (8 subsequent siblings)
  16 siblings, 0 replies; 102+ messages in thread
From: Al Viro @ 2020-07-21 20:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

... and get rid of zeroing destination on error there.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/sh/include/asm/checksum_32.h |  20 ++-----
 arch/sh/lib/checksum.S            | 119 +++++++++++---------------------------
 2 files changed, 39 insertions(+), 100 deletions(-)
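
A minimal userspace model of the convention this patch moves sh to, for
illustration only.  It assumes a byte-at-a-time copy and a simulated fault
offset; model_csum_copy(), add32_carry() and NO_FAULT are hypothetical names
and not kernel APIs.  The sum is seeded with ~0U and the single fault path
returns 0 without zeroing the destination.  Addition with end-around carry
cannot turn a non-zero accumulator into zero, so 0 is free to mean "fault".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit addition with end-around carry (ones-complement style). */
static uint32_t add32_carry(uint32_t a, uint32_t b)
{
	uint64_t s = (uint64_t)a + b;

	return (uint32_t)s + (uint32_t)(s >> 32);
}

#define NO_FAULT ((size_t)-1)	/* hypothetical: "never fault" marker */

/*
 * Hypothetical model: copy len bytes from src to dst while summing them as
 * little-endian 16-bit words.  A fault is simulated at offset fault_at.
 * Returns 0 on fault; the destination is left as it was at that point.
 */
static uint32_t model_csum_copy(const uint8_t *src, uint8_t *dst,
				size_t len, size_t fault_at)
{
	uint32_t sum = ~0U;	/* same seed as "mov #-1,r7" in the patch */
	size_t i;

	for (i = 0; i < len; i++) {
		if (i == fault_at)
			return 0;	/* the one merged fixup path */
		dst[i] = src[i];
		/* even offsets are the low byte of a word, odd the high byte */
		sum = add32_carry(sum, (i & 1) ? (uint32_t)src[i] << 8 : src[i]);
	}
	return sum;
}
```

With this model a caller only needs to compare the return value against 0;
even an all-zero buffer sums to 0xffffffff rather than 0.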

diff --git a/arch/sh/include/asm/checksum_32.h b/arch/sh/include/asm/checksum_32.h
index 97950bdf62e5..07e7f6b1ef92 100644
--- a/arch/sh/include/asm/checksum_32.h
+++ b/arch/sh/include/asm/checksum_32.h
@@ -30,9 +30,7 @@ asmlinkage __wsum csum_partial(const void *buff, int len, __wsum sum);
  * better 64-bit) boundary
  */
 
-asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst,
-					    int len, __wsum sum,
-					    int *src_err_ptr, int *dst_err_ptr);
+asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst, int len);
 
 /*
  *	Note: when you get a NULL pointer exception here this means someone
@@ -44,21 +42,16 @@ asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst,
 static inline
 __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	return csum_partial_copy_generic(src, dst, len, 0, NULL, NULL);
+	return csum_partial_copy_generic(src, dst, len);
 }
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 static inline
 __wsum csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
-	int err = 0;
-	__wsum sum = ~0U;
-
 	if (!access_ok(src, len))
 		return 0;
-	sum = csum_partial_copy_generic((__force const void *)src, dst,
-					len, sum, &err, NULL);
-	return err ? 0 : sum;
+	return csum_partial_copy_generic((__force const void *)src, dst, len);
 }
 
 /*
@@ -201,13 +194,8 @@ static inline __wsum csum_and_copy_to_user(const void *src,
 					   void __user *dst,
 					   int len)
 {
-	int err = 0;
-	__wsum sum = ~0U;
-
 	if (!access_ok(dst, len))
 		return 0;
-	sum = csum_partial_copy_generic((__force const void *)src,
-						dst, len, sum, NULL, &err);
-	return err ? 0 : sum;
+	return csum_partial_copy_generic((__force const void *)src, dst, len);
 }
 #endif /* __ASM_SH_CHECKSUM_H */
diff --git a/arch/sh/lib/checksum.S b/arch/sh/lib/checksum.S
index 97b5c2d9fec4..3e07074e0098 100644
--- a/arch/sh/lib/checksum.S
+++ b/arch/sh/lib/checksum.S
@@ -173,47 +173,27 @@ ENTRY(csum_partial)
 	 mov	r6, r0
 
 /*
-unsigned int csum_partial_copy_generic (const char *src, char *dst, int len, 
-					int sum, int *src_err_ptr, int *dst_err_ptr)
+unsigned int csum_partial_copy_generic (const char *src, char *dst, int len)
  */ 
 
 /*
- * Copy from ds while checksumming, otherwise like csum_partial
- *
- * The macros SRC and DST specify the type of access for the instruction.
- * thus we can call a custom exception handler for all access types.
- *
- * FIXME: could someone double-check whether I haven't mixed up some SRC and
- *	  DST definitions? It's damn hard to trigger all cases.  I hope I got
- *	  them all but there's no guarantee.
+ * Copy from ds while checksumming, otherwise like csum_partial with initial
+ * sum being ~0U
  */
 
-#define SRC(...)			\
+#define EXC(...)			\
 	9999: __VA_ARGS__ ;		\
 	.section __ex_table, "a";	\
 	.long 9999b, 6001f	;	\
 	.previous
 
-#define DST(...)			\
-	9999: __VA_ARGS__ ;		\
-	.section __ex_table, "a";	\
-	.long 9999b, 6002f	;	\
-	.previous
-
 !
 ! r4:	const char *SRC
 ! r5:	char *DST
 ! r6:	int LEN
-! r7:	int SUM
-!
-! on stack:
-! int *SRC_ERR_PTR
-! int *DST_ERR_PTR
 !
 ENTRY(csum_partial_copy_generic)
-	mov.l	r5,@-r15
-	mov.l	r6,@-r15
-
+	mov	#-1,r7
 	mov	#3,r0		! Check src and dest are equally aligned
 	mov	r4,r1
 	and	r0,r1
@@ -243,11 +223,11 @@ ENTRY(csum_partial_copy_generic)
 	clrt
 	.align	2
 5:
-SRC(	mov.b	@r4+,r1 	)
-SRC(	mov.b	@r4+,r0		)
+EXC(	mov.b	@r4+,r1 	)
+EXC(	mov.b	@r4+,r0		)
 	extu.b	r1,r1
-DST(	mov.b	r1,@r5		)
-DST(	mov.b	r0,@(1,r5)	)
+EXC(	mov.b	r1,@r5		)
+EXC(	mov.b	r0,@(1,r5)	)
 	extu.b	r0,r0
 	add	#2,r5
 
@@ -276,8 +256,8 @@ DST(	mov.b	r0,@(1,r5)	)
 	! Handle first two bytes as a special case
 	.align	2
 1:	
-SRC(	mov.w	@r4+,r0		)
-DST(	mov.w	r0,@r5		)
+EXC(	mov.w	@r4+,r0		)
+EXC(	mov.w	r0,@r5		)
 	add	#2,r5
 	extu.w	r0,r0
 	addc	r0,r7
@@ -292,32 +272,32 @@ DST(	mov.w	r0,@r5		)
 	 clrt
 	.align	2
 1:	
-SRC(	mov.l	@r4+,r0		)
-SRC(	mov.l	@r4+,r1		)
+EXC(	mov.l	@r4+,r0		)
+EXC(	mov.l	@r4+,r1		)
 	addc	r0,r7
-DST(	mov.l	r0,@r5		)
-DST(	mov.l	r1,@(4,r5)	)
+EXC(	mov.l	r0,@r5		)
+EXC(	mov.l	r1,@(4,r5)	)
 	addc	r1,r7
 
-SRC(	mov.l	@r4+,r0		)
-SRC(	mov.l	@r4+,r1		)
+EXC(	mov.l	@r4+,r0		)
+EXC(	mov.l	@r4+,r1		)
 	addc	r0,r7
-DST(	mov.l	r0,@(8,r5)	)
-DST(	mov.l	r1,@(12,r5)	)
+EXC(	mov.l	r0,@(8,r5)	)
+EXC(	mov.l	r1,@(12,r5)	)
 	addc	r1,r7
 
-SRC(	mov.l	@r4+,r0 	)
-SRC(	mov.l	@r4+,r1		)
+EXC(	mov.l	@r4+,r0 	)
+EXC(	mov.l	@r4+,r1		)
 	addc	r0,r7
-DST(	mov.l	r0,@(16,r5)	)
-DST(	mov.l	r1,@(20,r5)	)
+EXC(	mov.l	r0,@(16,r5)	)
+EXC(	mov.l	r1,@(20,r5)	)
 	addc	r1,r7
 
-SRC(	mov.l	@r4+,r0		)
-SRC(	mov.l	@r4+,r1		)
+EXC(	mov.l	@r4+,r0		)
+EXC(	mov.l	@r4+,r1		)
 	addc	r0,r7
-DST(	mov.l	r0,@(24,r5)	)
-DST(	mov.l	r1,@(28,r5)	)
+EXC(	mov.l	r0,@(24,r5)	)
+EXC(	mov.l	r1,@(28,r5)	)
 	addc	r1,r7
 	add	#32,r5
 	movt	r0
@@ -335,9 +315,9 @@ DST(	mov.l	r1,@(28,r5)	)
 	 clrt
 	shlr2	r6
 3:	
-SRC(	mov.l	@r4+,r0	)
+EXC(	mov.l	@r4+,r0	)
 	addc	r0,r7
-DST(	mov.l	r0,@r5	)
+EXC(	mov.l	r0,@r5	)
 	add	#4,r5
 	movt	r0
 	dt	r6
@@ -353,8 +333,8 @@ DST(	mov.l	r0,@r5	)
 	mov	#2,r1
 	cmp/hs	r1,r6
 	bf	5f
-SRC(	mov.w	@r4+,r0	)
-DST(	mov.w	r0,@r5	)
+EXC(	mov.w	@r4+,r0	)
+EXC(	mov.w	r0,@r5	)
 	extu.w	r0,r0
 	add	#2,r5
 	cmp/eq	r1,r6
@@ -363,8 +343,8 @@ DST(	mov.w	r0,@r5	)
 	shll16	r0
 	addc	r0,r7
 5:	
-SRC(	mov.b	@r4+,r0	)
-DST(	mov.b	r0,@r5	)
+EXC(	mov.b	@r4+,r0	)
+EXC(	mov.b	r0,@r5	)
 	extu.b	r0,r0
 #ifndef	__LITTLE_ENDIAN__
 	shll8	r0
@@ -373,42 +353,13 @@ DST(	mov.b	r0,@r5	)
 	mov	#0,r0
 	addc	r0,r7
 7:
-5000:
 
 # Exception handler:
 .section .fixup, "ax"							
 
 6001:
-	mov.l	@(8,r15),r0			! src_err_ptr
-	mov	#-EFAULT,r1
-	mov.l	r1,@r0
-
-	! zero the complete destination - computing the rest
-	! is too much work 
-	mov.l	@(4,r15),r5		! dst
-	mov.l	@r15,r6			! len
-	mov	#0,r7
-1:	mov.b	r7,@r5
-	dt	r6
-	bf/s	1b
-	 add	#1,r5
-	mov.l	8000f,r0
-	jmp	@r0
-	 nop
-	.align	2
-8000:	.long	5000b
-
-6002:
-	mov.l	@(12,r15),r0			! dst_err_ptr
-	mov	#-EFAULT,r1
-	mov.l	r1,@r0
-	mov.l	8001f,r0
-	jmp	@r0
-	 nop
-	.align	2
-8001:	.long	5000b
-
+	rts
+	 mov	#0,r0
 .previous
-	add	#8,r15
 	rts
 	 mov	r7,r0
-- 
2.11.0

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH 10/18] i386: propagate the calling conventions change down to csum_partial_copy_generic()
  2020-07-21 20:25 ` [PATCH 01/18] skb_copy_and_csum_bits(): don't bother with the last argument Al Viro
                     ` (7 preceding siblings ...)
  2020-07-21 20:25   ` [PATCH 09/18] sh: propagate the calling conventions change down to csum_partial_copy_generic() Al Viro
@ 2020-07-21 20:25   ` Al Viro
  2020-07-21 20:25     ` Al Viro
  2020-07-21 20:25   ` [PATCH 11/18] sparc32: propagate the calling conventions change down to __csum_partial_copy_sparc_generic() Al Viro
                     ` (7 subsequent siblings)
  16 siblings, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-21 20:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

... and don't bother zeroing destination on error

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/x86/include/asm/checksum_32.h |  18 ++----
 arch/x86/lib/checksum_32.S         | 117 +++++++++++++------------------------
 2 files changed, 47 insertions(+), 88 deletions(-)
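
A small userspace sketch of why hardwiring the seed to -1 ("movl $-1, %eax")
does not change the checksum that callers see.  It assumes the usual rule
that __wsum values only matter modulo 65535; model_csum(), add32_carry() and
same_csum() are hypothetical helpers, not kernel functions.  Since
0xffffffff == 65535 * 65537, a seed of ~0U is congruent to a seed of 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit addition with end-around carry (ones-complement style). */
static uint32_t add32_carry(uint32_t a, uint32_t b)
{
	uint64_t s = (uint64_t)a + b;

	return (uint32_t)s + (uint32_t)(s >> 32);
}

/* Hypothetical reference sum over little-endian 16-bit words. */
static uint32_t model_csum(const uint8_t *buf, size_t len, uint32_t seed)
{
	uint32_t sum = seed;
	size_t i;

	for (i = 0; i < len; i++)
		sum = add32_carry(sum, (i & 1) ? (uint32_t)buf[i] << 8 : buf[i]);
	return sum;
}

/* Two sums are interchangeable when they agree modulo 65535. */
static int same_csum(uint32_t a, uint32_t b)
{
	return a % 65535 == b % 65535;
}
```

The raw 32-bit values differ between the two seeds, but they fold to the
same 16-bit checksum, and only the ~0U seed guarantees a non-zero result.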

diff --git a/arch/x86/include/asm/checksum_32.h b/arch/x86/include/asm/checksum_32.h
index 5948cde9e4ad..17da95387997 100644
--- a/arch/x86/include/asm/checksum_32.h
+++ b/arch/x86/include/asm/checksum_32.h
@@ -27,9 +27,7 @@ asmlinkage __wsum csum_partial(const void *buff, int len, __wsum sum);
  * better 64-bit) boundary
  */
 
-asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst,
-					    int len, __wsum sum,
-					    int *src_err_ptr, int *dst_err_ptr);
+asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst, int len);
 
 /*
  *	Note: when you get a NULL pointer exception here this means someone
@@ -40,23 +38,21 @@ asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst,
  */
 static inline __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	return csum_partial_copy_generic(src, dst, len, 0, NULL, NULL);
+	return csum_partial_copy_generic(src, dst, len);
 }
 
 static inline __wsum csum_and_copy_from_user(const void __user *src,
 					     void *dst, int len)
 {
 	__wsum ret;
-	int err = 0;
 
 	might_sleep();
 	if (!user_access_begin(src, len))
 		return 0;
-	ret = csum_partial_copy_generic((__force void *)src, dst,
-					len, ~0U, &err, NULL);
+	ret = csum_partial_copy_generic((__force void *)src, dst, len);
 	user_access_end();
 
-	return err ? 0 : ret;
+	return ret;
 }
 
 /*
@@ -177,16 +173,14 @@ static inline __wsum csum_and_copy_to_user(const void *src,
 					   int len)
 {
 	__wsum ret;
-	int err = 0;
 
 	might_sleep();
 	if (!user_access_begin(dst, len))
 		return 0;
 
-	ret = csum_partial_copy_generic(src, (__force void *)dst,
-					len, ~0U, NULL, &err);
+	ret = csum_partial_copy_generic(src, (__force void *)dst, len);
 	user_access_end();
-	return err ? 0 : ret;
+	return ret;
 }
 
 #endif /* _ASM_X86_CHECKSUM_32_H */
diff --git a/arch/x86/lib/checksum_32.S b/arch/x86/lib/checksum_32.S
index d1d768912368..4304320e51f4 100644
--- a/arch/x86/lib/checksum_32.S
+++ b/arch/x86/lib/checksum_32.S
@@ -253,28 +253,17 @@ EXPORT_SYMBOL(csum_partial)
 
 /*
 unsigned int csum_partial_copy_generic (const char *src, char *dst,
-				  int len, int sum, int *src_err_ptr, int *dst_err_ptr)
+				  int len)
  */ 
 
 /*
  * Copy from ds while checksumming, otherwise like csum_partial
- *
- * The macros SRC and DST specify the type of access for the instruction.
- * thus we can call a custom exception handler for all access types.
- *
- * FIXME: could someone double-check whether I haven't mixed up some SRC and
- *	  DST definitions? It's damn hard to trigger all cases.  I hope I got
- *	  them all but there's no guarantee.
  */
 
-#define SRC(y...)			\
+#define EXC(y...)			\
 	9999: y;			\
 	_ASM_EXTABLE_UA(9999b, 6001f)
 
-#define DST(y...)			\
-	9999: y;			\
-	_ASM_EXTABLE_UA(9999b, 6002f)
-
 #ifndef CONFIG_X86_USE_PPRO_CHECKSUM
 
 #define ARGBASE 16		
@@ -285,20 +274,20 @@ SYM_FUNC_START(csum_partial_copy_generic)
 	pushl %edi
 	pushl %esi
 	pushl %ebx
-	movl ARGBASE+16(%esp),%eax	# sum
 	movl ARGBASE+12(%esp),%ecx	# len
 	movl ARGBASE+4(%esp),%esi	# src
 	movl ARGBASE+8(%esp),%edi	# dst
 
+	movl $-1, %eax			# sum
 	testl $2, %edi			# Check alignment. 
 	jz 2f				# Jump if alignment is ok.
 	subl $2, %ecx			# Alignment uses up two bytes.
 	jae 1f				# Jump if we had at least two bytes.
 	addl $2, %ecx			# ecx was < 2.  Deal with it.
 	jmp 4f
-SRC(1:	movw (%esi), %bx	)
+EXC(1:	movw (%esi), %bx	)
 	addl $2, %esi
-DST(	movw %bx, (%edi)	)
+EXC(	movw %bx, (%edi)	)
 	addl $2, %edi
 	addw %bx, %ax	
 	adcl $0, %eax
@@ -306,34 +295,34 @@ DST(	movw %bx, (%edi)	)
 	movl %ecx, FP(%esp)
 	shrl $5, %ecx
 	jz 2f
-	testl %esi, %esi
-SRC(1:	movl (%esi), %ebx	)
-SRC(	movl 4(%esi), %edx	)
+	testl %esi, %esi		# what's wrong with clc?
+EXC(1:	movl (%esi), %ebx	)
+EXC(	movl 4(%esi), %edx	)
 	adcl %ebx, %eax
-DST(	movl %ebx, (%edi)	)
+EXC(	movl %ebx, (%edi)	)
 	adcl %edx, %eax
-DST(	movl %edx, 4(%edi)	)
+EXC(	movl %edx, 4(%edi)	)
 
-SRC(	movl 8(%esi), %ebx	)
-SRC(	movl 12(%esi), %edx	)
+EXC(	movl 8(%esi), %ebx	)
+EXC(	movl 12(%esi), %edx	)
 	adcl %ebx, %eax
-DST(	movl %ebx, 8(%edi)	)
+EXC(	movl %ebx, 8(%edi)	)
 	adcl %edx, %eax
-DST(	movl %edx, 12(%edi)	)
+EXC(	movl %edx, 12(%edi)	)
 
-SRC(	movl 16(%esi), %ebx 	)
-SRC(	movl 20(%esi), %edx	)
+EXC(	movl 16(%esi), %ebx 	)
+EXC(	movl 20(%esi), %edx	)
 	adcl %ebx, %eax
-DST(	movl %ebx, 16(%edi)	)
+EXC(	movl %ebx, 16(%edi)	)
 	adcl %edx, %eax
-DST(	movl %edx, 20(%edi)	)
+EXC(	movl %edx, 20(%edi)	)
 
-SRC(	movl 24(%esi), %ebx	)
-SRC(	movl 28(%esi), %edx	)
+EXC(	movl 24(%esi), %ebx	)
+EXC(	movl 28(%esi), %edx	)
 	adcl %ebx, %eax
-DST(	movl %ebx, 24(%edi)	)
+EXC(	movl %ebx, 24(%edi)	)
 	adcl %edx, %eax
-DST(	movl %edx, 28(%edi)	)
+EXC(	movl %edx, 28(%edi)	)
 
 	lea 32(%esi), %esi
 	lea 32(%edi), %edi
@@ -345,9 +334,9 @@ DST(	movl %edx, 28(%edi)	)
 	andl $0x1c, %edx
 	je 4f
 	shrl $2, %edx			# This clears CF
-SRC(3:	movl (%esi), %ebx	)
+EXC(3:	movl (%esi), %ebx	)
 	adcl %ebx, %eax
-DST(	movl %ebx, (%edi)	)
+EXC(	movl %ebx, (%edi)	)
 	lea 4(%esi), %esi
 	lea 4(%edi), %edi
 	dec %edx
@@ -357,39 +346,24 @@ DST(	movl %ebx, (%edi)	)
 	jz 7f
 	cmpl $2, %ecx
 	jb 5f
-SRC(	movw (%esi), %cx	)
+EXC(	movw (%esi), %cx	)
 	leal 2(%esi), %esi
-DST(	movw %cx, (%edi)	)
+EXC(	movw %cx, (%edi)	)
 	leal 2(%edi), %edi
 	je 6f
 	shll $16,%ecx
-SRC(5:	movb (%esi), %cl	)
-DST(	movb %cl, (%edi)	)
+EXC(5:	movb (%esi), %cl	)
+EXC(	movb %cl, (%edi)	)
 6:	addl %ecx, %eax
 	adcl $0, %eax
 7:
-5000:
 
 # Exception handler:
 .section .fixup, "ax"							
 
 6001:
-	movl ARGBASE+20(%esp), %ebx	# src_err_ptr
-	movl $-EFAULT, (%ebx)
-
-	# zero the complete destination - computing the rest
-	# is too much work 
-	movl ARGBASE+8(%esp), %edi	# dst
-	movl ARGBASE+12(%esp), %ecx	# len
-	xorl %eax,%eax
-	rep ; stosb
-
-	jmp 5000b
-
-6002:
-	movl ARGBASE+24(%esp), %ebx	# dst_err_ptr
-	movl $-EFAULT,(%ebx)
-	jmp 5000b
+	xorl %eax, %eax
+	jmp 7b
 
 .previous
 
@@ -405,14 +379,14 @@ SYM_FUNC_END(csum_partial_copy_generic)
 /* Version for PentiumII/PPro */
 
 #define ROUND1(x) \
-	SRC(movl x(%esi), %ebx	)	;	\
+	EXC(movl x(%esi), %ebx	)	;	\
 	addl %ebx, %eax			;	\
-	DST(movl %ebx, x(%edi)	)	; 
+	EXC(movl %ebx, x(%edi)	)	;
 
 #define ROUND(x) \
-	SRC(movl x(%esi), %ebx	)	;	\
+	EXC(movl x(%esi), %ebx	)	;	\
 	adcl %ebx, %eax			;	\
-	DST(movl %ebx, x(%edi)	)	;
+	EXC(movl %ebx, x(%edi)	)	;
 
 #define ARGBASE 12
 		
@@ -423,7 +397,7 @@ SYM_FUNC_START(csum_partial_copy_generic)
 	movl ARGBASE+4(%esp),%esi	#src
 	movl ARGBASE+8(%esp),%edi	#dst	
 	movl ARGBASE+12(%esp),%ecx	#len
-	movl ARGBASE+16(%esp),%eax	#sum
+	movl $-1, %eax			#sum
 #	movl %ecx, %edx  
 	movl %ecx, %ebx  
 	movl %esi, %edx
@@ -439,7 +413,7 @@ SYM_FUNC_START(csum_partial_copy_generic)
 	JMP_NOSPEC ebx
 1:	addl $64,%esi
 	addl $64,%edi 
-	SRC(movb -32(%edx),%bl)	; SRC(movb (%edx),%bl)
+	EXC(movb -32(%edx),%bl)	; EXC(movb (%edx),%bl)
 	ROUND1(-64) ROUND(-60) ROUND(-56) ROUND(-52)	
 	ROUND (-48) ROUND(-44) ROUND(-40) ROUND(-36)	
 	ROUND (-32) ROUND(-28) ROUND(-24) ROUND(-20)	
@@ -453,29 +427,20 @@ SYM_FUNC_START(csum_partial_copy_generic)
 	jz 7f
 	cmpl $2, %edx
 	jb 5f
-SRC(	movw (%esi), %dx         )
+EXC(	movw (%esi), %dx         )
 	leal 2(%esi), %esi
-DST(	movw %dx, (%edi)         )
+EXC(	movw %dx, (%edi)         )
 	leal 2(%edi), %edi
 	je 6f
 	shll $16,%edx
 5:
-SRC(	movb (%esi), %dl         )
-DST(	movb %dl, (%edi)         )
+EXC(	movb (%esi), %dl         )
+EXC(	movb %dl, (%edi)         )
 6:	addl %edx, %eax
 	adcl $0, %eax
 7:
 .section .fixup, "ax"
-6001:	movl	ARGBASE+20(%esp), %ebx	# src_err_ptr	
-	movl $-EFAULT, (%ebx)
-	# zero the complete destination (computing the rest is too much work)
-	movl ARGBASE+8(%esp),%edi	# dst
-	movl ARGBASE+12(%esp),%ecx	# len
-	xorl %eax,%eax
-	rep; stosb
-	jmp 7b
-6002:	movl ARGBASE+24(%esp), %ebx	# dst_err_ptr
-	movl $-EFAULT, (%ebx)
+6001:	xorl %eax, %eax
 	jmp  7b			
 .previous				
 
-- 
2.11.0

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH 11/18] sparc32: propagate the calling conventions change down to __csum_partial_copy_sparc_generic()
  2020-07-21 20:25 ` [PATCH 01/18] skb_copy_and_csum_bits(): don't bother with the last argument Al Viro
                     ` (8 preceding siblings ...)
  2020-07-21 20:25   ` [PATCH 10/18] i386: propagate " Al Viro
@ 2020-07-21 20:25   ` Al Viro
  2020-07-21 20:25     ` Al Viro
  2020-07-22  1:20     ` David Miller
  2020-07-21 20:25   ` [PATCH 12/18] mips: csum_and_copy_{to,from}_user() are never called under KERNEL_DS Al Viro
                     ` (6 subsequent siblings)
  16 siblings, 2 replies; 102+ messages in thread
From: Al Viro @ 2020-07-21 20:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

... and get rid of zeroing the target, etc. on fault.
All exception handlers merge into one; moreover, since we are not
calling lookup_fault() anymore, we don't need the magic with passing
arguments for it from the page fault handler.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/sparc/include/asm/checksum_32.h |  49 +--------
 arch/sparc/lib/checksum_32.S         | 202 +++++++----------------------------
 arch/sparc/mm/fault_32.c             |   6 +-
 3 files changed, 44 insertions(+), 213 deletions(-)
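
A simplified userspace model of how the two kinds of __ex_table entries
emitted by EX() and EXT() can be resolved once every entry points at a single
handler.  This is a sketch under assumptions: struct ex_entry and
model_search_extable() are hypothetical, and the layout (a range written as
{start, 0} followed by {end, handler}) is inferred from the
".word start, 0, end, cc_fault" line in the patch.  It is not the kernel's
actual lookup code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ex_entry {
	uintptr_t insn;		/* faulting address, or start/end of a range */
	uintptr_t fixup;	/* handler address; 0 marks the start of a range */
};

/* Returns the fixup address for pc, or 0 if the table has no match. */
static uintptr_t model_search_extable(const struct ex_entry *tab, size_t n,
				      uintptr_t pc)
{
	size_t i;

	/* Pass 1: exact matches, skipping both halves of each range. */
	for (i = 0; i < n; i++) {
		if (tab[i].fixup == 0) {
			i++;
			continue;
		}
		if (tab[i].insn == pc)
			return tab[i].fixup;
	}

	/* Pass 2: ranges, stored as {start, 0} then {end, handler}. */
	for (i = 0; i + 1 < n; i++) {
		if (tab[i].fixup != 0)
			continue;
		if (tab[i].insn <= pc && pc < tab[i + 1].insn)
			return tab[i + 1].fixup;
		i++;
	}
	return 0;
}
```

In this model every hit yields the same handler address, which mirrors the
commit message: there is no per-site information left for the fault handler
to pass along.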

diff --git a/arch/sparc/include/asm/checksum_32.h b/arch/sparc/include/asm/checksum_32.h
index b5873b7b7bf0..d55e480172a6 100644
--- a/arch/sparc/include/asm/checksum_32.h
+++ b/arch/sparc/include/asm/checksum_32.h
@@ -50,9 +50,9 @@ csum_partial_copy_nocheck(const void *src, void *dst, int len)
 
 	__asm__ __volatile__ (
 		"call __csum_partial_copy_sparc_generic\n\t"
-		" mov %6, %%g7\n"
+		" mov -1, %%g7\n"
 	: "=&r" (ret), "=&r" (d), "=&r" (l)
-	: "0" (ret), "1" (d), "2" (l), "r" (0)
+	: "0" (ret), "1" (d), "2" (l)
 	: "o2", "o3", "o4", "o5", "o7",
 	  "g2", "g3", "g4", "g5", "g7",
 	  "memory", "cc");
@@ -61,29 +61,10 @@ csum_partial_copy_nocheck(const void *src, void *dst, int len)
 
 static inline __wsum
 csum_and_copy_from_user(const void __user *src, void *dst, int len)
-  {
-	register unsigned long ret asm("o0") = (unsigned long)src;
-	register char *d asm("o1") = dst;
-	register int l asm("g1") = len;
-	register __wsum s asm("g7") = ~0U;
-	int err = 0;
-
+{
 	if (unlikely(!access_ok(src, len)))
 		return 0;
-
-	__asm__ __volatile__ (
-	".section __ex_table,#alloc\n\t"
-	".align 4\n\t"
-	".word 1f,2\n\t"
-	".previous\n"
-	"1:\n\t"
-	"call __csum_partial_copy_sparc_generic\n\t"
-	" st %8, [%%sp + 64]\n"
-	: "=&r" (ret), "=&r" (d), "=&r" (l), "=&r" (s)
-	: "0" (ret), "1" (d), "2" (l), "3" (s), "r" (&err)
-	: "o2", "o3", "o4", "o5", "o7", "g2", "g3", "g4", "g5",
-	  "cc", "memory");
-	return err ? 0 : (__force __wsum)ret;
+	return csum_partial_copy_nocheck((__force void *)src, dst, len);
 }
 
 #define HAVE_CSUM_COPY_USER
@@ -91,29 +72,9 @@ csum_and_copy_from_user(const void __user *src, void *dst, int len)
 static inline __wsum
 csum_and_copy_to_user(const void *src, void __user *dst, int len)
 {
-	register unsigned long ret asm("o0") = (unsigned long)src;
-	register char __user *d asm("o1") = dst;
-	register int l asm("g1") = len;
-	register __wsum s asm("g7") = ~0U;
-	int err = 0;
-
 	if (!access_ok(dst, len))
 		return 0;
-
-	__asm__ __volatile__ (
-	".section __ex_table,#alloc\n\t"
-	".align 4\n\t"
-	".word 1f,1\n\t"
-	".previous\n"
-	"1:\n\t"
-	"call __csum_partial_copy_sparc_generic\n\t"
-	" st %8, [%%sp + 64]\n"
-	: "=&r" (ret), "=&r" (d), "=&r" (l), "=&r" (s)
-	: "0" (ret), "1" (d), "2" (l), "3" (s), "r" (&err)
-	: "o2", "o3", "o4", "o5", "o7",
-	  "g2", "g3", "g4", "g5",
-	  "cc", "memory");
-	return err ? 0 : (__force __wsum)ret;
+	return csum_partial_copy_nocheck(src, (__force void *)dst, len);
 }
 
 /* ihl is always 5 or greater, almost always is 5, and iph is word aligned
diff --git a/arch/sparc/lib/checksum_32.S b/arch/sparc/lib/checksum_32.S
index 6a5469c97246..7488d130faf7 100644
--- a/arch/sparc/lib/checksum_32.S
+++ b/arch/sparc/lib/checksum_32.S
@@ -144,44 +144,21 @@ cpte:	bne	csum_partial_end_cruft			! yep, handle it
 cpout:	retl						! get outta here
 	 mov	%o2, %o0				! return computed csum
 
-	.globl __csum_partial_copy_start, __csum_partial_copy_end
-__csum_partial_copy_start:
-
 /* Work around cpp -rob */
 #define ALLOC #alloc
 #define EXECINSTR #execinstr
-#define EX(x,y,a,b)				\
-98:     x,y;                                    \
-        .section .fixup,ALLOC,EXECINSTR;	\
-        .align  4;                              \
-99:     ba 30f;                                 \
-         a, b, %o3;                             \
-        .section __ex_table,ALLOC;		\
-        .align  4;                              \
-        .word   98b, 99b;                       \
-        .text;                                  \
-        .align  4
-
-#define EX2(x,y)				\
-98:     x,y;                                    \
-        .section __ex_table,ALLOC;		\
-        .align  4;                              \
-        .word   98b, 30f;                       \
-        .text;                                  \
-        .align  4
-
-#define EX3(x,y)				\
+#define EX(x,y)					\
 98:     x,y;                                    \
         .section __ex_table,ALLOC;		\
         .align  4;                              \
-        .word   98b, 96f;                       \
+        .word   98b, cc_fault;                   \
         .text;                                  \
         .align  4
 
-#define EXT(start,end,handler)			\
+#define EXT(start,end)				\
         .section __ex_table,ALLOC;		\
         .align  4;                              \
-        .word   start, 0, end, handler;         \
+        .word   start, 0, end, cc_fault;         \
         .text;                                  \
         .align  4
 
@@ -252,21 +229,21 @@ __csum_partial_copy_start:
 cc_end_cruft:
 	be	1f
 	 andcc	%o3, 4, %g0
-	EX(ldd	[%o0 + 0x00], %g2, and %o3, 0xf)
+	EX(ldd	[%o0 + 0x00], %g2)
 	add	%o1, 8, %o1
 	addcc	%g2, %g7, %g7
 	add	%o0, 8, %o0
 	addxcc	%g3, %g7, %g7
-	EX2(st	%g2, [%o1 - 0x08])
+	EX(st	%g2, [%o1 - 0x08])
 	addx	%g0, %g7, %g7
 	andcc	%o3, 4, %g0
-	EX2(st	%g3, [%o1 - 0x04])
+	EX(st	%g3, [%o1 - 0x04])
 1:	be	1f
 	 andcc	%o3, 3, %o3
-	EX(ld	[%o0 + 0x00], %g2, add %o3, 4)
+	EX(ld	[%o0 + 0x00], %g2)
 	add	%o1, 4, %o1
 	addcc	%g2, %g7, %g7
-	EX2(st	%g2, [%o1 - 0x04])
+	EX(st	%g2, [%o1 - 0x04])
 	addx	%g0, %g7, %g7
 	andcc	%o3, 3, %g0
 	add	%o0, 4, %o0
@@ -276,14 +253,14 @@ cc_end_cruft:
 	 subcc	%o3, 2, %o3
 	b	4f
 	 or	%g0, %g0, %o4
-2:	EX(lduh	[%o0 + 0x00], %o4, add %o3, 2)
+2:	EX(lduh	[%o0 + 0x00], %o4)
 	add	%o0, 2, %o0
-	EX2(sth	%o4, [%o1 + 0x00])
+	EX(sth	%o4, [%o1 + 0x00])
 	be	6f
 	 add	%o1, 2, %o1
 	sll	%o4, 16, %o4
-4:	EX(ldub	[%o0 + 0x00], %o5, add %g0, 1)
-	EX2(stb	%o5, [%o1 + 0x00])
+4:	EX(ldub	[%o0 + 0x00], %o5)
+	EX(stb	%o5, [%o1 + 0x00])
 	sll	%o5, 8, %o5
 	or	%o5, %o4, %o4
 6:	addcc	%o4, %g7, %g7
@@ -306,9 +283,9 @@ cc_dword_align:
 	 andcc	%o0, 0x2, %g0
 	be	1f
 	 andcc	%o0, 0x4, %g0
-	EX(lduh	[%o0 + 0x00], %g4, add %g1, 0)
+	EX(lduh	[%o0 + 0x00], %g4)
 	sub	%g1, 2, %g1
-	EX2(sth	%g4, [%o1 + 0x00])
+	EX(sth	%g4, [%o1 + 0x00])
 	add	%o0, 2, %o0
 	sll	%g4, 16, %g4
 	addcc	%g4, %g7, %g7
@@ -322,9 +299,9 @@ cc_dword_align:
 	or	%g3, %g7, %g7
 1:	be	3f
 	 andcc	%g1, 0xffffff80, %g0
-	EX(ld	[%o0 + 0x00], %g4, add %g1, 0)
+	EX(ld	[%o0 + 0x00], %g4)
 	sub	%g1, 4, %g1
-	EX2(st	%g4, [%o1 + 0x00])
+	EX(st	%g4, [%o1 + 0x00])
 	add	%o0, 4, %o0
 	addcc	%g4, %g7, %g7
 	add	%o1, 4, %o1
@@ -354,7 +331,7 @@ __csum_partial_copy_sparc_generic:
 	CSUMCOPY_BIGCHUNK(%o0,%o1,%g7,0x20,%o4,%o5,%g2,%g3,%g4,%g5,%o2,%o3)
 	CSUMCOPY_BIGCHUNK(%o0,%o1,%g7,0x40,%o4,%o5,%g2,%g3,%g4,%g5,%o2,%o3)
 	CSUMCOPY_BIGCHUNK(%o0,%o1,%g7,0x60,%o4,%o5,%g2,%g3,%g4,%g5,%o2,%o3)
-10:	EXT(5b, 10b, 20f)		! note for exception handling
+10:	EXT(5b, 10b)			! note for exception handling
 	sub	%g1, 128, %g1		! detract from length
 	addx	%g0, %g7, %g7		! add in last carry bit
 	andcc	%g1, 0xffffff80, %g0	! more to csum?
@@ -379,7 +356,7 @@ cctbl:	CSUMCOPY_LASTCHUNK(%o0,%o1,%g7,0x68,%g2,%g3,%g4,%g5)
 	CSUMCOPY_LASTCHUNK(%o0,%o1,%g7,0x28,%g2,%g3,%g4,%g5)
 	CSUMCOPY_LASTCHUNK(%o0,%o1,%g7,0x18,%g2,%g3,%g4,%g5)
 	CSUMCOPY_LASTCHUNK(%o0,%o1,%g7,0x08,%g2,%g3,%g4,%g5)
-12:	EXT(cctbl, 12b, 22f)		! note for exception table handling
+12:	EXT(cctbl, 12b)			! note for exception table handling
 	addx	%g0, %g7, %g7
 	andcc	%o3, 0xf, %g0		! check for low bits set
 ccte:	bne	cc_end_cruft		! something left, handle it out of band
@@ -390,7 +367,7 @@ ccdbl:	CSUMCOPY_BIGCHUNK_ALIGNED(%o0,%o1,%g7,0x00,%o4,%o5,%g2,%g3,%g4,%g5,%o2,%o
 	CSUMCOPY_BIGCHUNK_ALIGNED(%o0,%o1,%g7,0x20,%o4,%o5,%g2,%g3,%g4,%g5,%o2,%o3)
 	CSUMCOPY_BIGCHUNK_ALIGNED(%o0,%o1,%g7,0x40,%o4,%o5,%g2,%g3,%g4,%g5,%o2,%o3)
 	CSUMCOPY_BIGCHUNK_ALIGNED(%o0,%o1,%g7,0x60,%o4,%o5,%g2,%g3,%g4,%g5,%o2,%o3)
-11:	EXT(ccdbl, 11b, 21f)		! note for exception table handling
+11:	EXT(ccdbl, 11b)			! note for exception table handling
 	sub	%g1, 128, %g1		! detract from length
 	addx	%g0, %g7, %g7		! add in last carry bit
 	andcc	%g1, 0xffffff80, %g0	! more to csum?
@@ -407,9 +384,9 @@ ccslow:	cmp	%g1, 0
 	be,a	1f
 	 srl	%g1, 1, %g4		
 	sub	%g1, 1, %g1	
-	EX(ldub	[%o0], %g5, add %g1, 1)
+	EX(ldub	[%o0], %g5)
 	add	%o0, 1, %o0	
-	EX2(stb	%g5, [%o1])
+	EX(stb	%g5, [%o1])
 	srl	%g1, 1, %g4
 	add	%o1, 1, %o1
 1:	cmp	%g4, 0		
@@ -418,34 +395,34 @@ ccslow:	cmp	%g1, 0
 	andcc	%o0, 2, %g0	
 	be,a	1f
 	 srl	%g4, 1, %g4
-	EX(lduh	[%o0], %o4, add %g1, 0)
+	EX(lduh	[%o0], %o4)
 	sub	%g1, 2, %g1	
 	srl	%o4, 8, %g2
 	sub	%g4, 1, %g4	
-	EX2(stb	%g2, [%o1])
+	EX(stb	%g2, [%o1])
 	add	%o4, %g5, %g5
-	EX2(stb	%o4, [%o1 + 1])
+	EX(stb	%o4, [%o1 + 1])
 	add	%o0, 2, %o0	
 	srl	%g4, 1, %g4
 	add	%o1, 2, %o1
 1:	cmp	%g4, 0		
 	be,a	2f
 	 andcc	%g1, 2, %g0
-	EX3(ld	[%o0], %o4)
+	EX(ld	[%o0], %o4)
 5:	srl	%o4, 24, %g2
 	srl	%o4, 16, %g3
-	EX2(stb	%g2, [%o1])
+	EX(stb	%g2, [%o1])
 	srl	%o4, 8, %g2
-	EX2(stb	%g3, [%o1 + 1])
+	EX(stb	%g3, [%o1 + 1])
 	add	%o0, 4, %o0
-	EX2(stb	%g2, [%o1 + 2])
+	EX(stb	%g2, [%o1 + 2])
 	addcc	%o4, %g5, %g5
-	EX2(stb	%o4, [%o1 + 3])
+	EX(stb	%o4, [%o1 + 3])
 	addx	%g5, %g0, %g5	! I am now to lazy to optimize this (question it
 	add	%o1, 4, %o1	! is worthy). Maybe some day - with the sll/srl
 	subcc	%g4, 1, %g4	! tricks
 	bne,a	5b
-	 EX3(ld	[%o0], %o4)
+	 EX(ld	[%o0], %o4)
 	sll	%g5, 16, %g2
 	srl	%g5, 16, %g5
 	srl	%g2, 16, %g2
@@ -453,19 +430,19 @@ ccslow:	cmp	%g1, 0
 	add	%g2, %g5, %g5 
 2:	be,a	3f		
 	 andcc	%g1, 1, %g0
-	EX(lduh	[%o0], %o4, and %g1, 3)
+	EX(lduh	[%o0], %o4)
 	andcc	%g1, 1, %g0
 	srl	%o4, 8, %g2
 	add	%o0, 2, %o0	
-	EX2(stb	%g2, [%o1])
+	EX(stb	%g2, [%o1])
 	add	%g5, %o4, %g5
-	EX2(stb	%o4, [%o1 + 1])
+	EX(stb	%o4, [%o1 + 1])
 	add	%o1, 2, %o1
 3:	be,a	1f		
 	 sll	%g5, 16, %o4
-	EX(ldub	[%o0], %g2, add %g0, 1)
+	EX(ldub	[%o0], %g2)
 	sll	%g2, 8, %o4	
-	EX2(stb	%g2, [%o1])
+	EX(stb	%g2, [%o1])
 	add	%g5, %o4, %g5
 	sll	%g5, 16, %o4
 1:	addcc	%o4, %g5, %g5
@@ -481,113 +458,10 @@ ccslow:	cmp	%g1, 0
 4:	addcc	%g7, %g5, %g7
 	retl	
 	 addx	%g0, %g7, %o0
-__csum_partial_copy_end:
 
 /* We do these strange calculations for the csum_*_from_user case only, ie.
  * we only bother with faults on loads... */
 
-/* o2 = ((g2%20)&3)*8
- * o3 = g1 - (g2/20)*32 - o2 */
-20:
-	cmp	%g2, 20
-	blu,a	1f
-	 and	%g2, 3, %o2
-	sub	%g1, 32, %g1
-	b	20b
-	 sub	%g2, 20, %g2
-1:
-	sll	%o2, 3, %o2
-	b	31f
-	 sub	%g1, %o2, %o3
-
-/* o2 = (!(g2 & 15) ? 0 : (((g2 & 15) + 1) & ~1)*8)
- * o3 = g1 - (g2/16)*32 - o2 */
-21:
-	andcc	%g2, 15, %o3
-	srl	%g2, 4, %g2
-	be,a	1f
-	 clr	%o2
-	add	%o3, 1, %o3
-	and	%o3, 14, %o3
-	sll	%o3, 3, %o2
-1:
-	sll	%g2, 5, %g2
-	sub	%g1, %g2, %o3
-	b	31f
-	 sub	%o3, %o2, %o3
-
-/* o0 += (g2/10)*16 - 0x70
- * 01 += (g2/10)*16 - 0x70
- * o2 = (g2 % 10) ? 8 : 0
- * o3 += 0x70 - (g2/10)*16 - o2 */
-22:
-	cmp	%g2, 10
-	blu,a	1f
-	 sub	%o0, 0x70, %o0
-	add	%o0, 16, %o0
-	add	%o1, 16, %o1
-	sub	%o3, 16, %o3
-	b	22b
-	 sub	%g2, 10, %g2
-1:
-	sub	%o1, 0x70, %o1
-	add	%o3, 0x70, %o3
-	clr	%o2
-	tst	%g2
-	bne,a	1f
-	 mov	8, %o2
-1:
-	b	31f
-	 sub	%o3, %o2, %o3
-96:
-	and	%g1, 3, %g1
-	sll	%g4, 2, %g4
-	add	%g1, %g4, %o3
-30:
-/* %o1 is dst
- * %o3 is # bytes to zero out
- * %o4 is faulting address
- * %o5 is %pc where fault occurred */
-	clr	%o2
-31:
-/* %o0 is src
- * %o1 is dst
- * %o2 is # of bytes to copy from src to dst
- * %o3 is # bytes to zero out
- * %o4 is faulting address
- * %o5 is %pc where fault occurred */
-	save	%sp, -104, %sp
-        mov     %i5, %o0
-        mov     %i7, %o1
-        mov	%i4, %o2
-        call    lookup_fault
-	 mov	%g7, %i4
-	cmp	%o0, 2
-	bne	1f	
-	 add	%g0, -EFAULT, %i5
-	tst	%i2
-	be	2f
-	 mov	%i0, %o1
-	mov	%i1, %o0
-5:
-	call	memcpy
-	 mov	%i2, %o2
-	tst	%o0
-	bne,a	2f
-	 add	%i3, %i2, %i3
-	add	%i1, %i2, %i1
-2:
-	mov	%i1, %o0
-6:
-	call	__bzero
-	 mov	%i3, %o1
-1:
-	ld	[%sp + 168], %o2		! struct_ptr of parent
-	st	%i5, [%o2]
+cc_fault:
 	ret
-	 restore
-
-        .section __ex_table,#alloc
-        .align 4
-        .word 5b,2
-	.word 6b,2
+	 clr	%o0
diff --git a/arch/sparc/mm/fault_32.c b/arch/sparc/mm/fault_32.c
index cfef656eda0f..1185b6169144 100644
--- a/arch/sparc/mm/fault_32.c
+++ b/arch/sparc/mm/fault_32.c
@@ -297,8 +297,6 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write,
 		if (fixup > 10) {
 			extern const unsigned int __memset_start[];
 			extern const unsigned int __memset_end[];
-			extern const unsigned int __csum_partial_copy_start[];
-			extern const unsigned int __csum_partial_copy_end[];
 
 #ifdef DEBUG_EXCEPTIONS
 			printk("Exception: PC<%08lx> faddr<%08lx>\n",
@@ -307,9 +305,7 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write,
 				regs->pc, fixup, g2);
 #endif
 			if ((regs->pc >= (unsigned long)__memset_start &&
-			     regs->pc < (unsigned long)__memset_end) ||
-			    (regs->pc >= (unsigned long)__csum_partial_copy_start &&
-			     regs->pc < (unsigned long)__csum_partial_copy_end)) {
+			     regs->pc < (unsigned long)__memset_end)) {
 				regs->u_regs[UREG_I4] = address;
 				regs->u_regs[UREG_I5] = regs->pc;
 			}
-- 
2.11.0
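
A note on the exception table entries used in the sparc32 patch above.
EX() emits a two-word entry (faulting instruction, handler) and EXT() a
four-word one (start, 0, end, handler); after this patch every handler
word is cc_fault.  The sketch below is a userspace toy model of such a
lookup, not the kernel's search code.  struct ex_slot, find_fixup() and
the half-open [start, end) reading of range entries are assumptions made
for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model, not kernel code: one two-word slot of an __ex_table-like array. */
struct ex_slot {
	uintptr_t insn;		/* address of an instruction that may fault */
	uintptr_t fixup;	/* handler address, or 0 to mark a range entry */
};

/*
 * Assumed encoding, modelled on the EX()/EXT() macros in the patch:
 *   single: { insn, handler }
 *   range:  { start, 0 }, { end, handler }   covers start <= pc < end
 * Returns the handler for pc, or 0 if pc is not covered by any entry.
 */
static uintptr_t find_fixup(const struct ex_slot *tbl, size_t n, uintptr_t pc)
{
	assert(tbl != NULL);

	for (size_t i = 0; i < n; i++) {
		if (tbl[i].fixup == 0) {
			/* range entry: this slot and the next belong together */
			if (i + 1 < n && tbl[i].insn <= pc && pc < tbl[i + 1].insn)
				return tbl[i + 1].fixup;
			i++;	/* skip the second half of the range entry */
			continue;
		}
		if (tbl[i].insn == pc)
			return tbl[i].fixup;
	}
	return 0;
}
```

With every entry pointing at the same handler, the lookup only has to
answer whether the faulting PC is covered at all, which seems to be what
lets the per-site fixup arithmetic go away.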


* [PATCH 12/18] mips: csum_and_copy_{to,from}_user() are never called under KERNEL_DS
  2020-07-21 20:25 ` [PATCH 01/18] skb_copy_and_csum_bits(): don't bother with the last argument Al Viro
                     ` (9 preceding siblings ...)
  2020-07-21 20:25   ` [PATCH 11/18] sparc32: propagate the calling conventions change down to __csum_partial_copy_sparc_generic() Al Viro
@ 2020-07-21 20:25   ` Al Viro
  2020-07-21 20:25   ` [PATCH 13/18] mips: __csum_partial_copy_kernel() has no users left Al Viro
                     ` (5 subsequent siblings)
  16 siblings, 0 replies; 102+ messages in thread
From: Al Viro @ 2020-07-21 20:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

They are only called for an iovec-backed iov_iter, and under KERNEL_DS an
attempt to create such a beast will yield a kvec-backed one.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
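The argument in the commit message can be illustrated with a toy model.
This is a sketch under stated assumptions, not the code in
lib/iov_iter.c: struct iov_iter_model, iov_iter_init_model(),
reaches_user_csum_copy() and the kernel_ds flag (standing in for
uaccess_kernel()) are all hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: the names echo the kernel's, the bodies are simplified. */
enum iter_type { ITER_IOVEC, ITER_KVEC };

struct iov_iter_model {
	enum iter_type type;
};

/* Hypothetical stand-in for uaccess_kernel(), i.e. "running under KERNEL_DS". */
static bool kernel_ds;

/* Under KERNEL_DS, asking for an iovec-backed iterator yields a kvec one. */
static void iov_iter_init_model(struct iov_iter_model *i)
{
	i->type = kernel_ds ? ITER_KVEC : ITER_IOVEC;
}

/* Only iovec-backed iterators go through csum_and_copy_{to,from}_user(). */
static bool reaches_user_csum_copy(const struct iov_iter_model *i)
{
	return i->type == ITER_IOVEC;
}
```

In this model no iterator set up with kernel_ds true ever reaches the
user-copy primitives, which is the property that makes the
uaccess_kernel() branches below dead code.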
 arch/mips/include/asm/checksum.h | 32 +++++++-------------------------
 1 file changed, 7 insertions(+), 25 deletions(-)

diff --git a/arch/mips/include/asm/checksum.h b/arch/mips/include/asm/checksum.h
index 1e5558f90126..7a5c97c5d705 100644
--- a/arch/mips/include/asm/checksum.h
+++ b/arch/mips/include/asm/checksum.h
@@ -41,22 +41,6 @@ __wsum __csum_partial_copy_from_user(const void *src, void *dst,
 				     int len, __wsum sum, int *err_ptr);
 __wsum __csum_partial_copy_to_user(const void *src, void *dst,
 				   int len, __wsum sum, int *err_ptr);
-/*
- * this is a new version of the above that records errors it finds in *errp,
- * but continues and zeros the rest of the buffer.
- */
-static inline
-__wsum csum_partial_copy_from_user(const void __user *src, void *dst, int len,
-				   __wsum sum, int *err_ptr)
-{
-	might_fault();
-	if (uaccess_kernel())
-		return __csum_partial_copy_kernel((__force void *)src, dst,
-						  len, sum, err_ptr);
-	else
-		return __csum_partial_copy_from_user((__force void *)src, dst,
-						     len, sum, err_ptr);
-}
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 static inline
@@ -65,9 +49,12 @@ __wsum csum_and_copy_from_user(const void __user *src, void *dst, int len)
 	__wsum sum = ~0U;
 	int err = 0;
 
+	might_fault();
+
 	if (!access_ok(src, len))
 		return 0;
-	sum = csum_partial_copy_from_user(src, dst, len, sum, &err);
+	sum = __csum_partial_copy_from_user((__force void *)src, dst,
+						     len, sum, &err);
 	return err ? 0 : sum;
 }
 
@@ -84,14 +71,9 @@ __wsum csum_and_copy_to_user(const void *src, void __user *dst, int len)
 	might_fault();
 	if (!access_ok(dst, len))
 		return 0;
-	if (uaccess_kernel())
-		sum = __csum_partial_copy_kernel(src,
-						  (__force void *)dst,
-						  len, sum, &err);
-	else
-		sum = __csum_partial_copy_to_user(src,
-						   (__force void *)dst,
-						   len, sum, &err);
+	sum = __csum_partial_copy_to_user(src,
+					   (__force void *)dst,
+					   len, sum, &err);
 	return err ? 0 : sum;
 }
 
-- 
2.11.0


* [PATCH 13/18] mips: __csum_partial_copy_kernel() has no users left
  2020-07-21 20:25 ` [PATCH 01/18] skb_copy_and_csum_bits(): don't bother with the last argument Al Viro
                     ` (10 preceding siblings ...)
  2020-07-21 20:25   ` [PATCH 12/18] mips: csum_and_copy_{to,from}_user() are never called under KERNEL_DS Al Viro
@ 2020-07-21 20:25   ` Al Viro
  2020-07-21 20:25     ` Al Viro
  2020-07-21 20:25   ` [PATCH 14/18] mips: propagate the calling convention change down into __csum_partial_copy_..._user() Al Viro
                     ` (4 subsequent siblings)
  16 siblings, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-21 20:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/mips/include/asm/checksum.h | 3 ---
 arch/mips/lib/csum_partial.S     | 3 ---
 2 files changed, 6 deletions(-)

diff --git a/arch/mips/include/asm/checksum.h b/arch/mips/include/asm/checksum.h
index 7a5c97c5d705..bf02d2d3869f 100644
--- a/arch/mips/include/asm/checksum.h
+++ b/arch/mips/include/asm/checksum.h
@@ -34,9 +34,6 @@
  */
 __wsum csum_partial(const void *buff, int len, __wsum sum);
 
-__wsum __csum_partial_copy_kernel(const void *src, void *dst,
-				  int len, __wsum sum, int *err_ptr);
-
 __wsum __csum_partial_copy_from_user(const void *src, void *dst,
 				     int len, __wsum sum, int *err_ptr);
 __wsum __csum_partial_copy_to_user(const void *src, void *dst,
diff --git a/arch/mips/lib/csum_partial.S b/arch/mips/lib/csum_partial.S
index 8d70855b0914..983e909c2052 100644
--- a/arch/mips/lib/csum_partial.S
+++ b/arch/mips/lib/csum_partial.S
@@ -827,8 +827,6 @@ EXPORT_SYMBOL(csum_partial)
 	.set	pop
 	.endm
 
-LEAF(__csum_partial_copy_kernel)
-EXPORT_SYMBOL(__csum_partial_copy_kernel)
 #ifndef CONFIG_EVA
 FEXPORT(__csum_partial_copy_to_user)
 EXPORT_SYMBOL(__csum_partial_copy_to_user)
@@ -836,7 +834,6 @@ FEXPORT(__csum_partial_copy_from_user)
 EXPORT_SYMBOL(__csum_partial_copy_from_user)
 #endif
 __BUILD_CSUM_PARTIAL_COPY_USER LEGACY_MODE USEROP USEROP 1
-END(__csum_partial_copy_kernel)
 
 #ifdef CONFIG_EVA
 LEAF(__csum_partial_copy_to_user)
-- 
2.11.0


* [PATCH 14/18] mips: propagate the calling convention change down into __csum_partial_copy_..._user()
  2020-07-21 20:25 ` [PATCH 01/18] skb_copy_and_csum_bits(): don't bother with the last argument Al Viro
                     ` (11 preceding siblings ...)
  2020-07-21 20:25   ` [PATCH 13/18] mips: __csum_partial_copy_kernel() has no users left Al Viro
@ 2020-07-21 20:25   ` Al Viro
  2020-07-21 20:25     ` Al Viro
  2020-07-21 20:25   ` [PATCH 15/18] xtensa: propagate the calling conventions change down into csum_partial_copy_generic() Al Viro
                     ` (3 subsequent siblings)
  16 siblings, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-21 20:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

... and turn the exception handlers into simply returning 0, which
simplifies the hell out of things in csum_partial.S.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
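Why returning 0 can serve as the error indication: the wrappers used to
seed the sum with ~0U, and a 32-bit ones'-complement sum that starts
nonzero stays nonzero, so 0 never collides with a successful result.
The portable C sketch below models that convention.  It is not the MIPS
assembly; csum_add32(), csum_and_copy_model() and the fault_at knob are
hypothetical, and the byte-to-word mapping assumes a big-endian view.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit ones'-complement add with end-around carry. */
static uint32_t csum_add32(uint32_t a, uint32_t b)
{
	uint32_t s = a + b;

	return s + (s < a);	/* fold the carry-out back in */
}

/*
 * Toy stand-in for a checksum-and-copy primitive.  Copies len bytes and
 * returns the running sum seeded with ~0U, or 0 if the simulated access
 * faults.  fault_at is the offset at which the fault hits, or (size_t)-1
 * for no fault.
 */
static uint32_t csum_and_copy_model(const uint8_t *src, uint8_t *dst,
				    size_t len, size_t fault_at)
{
	uint32_t sum = ~0U;

	for (size_t i = 0; i < len; i++) {
		if (i == fault_at)
			return 0;	/* all the single handler has to do */
		dst[i] = src[i];
		/* even offsets are the high byte of a 16-bit word */
		sum = csum_add32(sum, (i & 1) ? src[i] : (uint32_t)src[i] << 8);
	}
	return sum;
}
```

A caller in this model needs no error pointer: it treats a return value
of 0 as -EFAULT and anything else as a checksum, defined as usual only
modulo 65535.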
 arch/mips/include/asm/checksum.h |  26 +---
 arch/mips/lib/csum_partial.S     | 258 +++++++++++++--------------------------
 2 files changed, 89 insertions(+), 195 deletions(-)

diff --git a/arch/mips/include/asm/checksum.h b/arch/mips/include/asm/checksum.h
index bf02d2d3869f..66a86a33339a 100644
--- a/arch/mips/include/asm/checksum.h
+++ b/arch/mips/include/asm/checksum.h
@@ -34,25 +34,17 @@
  */
 __wsum csum_partial(const void *buff, int len, __wsum sum);
 
-__wsum __csum_partial_copy_from_user(const void *src, void *dst,
-				     int len, __wsum sum, int *err_ptr);
-__wsum __csum_partial_copy_to_user(const void *src, void *dst,
-				   int len, __wsum sum, int *err_ptr);
+__wsum __csum_partial_copy_from_user(const void __user *src, void *dst, int len);
+__wsum __csum_partial_copy_to_user(const void *src, void __user *dst, int len);
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 static inline
 __wsum csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
-	__wsum sum = ~0U;
-	int err = 0;
-
 	might_fault();
-
 	if (!access_ok(src, len))
 		return 0;
-	sum = __csum_partial_copy_from_user((__force void *)src, dst,
-						     len, sum, &err);
-	return err ? 0 : sum;
+	return __csum_partial_copy_from_user(src, dst, len);
 }
 
 /*
@@ -62,26 +54,20 @@ __wsum csum_and_copy_from_user(const void __user *src, void *dst, int len)
 static inline
 __wsum csum_and_copy_to_user(const void *src, void __user *dst, int len)
 {
-	int err = 0;
-	__wsum sum = ~0U;
-
 	might_fault();
 	if (!access_ok(dst, len))
 		return 0;
-	sum = __csum_partial_copy_to_user(src,
-					   (__force void *)dst,
-					   len, sum, &err);
-	return err ? 0 : sum;
+	return __csum_partial_copy_to_user(src, dst, len);
 }
 
 /*
  * the same as csum_partial, but copies from user space (but on MIPS
  * we have just one address space, so this is identical to the above)
  */
-__wsum __csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum);
+__wsum __csum_partial_copy_nocheck(const void *src, void *dst, int len);
 static inline __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	return __csum_partial_copy_nocheck(src, dst, len, 0);
+	return __csum_partial_copy_nocheck(src, dst, len);
 }
 #define csum_partial_copy_nocheck csum_partial_copy_nocheck
 
diff --git a/arch/mips/lib/csum_partial.S b/arch/mips/lib/csum_partial.S
index 983e909c2052..a46db0807195 100644
--- a/arch/mips/lib/csum_partial.S
+++ b/arch/mips/lib/csum_partial.S
@@ -308,8 +308,8 @@ EXPORT_SYMBOL(csum_partial)
 /*
  * checksum and copy routines based on memcpy.S
  *
- *	csum_partial_copy_nocheck(src, dst, len, sum)
- *	__csum_partial_copy_kernel(src, dst, len, sum, errp)
+ *	csum_partial_copy_nocheck(src, dst, len)
+ *	__csum_partial_copy_kernel(src, dst, len)
  *
  * See "Spec" in memcpy.S for details.	Unlike __copy_user, all
  * function in this file use the standard calling convention.
@@ -318,26 +318,11 @@ EXPORT_SYMBOL(csum_partial)
 #define src a0
 #define dst a1
 #define len a2
-#define psum a3
 #define sum v0
 #define odd t8
-#define errptr t9
 
 /*
- * The exception handler for loads requires that:
- *  1- AT contain the address of the byte just past the end of the source
- *     of the copy,
- *  2- src_entry <= src < AT, and
- *  3- (dst - src) == (dst_entry - src_entry),
- * The _entry suffix denotes values when __copy_user was called.
- *
- * (1) is set up up by __csum_partial_copy_from_user and maintained by
- *	not writing AT in __csum_partial_copy
- * (2) is met by incrementing src by the number of bytes copied
- * (3) is met by not doing loads between a pair of increments of dst and src
- *
- * The exception handlers for stores stores -EFAULT to errptr and return.
- * These handlers do not need to overwrite any data.
+ * All exception handlers simply return 0.
  */
 
 /* Instruction type */
@@ -358,11 +343,11 @@ EXPORT_SYMBOL(csum_partial)
  * addr    : Address
  * handler : Exception handler
  */
-#define EXC(insn, type, reg, addr, handler)	\
+#define EXC(insn, type, reg, addr)		\
 	.if \mode == LEGACY_MODE;		\
 9:		insn reg, addr;			\
 		.section __ex_table,"a";	\
-		PTR	9b, handler;		\
+		PTR	9b, .L_exc;		\
 		.previous;			\
 	/* This is enabled in EVA mode */	\
 	.else;					\
@@ -371,7 +356,7 @@ EXPORT_SYMBOL(csum_partial)
 		    ((\to == USEROP) && (type == ST_INSN));	\
 9:			__BUILD_EVA_INSN(insn##e, reg, addr);	\
 			.section __ex_table,"a";		\
-			PTR	9b, handler;			\
+			PTR	9b, .L_exc;			\
 			.previous;				\
 		.else;						\
 			/* EVA without exception */		\
@@ -384,14 +369,14 @@ EXPORT_SYMBOL(csum_partial)
 #ifdef USE_DOUBLE
 
 #define LOADK	ld /* No exception */
-#define LOAD(reg, addr, handler)	EXC(ld, LD_INSN, reg, addr, handler)
-#define LOADBU(reg, addr, handler)	EXC(lbu, LD_INSN, reg, addr, handler)
-#define LOADL(reg, addr, handler)	EXC(ldl, LD_INSN, reg, addr, handler)
-#define LOADR(reg, addr, handler)	EXC(ldr, LD_INSN, reg, addr, handler)
-#define STOREB(reg, addr, handler)	EXC(sb, ST_INSN, reg, addr, handler)
-#define STOREL(reg, addr, handler)	EXC(sdl, ST_INSN, reg, addr, handler)
-#define STORER(reg, addr, handler)	EXC(sdr, ST_INSN, reg, addr, handler)
-#define STORE(reg, addr, handler)	EXC(sd, ST_INSN, reg, addr, handler)
+#define LOAD(reg, addr)		EXC(ld, LD_INSN, reg, addr)
+#define LOADBU(reg, addr)	EXC(lbu, LD_INSN, reg, addr)
+#define LOADL(reg, addr)	EXC(ldl, LD_INSN, reg, addr)
+#define LOADR(reg, addr)	EXC(ldr, LD_INSN, reg, addr)
+#define STOREB(reg, addr)	EXC(sb, ST_INSN, reg, addr)
+#define STOREL(reg, addr)	EXC(sdl, ST_INSN, reg, addr)
+#define STORER(reg, addr)	EXC(sdr, ST_INSN, reg, addr)
+#define STORE(reg, addr)	EXC(sd, ST_INSN, reg, addr)
 #define ADD    daddu
 #define SUB    dsubu
 #define SRL    dsrl
@@ -404,14 +389,14 @@ EXPORT_SYMBOL(csum_partial)
 #else
 
 #define LOADK	lw /* No exception */
-#define LOAD(reg, addr, handler)	EXC(lw, LD_INSN, reg, addr, handler)
-#define LOADBU(reg, addr, handler)	EXC(lbu, LD_INSN, reg, addr, handler)
-#define LOADL(reg, addr, handler)	EXC(lwl, LD_INSN, reg, addr, handler)
-#define LOADR(reg, addr, handler)	EXC(lwr, LD_INSN, reg, addr, handler)
-#define STOREB(reg, addr, handler)	EXC(sb, ST_INSN, reg, addr, handler)
-#define STOREL(reg, addr, handler)	EXC(swl, ST_INSN, reg, addr, handler)
-#define STORER(reg, addr, handler)	EXC(swr, ST_INSN, reg, addr, handler)
-#define STORE(reg, addr, handler)	EXC(sw, ST_INSN, reg, addr, handler)
+#define LOAD(reg, addr)		EXC(lw, LD_INSN, reg, addr)
+#define LOADBU(reg, addr)	EXC(lbu, LD_INSN, reg, addr)
+#define LOADL(reg, addr)	EXC(lwl, LD_INSN, reg, addr)
+#define LOADR(reg, addr)	EXC(lwr, LD_INSN, reg, addr)
+#define STOREB(reg, addr)	EXC(sb, ST_INSN, reg, addr)
+#define STOREL(reg, addr)	EXC(swl, ST_INSN, reg, addr)
+#define STORER(reg, addr)	EXC(swr, ST_INSN, reg, addr)
+#define STORE(reg, addr)	EXC(sw, ST_INSN, reg, addr)
 #define ADD    addu
 #define SUB    subu
 #define SRL    srl
@@ -450,22 +435,9 @@ EXPORT_SYMBOL(csum_partial)
 	.set	at=v1
 #endif
 
-	.macro __BUILD_CSUM_PARTIAL_COPY_USER mode, from, to, __nocheck
+	.macro __BUILD_CSUM_PARTIAL_COPY_USER mode, from, to
 
-	PTR_ADDU	AT, src, len	/* See (1) above. */
-	/* initialize __nocheck if this the first time we execute this
-	 * macro
-	 */
-#ifdef CONFIG_64BIT
-	move	errptr, a4
-#else
-	lw	errptr, 16(sp)
-#endif
-	.if \__nocheck == 1
-	FEXPORT(__csum_partial_copy_nocheck)
-	EXPORT_SYMBOL(__csum_partial_copy_nocheck)
-	.endif
-	move	sum, zero
+	li	sum, -1
 	move	odd, zero
 	/*
 	 * Note: dst & src may be unaligned, len may be 0
@@ -497,31 +469,31 @@ EXPORT_SYMBOL(csum_partial)
 	SUB	len, 8*NBYTES		# subtract here for bgez loop
 	.align	4
 1:
-	LOAD(t0, UNIT(0)(src), .Ll_exc\@)
-	LOAD(t1, UNIT(1)(src), .Ll_exc_copy\@)
-	LOAD(t2, UNIT(2)(src), .Ll_exc_copy\@)
-	LOAD(t3, UNIT(3)(src), .Ll_exc_copy\@)
-	LOAD(t4, UNIT(4)(src), .Ll_exc_copy\@)
-	LOAD(t5, UNIT(5)(src), .Ll_exc_copy\@)
-	LOAD(t6, UNIT(6)(src), .Ll_exc_copy\@)
-	LOAD(t7, UNIT(7)(src), .Ll_exc_copy\@)
+	LOAD(t0, UNIT(0)(src))
+	LOAD(t1, UNIT(1)(src))
+	LOAD(t2, UNIT(2)(src))
+	LOAD(t3, UNIT(3)(src))
+	LOAD(t4, UNIT(4)(src))
+	LOAD(t5, UNIT(5)(src))
+	LOAD(t6, UNIT(6)(src))
+	LOAD(t7, UNIT(7)(src))
 	SUB	len, len, 8*NBYTES
 	ADD	src, src, 8*NBYTES
-	STORE(t0, UNIT(0)(dst),	.Ls_exc\@)
+	STORE(t0, UNIT(0)(dst))
 	ADDC(t0, t1)
-	STORE(t1, UNIT(1)(dst),	.Ls_exc\@)
+	STORE(t1, UNIT(1)(dst))
 	ADDC(sum, t0)
-	STORE(t2, UNIT(2)(dst),	.Ls_exc\@)
+	STORE(t2, UNIT(2)(dst))
 	ADDC(t2, t3)
-	STORE(t3, UNIT(3)(dst),	.Ls_exc\@)
+	STORE(t3, UNIT(3)(dst))
 	ADDC(sum, t2)
-	STORE(t4, UNIT(4)(dst),	.Ls_exc\@)
+	STORE(t4, UNIT(4)(dst))
 	ADDC(t4, t5)
-	STORE(t5, UNIT(5)(dst),	.Ls_exc\@)
+	STORE(t5, UNIT(5)(dst))
 	ADDC(sum, t4)
-	STORE(t6, UNIT(6)(dst),	.Ls_exc\@)
+	STORE(t6, UNIT(6)(dst))
 	ADDC(t6, t7)
-	STORE(t7, UNIT(7)(dst),	.Ls_exc\@)
+	STORE(t7, UNIT(7)(dst))
 	ADDC(sum, t6)
 	.set	reorder				/* DADDI_WAR */
 	ADD	dst, dst, 8*NBYTES
@@ -541,19 +513,19 @@ EXPORT_SYMBOL(csum_partial)
 	/*
 	 * len >= 4*NBYTES
 	 */
-	LOAD(t0, UNIT(0)(src), .Ll_exc\@)
-	LOAD(t1, UNIT(1)(src), .Ll_exc_copy\@)
-	LOAD(t2, UNIT(2)(src), .Ll_exc_copy\@)
-	LOAD(t3, UNIT(3)(src), .Ll_exc_copy\@)
+	LOAD(t0, UNIT(0)(src))
+	LOAD(t1, UNIT(1)(src))
+	LOAD(t2, UNIT(2)(src))
+	LOAD(t3, UNIT(3)(src))
 	SUB	len, len, 4*NBYTES
 	ADD	src, src, 4*NBYTES
-	STORE(t0, UNIT(0)(dst),	.Ls_exc\@)
+	STORE(t0, UNIT(0)(dst))
 	ADDC(t0, t1)
-	STORE(t1, UNIT(1)(dst),	.Ls_exc\@)
+	STORE(t1, UNIT(1)(dst))
 	ADDC(sum, t0)
-	STORE(t2, UNIT(2)(dst),	.Ls_exc\@)
+	STORE(t2, UNIT(2)(dst))
 	ADDC(t2, t3)
-	STORE(t3, UNIT(3)(dst),	.Ls_exc\@)
+	STORE(t3, UNIT(3)(dst))
 	ADDC(sum, t2)
 	.set	reorder				/* DADDI_WAR */
 	ADD	dst, dst, 4*NBYTES
@@ -566,10 +538,10 @@ EXPORT_SYMBOL(csum_partial)
 	beq	rem, len, .Lcopy_bytes\@
 	 nop
 1:
-	LOAD(t0, 0(src), .Ll_exc\@)
+	LOAD(t0, 0(src))
 	ADD	src, src, NBYTES
 	SUB	len, len, NBYTES
-	STORE(t0, 0(dst), .Ls_exc\@)
+	STORE(t0, 0(dst))
 	ADDC(sum, t0)
 	.set	reorder				/* DADDI_WAR */
 	ADD	dst, dst, NBYTES
@@ -592,10 +564,10 @@ EXPORT_SYMBOL(csum_partial)
 	 ADD	t1, dst, len	# t1 is just past last byte of dst
 	li	bits, 8*NBYTES
 	SLL	rem, len, 3	# rem = number of bits to keep
-	LOAD(t0, 0(src), .Ll_exc\@)
+	LOAD(t0, 0(src))
 	SUB	bits, bits, rem # bits = number of bits to discard
 	SHIFT_DISCARD t0, t0, bits
-	STREST(t0, -1(t1), .Ls_exc\@)
+	STREST(t0, -1(t1))
 	SHIFT_DISCARD_REVERT t0, t0, bits
 	.set reorder
 	ADDC(sum, t0)
@@ -612,12 +584,12 @@ EXPORT_SYMBOL(csum_partial)
 	 * Set match = (src and dst have same alignment)
 	 */
 #define match rem
-	LDFIRST(t3, FIRST(0)(src), .Ll_exc\@)
+	LDFIRST(t3, FIRST(0)(src))
 	ADD	t2, zero, NBYTES
-	LDREST(t3, REST(0)(src), .Ll_exc_copy\@)
+	LDREST(t3, REST(0)(src))
 	SUB	t2, t2, t1	# t2 = number of bytes copied
 	xor	match, t0, t1
-	STFIRST(t3, FIRST(0)(dst), .Ls_exc\@)
+	STFIRST(t3, FIRST(0)(dst))
 	SLL	t4, t1, 3		# t4 = number of bits to discard
 	SHIFT_DISCARD t3, t3, t4
 	/* no SHIFT_DISCARD_REVERT to handle odd buffer properly */
@@ -639,26 +611,26 @@ EXPORT_SYMBOL(csum_partial)
  * It's OK to load FIRST(N+1) before REST(N) because the two addresses
  * are to the same unit (unless src is aligned, but it's not).
  */
-	LDFIRST(t0, FIRST(0)(src), .Ll_exc\@)
-	LDFIRST(t1, FIRST(1)(src), .Ll_exc_copy\@)
+	LDFIRST(t0, FIRST(0)(src))
+	LDFIRST(t1, FIRST(1)(src))
 	SUB	len, len, 4*NBYTES
-	LDREST(t0, REST(0)(src), .Ll_exc_copy\@)
-	LDREST(t1, REST(1)(src), .Ll_exc_copy\@)
-	LDFIRST(t2, FIRST(2)(src), .Ll_exc_copy\@)
-	LDFIRST(t3, FIRST(3)(src), .Ll_exc_copy\@)
-	LDREST(t2, REST(2)(src), .Ll_exc_copy\@)
-	LDREST(t3, REST(3)(src), .Ll_exc_copy\@)
+	LDREST(t0, REST(0)(src))
+	LDREST(t1, REST(1)(src))
+	LDFIRST(t2, FIRST(2)(src))
+	LDFIRST(t3, FIRST(3)(src))
+	LDREST(t2, REST(2)(src))
+	LDREST(t3, REST(3)(src))
 	ADD	src, src, 4*NBYTES
 #ifdef CONFIG_CPU_SB1
 	nop				# improves slotting
 #endif
-	STORE(t0, UNIT(0)(dst),	.Ls_exc\@)
+	STORE(t0, UNIT(0)(dst))
 	ADDC(t0, t1)
-	STORE(t1, UNIT(1)(dst),	.Ls_exc\@)
+	STORE(t1, UNIT(1)(dst))
 	ADDC(sum, t0)
-	STORE(t2, UNIT(2)(dst),	.Ls_exc\@)
+	STORE(t2, UNIT(2)(dst))
 	ADDC(t2, t3)
-	STORE(t3, UNIT(3)(dst),	.Ls_exc\@)
+	STORE(t3, UNIT(3)(dst))
 	ADDC(sum, t2)
 	.set	reorder				/* DADDI_WAR */
 	ADD	dst, dst, 4*NBYTES
@@ -671,11 +643,11 @@ EXPORT_SYMBOL(csum_partial)
 	beq	rem, len, .Lcopy_bytes\@
 	 nop
 1:
-	LDFIRST(t0, FIRST(0)(src), .Ll_exc\@)
-	LDREST(t0, REST(0)(src), .Ll_exc_copy\@)
+	LDFIRST(t0, FIRST(0)(src))
+	LDREST(t0, REST(0)(src))
 	ADD	src, src, NBYTES
 	SUB	len, len, NBYTES
-	STORE(t0, 0(dst), .Ls_exc\@)
+	STORE(t0, 0(dst))
 	ADDC(sum, t0)
 	.set	reorder				/* DADDI_WAR */
 	ADD	dst, dst, NBYTES
@@ -696,11 +668,10 @@ EXPORT_SYMBOL(csum_partial)
 #endif
 	move	t2, zero	# partial word
 	li	t3, SHIFT_START # shift
-/* use .Ll_exc_copy here to return correct sum on fault */
 #define COPY_BYTE(N)			\
-	LOADBU(t0, N(src), .Ll_exc_copy\@);	\
+	LOADBU(t0, N(src));		\
 	SUB	len, len, 1;		\
-	STOREB(t0, N(dst), .Ls_exc\@);	\
+	STOREB(t0, N(dst));		\
 	SLLV	t0, t0, t3;		\
 	addu	t3, SHIFT_INC;		\
 	beqz	len, .Lcopy_bytes_done\@; \
@@ -714,9 +685,9 @@ EXPORT_SYMBOL(csum_partial)
 	COPY_BYTE(4)
 	COPY_BYTE(5)
 #endif
-	LOADBU(t0, NBYTES-2(src), .Ll_exc_copy\@)
+	LOADBU(t0, NBYTES-2(src))
 	SUB	len, len, 1
-	STOREB(t0, NBYTES-2(dst), .Ls_exc\@)
+	STOREB(t0, NBYTES-2(dst))
 	SLLV	t0, t0, t3
 	or	t2, t0
 .Lcopy_bytes_done\@:
@@ -753,94 +724,31 @@ EXPORT_SYMBOL(csum_partial)
 #endif
 	.set	pop
 	.set reorder
-	ADDC32(sum, psum)
 	jr	ra
 	.set noreorder
+	.endm
 
-.Ll_exc_copy\@:
-	/*
-	 * Copy bytes from src until faulting load address (or until a
-	 * lb faults)
-	 *
-	 * When reached by a faulting LDFIRST/LDREST, THREAD_BUADDR($28)
-	 * may be more than a byte beyond the last address.
-	 * Hence, the lb below may get an exception.
-	 *
-	 * Assumes src < THREAD_BUADDR($28)
-	 */
-	LOADK	t0, TI_TASK($28)
-	 li	t2, SHIFT_START
-	LOADK	t0, THREAD_BUADDR(t0)
-1:
-	LOADBU(t1, 0(src), .Ll_exc\@)
-	ADD	src, src, 1
-	sb	t1, 0(dst)	# can't fault -- we're copy_from_user
-	SLLV	t1, t1, t2
-	addu	t2, SHIFT_INC
-	ADDC(sum, t1)
-	.set	reorder				/* DADDI_WAR */
-	ADD	dst, dst, 1
-	bne	src, t0, 1b
-	.set	noreorder
-.Ll_exc\@:
-	LOADK	t0, TI_TASK($28)
-	 nop
-	LOADK	t0, THREAD_BUADDR(t0)	# t0 is just past last good address
-	 nop
-	SUB	len, AT, t0		# len number of uncopied bytes
-	/*
-	 * Here's where we rely on src and dst being incremented in tandem,
-	 *   See (3) above.
-	 * dst += (fault addr - src) to put dst at first byte to clear
-	 */
-	ADD	dst, t0			# compute start address in a1
-	SUB	dst, src
-	/*
-	 * Clear len bytes starting at dst.  Can't call __bzero because it
-	 * might modify len.  An inefficient loop for these rare times...
-	 */
-	.set	reorder				/* DADDI_WAR */
-	SUB	src, len, 1
-	beqz	len, .Ldone\@
-	.set	noreorder
-1:	sb	zero, 0(dst)
-	ADD	dst, dst, 1
-	.set	push
-	.set	noat
-#ifndef CONFIG_CPU_DADDI_WORKAROUNDS
-	bnez	src, 1b
-	 SUB	src, src, 1
-#else
-	li	v1, 1
-	bnez	src, 1b
-	 SUB	src, src, v1
-#endif
-	li	v1, -EFAULT
-	b	.Ldone\@
-	 sw	v1, (errptr)
-
-.Ls_exc\@:
-	li	v0, -1 /* invalid checksum */
-	li	v1, -EFAULT
+	.set noreorder
+.L_exc:
 	jr	ra
-	 sw	v1, (errptr)
-	.set	pop
-	.endm
+	 li	v0, 0
 
+FEXPORT(__csum_partial_copy_nocheck)
+EXPORT_SYMBOL(__csum_partial_copy_nocheck)
 #ifndef CONFIG_EVA
 FEXPORT(__csum_partial_copy_to_user)
 EXPORT_SYMBOL(__csum_partial_copy_to_user)
 FEXPORT(__csum_partial_copy_from_user)
 EXPORT_SYMBOL(__csum_partial_copy_from_user)
 #endif
-__BUILD_CSUM_PARTIAL_COPY_USER LEGACY_MODE USEROP USEROP 1
+__BUILD_CSUM_PARTIAL_COPY_USER LEGACY_MODE USEROP USEROP
 
 #ifdef CONFIG_EVA
 LEAF(__csum_partial_copy_to_user)
-__BUILD_CSUM_PARTIAL_COPY_USER EVA_MODE KERNELOP USEROP 0
+__BUILD_CSUM_PARTIAL_COPY_USER EVA_MODE KERNELOP USEROP
 END(__csum_partial_copy_to_user)
 
 LEAF(__csum_partial_copy_from_user)
-__BUILD_CSUM_PARTIAL_COPY_USER EVA_MODE USEROP KERNELOP 0
+__BUILD_CSUM_PARTIAL_COPY_USER EVA_MODE USEROP KERNELOP
 END(__csum_partial_copy_from_user)
 #endif
-- 
2.11.0

^ permalink raw reply	[flat|nested] 102+ messages in thread
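
Editorial sketch, not part of the patch: the convention the MIPS change above relies on can be modelled in plain user-space C. The model assumes that the sum is seeded with ~0U (the "li sum, -1" in the macro), that additions use end-around carry (the role ADDC plays), and that any fault makes the routine return 0 (the ".L_exc" handler). The names add_carry32, model_csum_and_copy, fold16 and copy_or_efault, and the fault_at parameter, are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit add with end-around carry; the result is 0 only if a == b == 0. */
uint32_t add_carry32(uint32_t a, uint32_t b)
{
	uint64_t t = (uint64_t)a + b;

	return (uint32_t)t + (uint32_t)(t >> 32);
}

/* Fold to 16 bits without complementing (this is not the kernel's csum_fold). */
uint16_t fold16(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * Hypothetical model: copy len bytes and sum them as little-endian 16-bit
 * words.  fault_at simulates an access fault at that byte offset; pass a
 * value >= len for "no fault".
 */
uint32_t model_csum_and_copy(const void *src, void *dst, size_t len,
			     size_t fault_at)
{
	const unsigned char *s = src;
	unsigned char *d = dst;
	uint32_t sum = ~0U;	/* seed: non-zero, and 0 modulo 65535 */
	size_t i;

	assert(src && dst);
	for (i = 0; i < len; i++) {
		if (i >= fault_at)
			return 0;	/* fault: the only way to get 0 */
		d[i] = s[i];
		/* even offsets are the low byte, odd offsets the high byte */
		sum = add_carry32(sum, (uint32_t)s[i] << ((i & 1) * 8));
	}
	return sum;
}

/* Caller side: 0 means "fault", so no separate error pointer is needed. */
int copy_or_efault(const void *src, void *dst, size_t len, size_t fault_at,
		   uint32_t *csum)
{
	uint32_t sum = model_csum_and_copy(src, dst, len, fault_at);

	if (!sum)
		return -EFAULT;
	*csum = sum;
	return 0;
}
```

Because an end-around-carry add of a non-zero value can never yield 0, a successful copy always returns something non-zero, even for all-zero data. That is what lets the error pointer argument go away in this model.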

* [PATCH 15/18] xtensa: propagate the calling conventions change down into csum_partial_copy_generic()
  2020-07-21 20:25 ` [PATCH 01/18] skb_copy_and_csum_bits(): don't bother with the last argument Al Viro
                     ` (12 preceding siblings ...)
  2020-07-21 20:25   ` [PATCH 14/18] mips: propagate the calling convention change down into __csum_partial_copy_..._user() Al Viro
@ 2020-07-21 20:25   ` Al Viro
  2020-07-22  8:56     ` Max Filippov
  2020-07-21 20:25   ` [PATCH 16/18] sparc64: propagate the calling convention changes down to __csum_partial_copy_...() Al Viro
                     ` (2 subsequent siblings)
  16 siblings, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-21 20:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

turn the exception handlers into returning 0.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/xtensa/include/asm/checksum.h | 20 +++---------
 arch/xtensa/lib/checksum.S         | 67 +++++++++-----------------------------
 2 files changed, 19 insertions(+), 68 deletions(-)

diff --git a/arch/xtensa/include/asm/checksum.h b/arch/xtensa/include/asm/checksum.h
index 7958b18a5804..23b3e7c7ff73 100644
--- a/arch/xtensa/include/asm/checksum.h
+++ b/arch/xtensa/include/asm/checksum.h
@@ -37,9 +37,7 @@ asmlinkage __wsum csum_partial(const void *buff, int len, __wsum sum);
  * better 64-bit) boundary
  */
 
-asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst,
-					    int len, __wsum sum,
-					    int *src_err_ptr, int *dst_err_ptr);
+asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst, int len);
 
 /*
  *	Note: when you get a NULL pointer exception here this means someone
@@ -48,7 +46,7 @@ asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst,
 static inline
 __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	return csum_partial_copy_generic(src, dst, len, 0, NULL, NULL);
+	return csum_partial_copy_generic(src, dst, len);
 }
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
@@ -56,14 +54,9 @@ static inline
 __wsum csum_and_copy_from_user(const void __user *src, void *dst,
 				   int len)
 {
-	int err = 0;
-
 	if (!access_ok(dst, len))
 		return 0;
-
-	sum = csum_partial_copy_generic((__force const void *)src, dst,
-					len, ~0U, &err, NULL);
-	return err ? 0 : sum;
+	return csum_partial_copy_generic((__force const void *)src, dst, len);
 }
 
 /*
@@ -246,13 +239,8 @@ static __inline__ __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
 static __inline__ __wsum csum_and_copy_to_user(const void *src,
 					       void __user *dst, int len)
 {
-	int err = 0;
-	__wsum sum = ~0U;
-
 	if (!access_ok(dst, len))
 		return 0;
-
-	sum = csum_partial_copy_generic(src,dst,len,sum,NULL,&err);
-	return err ? 0 : sum;
+	return csum_partial_copy_generic(src, (__force void *)dst, len);
 }
 #endif
diff --git a/arch/xtensa/lib/checksum.S b/arch/xtensa/lib/checksum.S
index 4cb9ca58d9ad..cf1bed1a5bd6 100644
--- a/arch/xtensa/lib/checksum.S
+++ b/arch/xtensa/lib/checksum.S
@@ -175,19 +175,14 @@ ENDPROC(csum_partial)
  */
 
 /*
-unsigned int csum_partial_copy_generic (const char *src, char *dst, int len,
-					int sum, int *src_err_ptr, int *dst_err_ptr)
+unsigned int csum_partial_copy_generic (const char *src, char *dst, int len)
 	a2  = src
 	a3  = dst
 	a4  = len
 	a5  = sum
-	a6  = src_err_ptr
-	a7  = dst_err_ptr
 	a8  = temp
 	a9  = temp
 	a10 = temp
-	a11 = original len for exception handling
-	a12 = original dst for exception handling
 
     This function is optimized for 4-byte aligned addresses.  Other
     alignments work, but not nearly as efficiently.
@@ -196,8 +191,7 @@ unsigned int csum_partial_copy_generic (const char *src, char *dst, int len,
 ENTRY(csum_partial_copy_generic)
 
 	abi_entry_default
-	mov	a12, a3
-	mov	a11, a4
+	movi	a5, -1
 	or	a10, a2, a3
 
 	/* We optimize the following alignment tests for the 4-byte
@@ -228,26 +222,26 @@ ENTRY(csum_partial_copy_generic)
 #endif
 EX(10f)	l32i	a9, a2, 0
 EX(10f)	l32i	a8, a2, 4
-EX(11f)	s32i	a9, a3, 0
-EX(11f)	s32i	a8, a3, 4
+EX(10f)	s32i	a9, a3, 0
+EX(10f)	s32i	a8, a3, 4
 	ONES_ADD(a5, a9)
 	ONES_ADD(a5, a8)
 EX(10f)	l32i	a9, a2, 8
 EX(10f)	l32i	a8, a2, 12
-EX(11f)	s32i	a9, a3, 8
-EX(11f)	s32i	a8, a3, 12
+EX(10f)	s32i	a9, a3, 8
+EX(10f)	s32i	a8, a3, 12
 	ONES_ADD(a5, a9)
 	ONES_ADD(a5, a8)
 EX(10f)	l32i	a9, a2, 16
 EX(10f)	l32i	a8, a2, 20
-EX(11f)	s32i	a9, a3, 16
-EX(11f)	s32i	a8, a3, 20
+EX(10f)	s32i	a9, a3, 16
+EX(10f)	s32i	a8, a3, 20
 	ONES_ADD(a5, a9)
 	ONES_ADD(a5, a8)
 EX(10f)	l32i	a9, a2, 24
 EX(10f)	l32i	a8, a2, 28
-EX(11f)	s32i	a9, a3, 24
-EX(11f)	s32i	a8, a3, 28
+EX(10f)	s32i	a9, a3, 24
+EX(10f)	s32i	a8, a3, 28
 	ONES_ADD(a5, a9)
 	ONES_ADD(a5, a8)
 	addi	a2, a2, 32
@@ -267,7 +261,7 @@ EX(11f)	s32i	a8, a3, 28
 .Loop6:
 #endif
 EX(10f)	l32i	a9, a2, 0
-EX(11f)	s32i	a9, a3, 0
+EX(10f)	s32i	a9, a3, 0
 	ONES_ADD(a5, a9)
 	addi	a2, a2, 4
 	addi	a3, a3, 4
@@ -298,7 +292,7 @@ EX(11f)	s32i	a9, a3, 0
 .Loop7:
 #endif
 EX(10f)	l16ui	a9, a2, 0
-EX(11f)	s16i	a9, a3, 0
+EX(10f)	s16i	a9, a3, 0
 	ONES_ADD(a5, a9)
 	addi	a2, a2, 2
 	addi	a3, a3, 2
@@ -309,7 +303,7 @@ EX(11f)	s16i	a9, a3, 0
 	/* This section processes a possible trailing odd byte. */
 	_bbci.l	a4, 0, 8f	/* 1-byte chunk */
 EX(10f)	l8ui	a9, a2, 0
-EX(11f)	s8i	a9, a3, 0
+EX(10f)	s8i	a9, a3, 0
 #ifdef __XTENSA_EB__
 	slli	a9, a9, 8	/* shift byte to bits 8..15 */
 #endif
@@ -334,8 +328,8 @@ EX(11f)	s8i	a9, a3, 0
 #endif
 EX(10f)	l8ui	a9, a2, 0
 EX(10f)	l8ui	a8, a2, 1
-EX(11f)	s8i	a9, a3, 0
-EX(11f)	s8i	a8, a3, 1
+EX(10f)	s8i	a9, a3, 0
+EX(10f)	s8i	a8, a3, 1
 #ifdef __XTENSA_EB__
 	slli	a9, a9, 8	/* combine into a single 16-bit value */
 #else				/* for checksum computation */
@@ -356,38 +350,7 @@ ENDPROC(csum_partial_copy_generic)
 
 # Exception handler:
 .section .fixup, "ax"
-/*
-	a6  = src_err_ptr
-	a7  = dst_err_ptr
-	a11 = original len for exception handling
-	a12 = original dst for exception handling
-*/
-
 10:
-	_movi	a2, -EFAULT
-	s32i	a2, a6, 0	/* src_err_ptr */
-
-	# clear the complete destination - computing the rest
-	# is too much work
-	movi	a2, 0
-#if XCHAL_HAVE_LOOPS
-	loopgtz	a11, 2f
-#else
-	beqz	a11, 2f
-	add	a11, a11, a12	/* a11 = ending address */
-.Leloop:
-#endif
-	s8i	a2, a12, 0
-	addi	a12, a12, 1
-#if !XCHAL_HAVE_LOOPS
-	blt	a12, a11, .Leloop
-#endif
-2:
-	abi_ret_default
-
-11:
-	movi	a2, -EFAULT
-	s32i	a2, a7, 0	/* dst_err_ptr */
 	movi	a2, 0
 	abi_ret_default
 
-- 
2.11.0

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH 16/18] sparc64: propagate the calling convention changes down to __csum_partial_copy_...()
  2020-07-21 20:25 ` [PATCH 01/18] skb_copy_and_csum_bits(): don't bother with the last argument Al Viro
                     ` (13 preceding siblings ...)
  2020-07-21 20:25   ` [PATCH 15/18] xtensa: propagate the calling conventions change down into csum_partial_copy_generic() Al Viro
@ 2020-07-21 20:25   ` Al Viro
  2020-07-21 20:25     ` Al Viro
  2020-07-22  1:21     ` David Miller
  2020-07-21 20:25   ` [PATCH 17/18] amd64: switch csum_partial_copy_generic() to new calling conventions Al Viro
  2020-07-21 20:25   ` [PATCH 18/18] ppc: propagate the calling conventions change down to csum_partial_copy_generic() Al Viro
  16 siblings, 2 replies; 102+ messages in thread
From: Al Viro @ 2020-07-21 20:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

... and rename them to csum_and_copy_...() - the wrappers become pointless.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/sparc/include/asm/checksum.h    |  1 +
 arch/sparc/include/asm/checksum_32.h |  2 --
 arch/sparc/include/asm/checksum_64.h | 41 +++---------------------------------
 arch/sparc/lib/csum_copy.S           |  5 +++--
 arch/sparc/lib/csum_copy_from_user.S |  4 ++--
 arch/sparc/lib/csum_copy_to_user.S   |  4 ++--
 6 files changed, 11 insertions(+), 46 deletions(-)

diff --git a/arch/sparc/include/asm/checksum.h b/arch/sparc/include/asm/checksum.h
index a6256cb6fc5c..f38a16ced6d2 100644
--- a/arch/sparc/include/asm/checksum.h
+++ b/arch/sparc/include/asm/checksum.h
@@ -2,6 +2,7 @@
 #ifndef ___ASM_SPARC_CHECKSUM_H
 #define ___ASM_SPARC_CHECKSUM_H
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
+#define HAVE_CSUM_COPY_USER
 #if defined(__sparc__) && defined(__arch64__)
 #include <asm/checksum_64.h>
 #else
diff --git a/arch/sparc/include/asm/checksum_32.h b/arch/sparc/include/asm/checksum_32.h
index d55e480172a6..ce11e0ad80c7 100644
--- a/arch/sparc/include/asm/checksum_32.h
+++ b/arch/sparc/include/asm/checksum_32.h
@@ -67,8 +67,6 @@ csum_and_copy_from_user(const void __user *src, void *dst, int len)
 	return csum_partial_copy_nocheck((__force void *)src, dst, len);
 }
 
-#define HAVE_CSUM_COPY_USER
-
 static inline __wsum
 csum_and_copy_to_user(const void *src, void __user *dst, int len)
 {
diff --git a/arch/sparc/include/asm/checksum_64.h b/arch/sparc/include/asm/checksum_64.h
index 4d0bbff43e62..d6b59461e064 100644
--- a/arch/sparc/include/asm/checksum_64.h
+++ b/arch/sparc/include/asm/checksum_64.h
@@ -38,44 +38,9 @@ __wsum csum_partial(const void * buff, int len, __wsum sum);
  * here even more important to align src and dst on a 32-bit (or even
  * better 64-bit) boundary
  */
-__wsum __csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum);
-
-static inline __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
-{
-	return __csum_partial_copy_nocheck(src, dst, len, 0);
-}
-
-long __csum_partial_copy_from_user(const void __user *src,
-				   void *dst, int len,
-				   __wsum sum);
-
-static inline __wsum
-csum_and_copy_from_user(const void __user *src,
-			    void *dst, int len)
-{
-	long ret = __csum_partial_copy_from_user(src, dst, len, ~0U);
-	if (ret < 0)
-		return 0;
-	return (__force __wsum) ret;
-}
-
-/*
- *	Copy and checksum to user
- */
-#define HAVE_CSUM_COPY_USER
-long __csum_partial_copy_to_user(const void *src,
-				 void __user *dst, int len,
-				 __wsum sum);
-
-static inline __wsum
-csum_and_copy_to_user(const void *src,
-		      void __user *dst, int len)
-{
-	long ret = __csum_partial_copy_to_user(src, dst, len, ~0U);
-	if (ret < 0)
-		return 0;
-	return (__force __wsum) ret;
-}
+__wsum csum_partial_copy_nocheck(const void *src, void *dst, int len);
+__wsum csum_and_copy_from_user(const void __user *src, void *dst, int len);
+__wsum csum_and_copy_to_user(const void *src, void __user *dst, int len);
 
 /* ihl is always 5 or greater, almost always is 5, and iph is word aligned
  * the majority of the time.
diff --git a/arch/sparc/lib/csum_copy.S b/arch/sparc/lib/csum_copy.S
index 72c900d21b12..5c085547a24f 100644
--- a/arch/sparc/lib/csum_copy.S
+++ b/arch/sparc/lib/csum_copy.S
@@ -33,7 +33,7 @@
 #endif
 
 #ifndef FUNC_NAME
-#define FUNC_NAME	__csum_partial_copy_nocheck
+#define FUNC_NAME	csum_partial_copy_nocheck
 #endif
 
 	.register	%g2, #scratch
@@ -68,9 +68,10 @@
 	.globl		FUNC_NAME
 	.type		FUNC_NAME,#function
 	EXPORT_SYMBOL(FUNC_NAME)
-FUNC_NAME:		/* %o0=src, %o1=dst, %o2=len, %o3=sum */
+FUNC_NAME:		/* %o0=src, %o1=dst, %o2=len */
 	LOAD(prefetch, %o0 + 0x000, #n_reads)
 	xor		%o0, %o1, %g1
+	movl		%o3, -1
 	clr		%o4
 	andcc		%g1, 0x3, %g0
 	bne,pn		%icc, 95f
diff --git a/arch/sparc/lib/csum_copy_from_user.S b/arch/sparc/lib/csum_copy_from_user.S
index d20b9594f0c7..b0ba8d4dd439 100644
--- a/arch/sparc/lib/csum_copy_from_user.S
+++ b/arch/sparc/lib/csum_copy_from_user.S
@@ -9,14 +9,14 @@
 	.section .fixup, "ax";	\
 	.align 4;		\
 99:	retl;			\
-	 mov	-1, %o0;	\
+	 mov	0, %o0;		\
 	.section __ex_table,"a";\
 	.align 4;		\
 	.word 98b, 99b;		\
 	.text;			\
 	.align 4;
 
-#define FUNC_NAME		__csum_partial_copy_from_user
+#define FUNC_NAME		csum_and_copy_from_user
 #define LOAD(type,addr,dest)	type##a [addr] %asi, dest
 
 #include "csum_copy.S"
diff --git a/arch/sparc/lib/csum_copy_to_user.S b/arch/sparc/lib/csum_copy_to_user.S
index d71c0c81e8ab..91ba36dbf7d2 100644
--- a/arch/sparc/lib/csum_copy_to_user.S
+++ b/arch/sparc/lib/csum_copy_to_user.S
@@ -9,14 +9,14 @@
 	.section .fixup,"ax";	\
 	.align 4;		\
 99:	retl;			\
-	 mov	-1, %o0;	\
+	 mov	0, %o0;		\
 	.section __ex_table,"a";\
 	.align 4;		\
 	.word 98b, 99b;		\
 	.text;			\
 	.align 4;
 
-#define FUNC_NAME		__csum_partial_copy_to_user
+#define FUNC_NAME		csum_and_copy_to_user
 #define STORE(type,src,addr)	type##a src, [addr] %asi
 
 #include "csum_copy.S"
-- 
2.11.0

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH 17/18] amd64: switch csum_partial_copy_generic() to new calling conventions
  2020-07-21 20:25 ` [PATCH 01/18] skb_copy_and_csum_bits(): don't bother with the last argument Al Viro
                     ` (14 preceding siblings ...)
  2020-07-21 20:25   ` [PATCH 16/18] sparc64: propagate the calling convention changes down to __csum_partial_copy_...() Al Viro
@ 2020-07-21 20:25   ` Al Viro
  2020-07-21 20:25     ` Al Viro
  2020-07-21 20:25   ` [PATCH 18/18] ppc: propagate the calling conventions change down to csum_partial_copy_generic() Al Viro
  16 siblings, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-21 20:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

... and fold handling of misaligned case into it.

Implementation note: we stash the "will we need to rol8 the sum in the end"
flag into the MSB of %rcx (the lower 32 bits are used for length); the rest
is pretty straightforward.
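
A rough C model of the two tricks mentioned above, offered as an
illustration rather than as a transcription of the assembler.  The function
names are made up for this sketch.  dec_and_set_msb() mirrors the
"leaq -1(%rcx,%rcx), %rcx; rorq $1, %rcx" pair; sum_odd_start() mirrors the
idea of the .Lodd path (first byte shifted into bits 8..15, the rest summed
as aligned little-endian words, final rol8).  The result is expected to
agree with a plain sum only modulo 65535, which is all __wsum promises.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t add32_with_carry(uint32_t a, uint32_t b)
{
	uint64_t s = (uint64_t)a + b;

	return (uint32_t)s + (uint32_t)(s >> 32);
}

/* Rotate left; n must be in 1..31. */
static uint32_t rol32(uint32_t v, unsigned int n)
{
	return (v << n) | (v >> (32 - n));
}

/* len -> (len - 1) with bit 63 set: 2*len - 1 is odd, rotate it right. */
static uint64_t dec_and_set_msb(uint64_t len)
{
	uint64_t x = 2 * len - 1;	/* leaq -1(%rcx,%rcx), %rcx */

	return (x >> 1) | (x << 63);	/* rorq $1, %rcx */
}

/* Reference: add the bytes as little-endian 16-bit words to sum. */
static uint32_t sum_le16(const uint8_t *p, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum = add32_with_carry(sum, p[i] | (uint32_t)p[i + 1] << 8);
	if (i < len)			/* trailing odd byte */
		sum = add32_with_carry(sum, p[i]);
	return sum;
}

/* Model of starting on an odd address: byte first, words after, rol8. */
static uint32_t sum_odd_start(const uint8_t *p, size_t len)
{
	uint32_t sum;

	assert(len >= 1);
	sum = add32_with_carry(~0U, (uint32_t)p[0] << 8); /* shll $8, %ebx */
	sum = sum_le16(p + 1, len - 1, sum);
	return rol32(sum, 8);				  /* roll $8, %eax */
}
```

For the bytes 01 02 03 04 05, sum_le16() seeded with ~0 yields 0x0609 and
sum_odd_start() yields 0x00090600; both are 0x609 modulo 65535.  Keeping
the flag in bit 63 works because only the low 32 bits of the register
carry the length.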

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/x86/include/asm/checksum_64.h |   5 +-
 arch/x86/lib/csum-copy_64.S        | 140 ++++++++++++++++++++++---------------
 arch/x86/lib/csum-wrappers_64.c    |  72 +++----------------
 3 files changed, 94 insertions(+), 123 deletions(-)

diff --git a/arch/x86/include/asm/checksum_64.h b/arch/x86/include/asm/checksum_64.h
index 9af3aed54c6b..407beebadaf4 100644
--- a/arch/x86/include/asm/checksum_64.h
+++ b/arch/x86/include/asm/checksum_64.h
@@ -130,10 +130,7 @@ static inline __sum16 csum_tcpudp_magic(__be32 saddr, __be32 daddr,
 extern __wsum csum_partial(const void *buff, int len, __wsum sum);
 
 /* Do not call this directly. Use the wrappers below */
-extern __visible __wsum csum_partial_copy_generic(const void *src, const void *dst,
-					int len, __wsum sum,
-					int *src_err_ptr, int *dst_err_ptr);
-
+extern __visible __wsum csum_partial_copy_generic(const void *src, void *dst, int len);
 
 extern __wsum csum_and_copy_from_user(const void __user *src, void *dst, int len);
 extern __wsum csum_and_copy_to_user(const void *src, void __user *dst, int len);
diff --git a/arch/x86/lib/csum-copy_64.S b/arch/x86/lib/csum-copy_64.S
index 3394a8ff7fd0..1fbd8ee9642d 100644
--- a/arch/x86/lib/csum-copy_64.S
+++ b/arch/x86/lib/csum-copy_64.S
@@ -18,9 +18,6 @@
  * rdi  source
  * rsi  destination
  * edx  len (32bit)
- * ecx  sum (32bit)
- * r8   src_err_ptr (int)
- * r9   dst_err_ptr (int)
  *
  * Output
  * eax  64bit sum. undefined in case of exception.
@@ -31,44 +28,32 @@
 
 	.macro source
 10:
-	_ASM_EXTABLE_UA(10b, .Lbad_source)
+	_ASM_EXTABLE_UA(10b, .Lfault)
 	.endm
 
 	.macro dest
 20:
-	_ASM_EXTABLE_UA(20b, .Lbad_dest)
+	_ASM_EXTABLE_UA(20b, .Lfault)
 	.endm
 
-	/*
-	 * No _ASM_EXTABLE_UA; this is used for intentional prefetch on a
-	 * potentially unmapped kernel address.
-	 */
-	.macro ignore L=.Lignore
-30:
-	_ASM_EXTABLE(30b, \L)
-	.endm
-
-
 SYM_FUNC_START(csum_partial_copy_generic)
-	cmpl	$3*64, %edx
-	jle	.Lignore
-
-.Lignore:
-	subq  $7*8, %rsp
-	movq  %rbx, 2*8(%rsp)
-	movq  %r12, 3*8(%rsp)
-	movq  %r14, 4*8(%rsp)
-	movq  %r13, 5*8(%rsp)
-	movq  %r15, 6*8(%rsp)
+	subq  $5*8, %rsp
+	movq  %rbx, 0*8(%rsp)
+	movq  %r12, 1*8(%rsp)
+	movq  %r14, 2*8(%rsp)
+	movq  %r13, 3*8(%rsp)
+	movq  %r15, 4*8(%rsp)
 
-	movq  %r8, (%rsp)
-	movq  %r9, 1*8(%rsp)
-
-	movl  %ecx, %eax
+	movl  $-1, %eax
+	xorl  %r9d, %r9d
 	movl  %edx, %ecx
+	cmpl  $8, %ecx
+	jb    .Lshort
 
-	xorl  %r9d, %r9d
-	movq  %rcx, %r12
+	testb  $7, %sil
+	jne   .Lunaligned
+.Laligned:
+	movl  %ecx, %r12d
 
 	shrq  $6, %r12
 	jz	.Lhandle_tail       /* < 64 */
@@ -99,7 +84,12 @@ SYM_FUNC_START(csum_partial_copy_generic)
 	source
 	movq  56(%rdi), %r13
 
-	ignore 2f
+30:
+	/*
+	 * No _ASM_EXTABLE_UA; this is used for intentional prefetch on a
+	 * potentially unmapped kernel address.
+	 */
+	_ASM_EXTABLE(30b, 2f)
 	prefetcht0 5*64(%rdi)
 2:
 	adcq  %rbx, %rax
@@ -131,8 +121,6 @@ SYM_FUNC_START(csum_partial_copy_generic)
 	dest
 	movq %r13, 56(%rsi)
 
-3:
-
 	leaq 64(%rdi), %rdi
 	leaq 64(%rsi), %rsi
 
@@ -142,8 +130,8 @@ SYM_FUNC_START(csum_partial_copy_generic)
 
 	/* do last up to 56 bytes */
 .Lhandle_tail:
-	/* ecx:	count */
-	movl %ecx, %r10d
+	/* ecx:	count, rcx.63: the end result needs to be rol8 */
+	movq %rcx, %r10
 	andl $63, %ecx
 	shrl $3, %ecx
 	jz	.Lfold
@@ -172,6 +160,7 @@ SYM_FUNC_START(csum_partial_copy_generic)
 .Lhandle_7:
 	movl %r10d, %ecx
 	andl $7, %ecx
+.L1:				/* .Lshort rejoins the common path here */
 	shrl $1, %ecx
 	jz   .Lhandle_1
 	movl $2, %edx
@@ -203,26 +192,65 @@ SYM_FUNC_START(csum_partial_copy_generic)
 	adcl %r9d, %eax		/* carry */
 
 .Lende:
-	movq 2*8(%rsp), %rbx
-	movq 3*8(%rsp), %r12
-	movq 4*8(%rsp), %r14
-	movq 5*8(%rsp), %r13
-	movq 6*8(%rsp), %r15
-	addq $7*8, %rsp
+	testq %r10, %r10
+	js  .Lwas_odd
+.Lout:
+	movq 0*8(%rsp), %rbx
+	movq 1*8(%rsp), %r12
+	movq 2*8(%rsp), %r14
+	movq 3*8(%rsp), %r13
+	movq 4*8(%rsp), %r15
+	addq $5*8, %rsp
 	ret
+.Lshort:
+	movl %ecx, %r10d
+	jmp  .L1
+.Lunaligned:
+	xorl %ebx, %ebx
+	testb $1, %sil
+	jne  .Lodd
+1:	testb $2, %sil
+	je   2f
+	source
+	movw (%rdi), %bx
+	dest
+	movw %bx, (%rsi)
+	leaq 2(%rdi), %rdi
+	subq $2, %rcx
+	leaq 2(%rsi), %rsi
+	addq %rbx, %rax
+2:	testb $4, %sil
+	je .Laligned
+	source
+	movl (%rdi), %ebx
+	dest
+	movl %ebx, (%rsi)
+	leaq 4(%rdi), %rdi
+	subq $4, %rcx
+	leaq 4(%rsi), %rsi
+	addq %rbx, %rax
+	jmp .Laligned
+
+.Lodd:
+	source
+	movb (%rdi), %bl
+	dest
+	movb %bl, (%rsi)
+	leaq 1(%rdi), %rdi
+	leaq 1(%rsi), %rsi
+	/* decrement, set MSB */
+	leaq -1(%rcx, %rcx), %rcx
+	rorq $1, %rcx
+	shll $8, %ebx
+	addq %rbx, %rax
+	jmp 1b
+
+.Lwas_odd:
+	roll $8, %eax
+	jmp .Lout
 
-	/* Exception handlers. Very simple, zeroing is done in the wrappers */
-.Lbad_source:
-	movq (%rsp), %rax
-	testq %rax, %rax
-	jz   .Lende
-	movl $-EFAULT, (%rax)
-	jmp  .Lende
-
-.Lbad_dest:
-	movq 8(%rsp), %rax
-	testq %rax, %rax
-	jz   .Lende
-	movl $-EFAULT, (%rax)
-	jmp .Lende
+	/* Exception: just return 0 */
+.Lfault:
+	xorl %eax, %eax
+	jmp  .Lout
 SYM_FUNC_END(csum_partial_copy_generic)
diff --git a/arch/x86/lib/csum-wrappers_64.c b/arch/x86/lib/csum-wrappers_64.c
index ae2fb87e2274..189344924a2b 100644
--- a/arch/x86/lib/csum-wrappers_64.c
+++ b/arch/x86/lib/csum-wrappers_64.c
@@ -21,49 +21,16 @@
  * src and dst are best aligned to 64bits.
  */
 __wsum
-csum_and_copy_from_user(const void __user *src, void *dst,
-			    int len)
+csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
-	int err = 0;
-	__wsum isum = ~0U;
+	__wsum sum;
 
 	might_sleep();
-
 	if (!user_access_begin(src, len))
 		return 0;
-
-	/*
-	 * Why 6, not 7? To handle odd addresses aligned we
-	 * would need to do considerable complications to fix the
-	 * checksum which is defined as an 16bit accumulator. The
-	 * fix alignment code is primarily for performance
-	 * compatibility with 32bit and that will handle odd
-	 * addresses slowly too.
-	 */
-	if (unlikely((unsigned long)src & 6)) {
-		while (((unsigned long)src & 6) && len >= 2) {
-			__u16 val16;
-
-			unsafe_get_user(val16, (const __u16 __user *)src, out);
-
-			*(__u16 *)dst = val16;
-			isum = (__force __wsum)add32_with_carry(
-					(__force unsigned)isum, val16);
-			src += 2;
-			dst += 2;
-			len -= 2;
-		}
-	}
-	isum = csum_partial_copy_generic((__force const void *)src,
-				dst, len, isum, &err, NULL);
-	user_access_end();
-	if (unlikely(err))
-		isum = 0;
-	return isum;
-
-out:
+	sum = csum_partial_copy_generic((__force const void *)src, dst, len);
 	user_access_end();
-	return 0;
+	return sum;
 }
 EXPORT_SYMBOL(csum_and_copy_from_user);
 
@@ -79,37 +46,16 @@ EXPORT_SYMBOL(csum_and_copy_from_user);
  * src and dst are best aligned to 64bits.
  */
 __wsum
-csum_and_copy_to_user(const void *src, void __user *dst,
-			  int len)
+csum_and_copy_to_user(const void *src, void __user *dst, int len)
 {
-	__wsum ret, isum = ~0U;
-	int err = 0;
+	__wsum sum;
 
 	might_sleep();
-
 	if (!user_access_begin(dst, len))
 		return 0;
-
-	if (unlikely((unsigned long)dst & 6)) {
-		while (((unsigned long)dst & 6) && len >= 2) {
-			__u16 val16 = *(__u16 *)src;
-
-			isum = (__force __wsum)add32_with_carry(
-					(__force unsigned)isum, val16);
-			unsafe_put_user(val16, (__u16 __user *)dst, out);
-			src += 2;
-			dst += 2;
-			len -= 2;
-		}
-	}
-
-	ret = csum_partial_copy_generic(src, (void __force *)dst,
-					len, isum, NULL, &err);
-	user_access_end();
-	return err ? 0 : ret;
-out:
+	sum = csum_partial_copy_generic(src, (void __force *)dst, len);
 	user_access_end();
-	return 0;
+	return sum;
 }
 EXPORT_SYMBOL(csum_and_copy_to_user);
 
@@ -125,7 +71,7 @@ EXPORT_SYMBOL(csum_and_copy_to_user);
 __wsum
 csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	return csum_partial_copy_generic(src, dst, len, 0, NULL, NULL);
+	return csum_partial_copy_generic(src, dst, len);
 }
 EXPORT_SYMBOL(csum_partial_copy_nocheck);
 
-- 
2.11.0

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH 18/18] ppc: propagate the calling conventions change down to csum_partial_copy_generic()
  2020-07-21 20:25 ` [PATCH 01/18] skb_copy_and_csum_bits(): don't bother with the last argument Al Viro
                     ` (15 preceding siblings ...)
  2020-07-21 20:25   ` [PATCH 17/18] amd64: switch csum_partial_copy_generic() to new calling conventions Al Viro
@ 2020-07-21 20:25   ` Al Viro
  2020-07-21 20:25     ` Al Viro
  16 siblings, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-21 20:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

... and get rid of the pointless fallback in the wrappers.  On error it used
to zero the unwritten area and calculate the csum of the entire thing.  Not
wanting to do it in the assembler part had been very reasonable; doing that in
the first place, OTOH...  In case of an error the caller discards the data
we'd copied, along with whatever checksum it might've had.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/powerpc/include/asm/checksum.h  |  6 +--
 arch/powerpc/lib/checksum_32.S       | 74 +++++++++++++-----------------------
 arch/powerpc/lib/checksum_64.S       | 37 ++++++------------
 arch/powerpc/lib/checksum_wrappers.c | 32 +++-------------
 4 files changed, 46 insertions(+), 103 deletions(-)

diff --git a/arch/powerpc/include/asm/checksum.h b/arch/powerpc/include/asm/checksum.h
index 97343e1a7d1c..fd0e4d1356a2 100644
--- a/arch/powerpc/include/asm/checksum.h
+++ b/arch/powerpc/include/asm/checksum.h
@@ -18,9 +18,7 @@
  * Like csum_partial, this must be called with even lengths,
  * except for the last fragment.
  */
-extern __wsum csum_partial_copy_generic(const void *src, void *dst,
-					      int len, __wsum sum,
-					      int *src_err, int *dst_err);
+extern __wsum csum_partial_copy_generic(const void *src, void *dst, int len);
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 extern __wsum csum_and_copy_from_user(const void __user *src, void *dst,
@@ -30,7 +28,7 @@ extern __wsum csum_and_copy_to_user(const void *src, void __user *dst,
 				    int len);
 
 #define csum_partial_copy_nocheck(src, dst, len)   \
-        csum_partial_copy_generic((src), (dst), (len), 0, NULL, NULL)
+        csum_partial_copy_generic((src), (dst), (len))
 
 
 /*
diff --git a/arch/powerpc/lib/checksum_32.S b/arch/powerpc/lib/checksum_32.S
index ecd150dc3ed9..ec5cd2dede35 100644
--- a/arch/powerpc/lib/checksum_32.S
+++ b/arch/powerpc/lib/checksum_32.S
@@ -78,12 +78,10 @@ EXPORT_SYMBOL(__csum_partial)
 
 /*
  * Computes the checksum of a memory block at src, length len,
- * and adds in "sum" (32-bit), while copying the block to dst.
- * If an access exception occurs on src or dst, it stores -EFAULT
- * to *src_err or *dst_err respectively, and (for an error on
- * src) zeroes the rest of dst.
+ * and adds in 0xffffffff, while copying the block to dst.
+ * If an access exception occurs it returns zero.
  *
- * csum_partial_copy_generic(src, dst, len, sum, src_err, dst_err)
+ * csum_partial_copy_generic(src, dst, len)
  */
 #define CSUM_COPY_16_BYTES_WITHEX(n)	\
 8 ## n ## 0:			\
@@ -108,14 +106,14 @@ EXPORT_SYMBOL(__csum_partial)
 	adde	r12,r12,r10
 
 #define CSUM_COPY_16_BYTES_EXCODE(n)		\
-	EX_TABLE(8 ## n ## 0b, src_error);	\
-	EX_TABLE(8 ## n ## 1b, src_error);	\
-	EX_TABLE(8 ## n ## 2b, src_error);	\
-	EX_TABLE(8 ## n ## 3b, src_error);	\
-	EX_TABLE(8 ## n ## 4b, dst_error);	\
-	EX_TABLE(8 ## n ## 5b, dst_error);	\
-	EX_TABLE(8 ## n ## 6b, dst_error);	\
-	EX_TABLE(8 ## n ## 7b, dst_error);
+	EX_TABLE(8 ## n ## 0b, fault);	\
+	EX_TABLE(8 ## n ## 1b, fault);	\
+	EX_TABLE(8 ## n ## 2b, fault);	\
+	EX_TABLE(8 ## n ## 3b, fault);	\
+	EX_TABLE(8 ## n ## 4b, fault);	\
+	EX_TABLE(8 ## n ## 5b, fault);	\
+	EX_TABLE(8 ## n ## 6b, fault);	\
+	EX_TABLE(8 ## n ## 7b, fault);
 
 	.text
 	.stabs	"arch/powerpc/lib/",N_SO,0,0,0f
@@ -127,11 +125,8 @@ LG_CACHELINE_BYTES = L1_CACHE_SHIFT
 CACHELINE_MASK = (L1_CACHE_BYTES-1)
 
 _GLOBAL(csum_partial_copy_generic)
-	stwu	r1,-16(r1)
-	stw	r7,12(r1)
-	stw	r8,8(r1)
-
-	addic	r12,r6,0
+	li	r12,-1
+	addic	r0,r0,0			/* clear carry */
 	addi	r6,r4,-4
 	neg	r0,r4
 	addi	r4,r3,-4
@@ -246,34 +241,19 @@ _GLOBAL(csum_partial_copy_generic)
 	rlwinm	r3,r3,8,0,31	/* odd destination address: rotate one byte */
 	blr
 
-/* read fault */
-src_error:
-	lwz	r7,12(r1)
-	addi	r1,r1,16
-	cmpwi	cr0,r7,0
-	beqlr
-	li	r0,-EFAULT
-	stw	r0,0(r7)
-	blr
-/* write fault */
-dst_error:
-	lwz	r8,8(r1)
-	addi	r1,r1,16
-	cmpwi	cr0,r8,0
-	beqlr
-	li	r0,-EFAULT
-	stw	r0,0(r8)
+fault:
+	li	r3,0
 	blr
 
-	EX_TABLE(70b, src_error);
-	EX_TABLE(71b, dst_error);
-	EX_TABLE(72b, src_error);
-	EX_TABLE(73b, dst_error);
-	EX_TABLE(54b, dst_error);
+	EX_TABLE(70b, fault);
+	EX_TABLE(71b, fault);
+	EX_TABLE(72b, fault);
+	EX_TABLE(73b, fault);
+	EX_TABLE(54b, fault);
 
 /*
  * this stuff handles faults in the cacheline loop and branches to either
- * src_error (if in read part) or dst_error (if in write part)
+ * fault (if in read part) or fault (if in write part)
  */
 	CSUM_COPY_16_BYTES_EXCODE(0)
 #if L1_CACHE_BYTES >= 32
@@ -290,12 +270,12 @@ dst_error:
 #endif
 #endif
 
-	EX_TABLE(30b, src_error);
-	EX_TABLE(31b, dst_error);
-	EX_TABLE(40b, src_error);
-	EX_TABLE(41b, dst_error);
-	EX_TABLE(50b, src_error);
-	EX_TABLE(51b, dst_error);
+	EX_TABLE(30b, fault);
+	EX_TABLE(31b, fault);
+	EX_TABLE(40b, fault);
+	EX_TABLE(41b, fault);
+	EX_TABLE(50b, fault);
+	EX_TABLE(51b, fault);
 
 EXPORT_SYMBOL(csum_partial_copy_generic)
 
diff --git a/arch/powerpc/lib/checksum_64.S b/arch/powerpc/lib/checksum_64.S
index 514978f908d4..98ff51bd2f7d 100644
--- a/arch/powerpc/lib/checksum_64.S
+++ b/arch/powerpc/lib/checksum_64.S
@@ -182,34 +182,33 @@ EXPORT_SYMBOL(__csum_partial)
 
 	.macro srcnr
 100:
-	EX_TABLE(100b,.Lsrc_error_nr)
+	EX_TABLE(100b,.Lerror_nr)
 	.endm
 
 	.macro source
 150:
-	EX_TABLE(150b,.Lsrc_error)
+	EX_TABLE(150b,.Lerror)
 	.endm
 
 	.macro dstnr
 200:
-	EX_TABLE(200b,.Ldest_error_nr)
+	EX_TABLE(200b,.Lerror_nr)
 	.endm
 
 	.macro dest
 250:
-	EX_TABLE(250b,.Ldest_error)
+	EX_TABLE(250b,.Lerror)
 	.endm
 
 /*
  * Computes the checksum of a memory block at src, length len,
- * and adds in "sum" (32-bit), while copying the block to dst.
- * If an access exception occurs on src or dst, it stores -EFAULT
- * to *src_err or *dst_err respectively. The caller must take any action
- * required in this case (zeroing memory, recalculating partial checksum etc).
+ * and adds in 0xffffffff (32-bit), while copying the block to dst.
+ * If an access exception occurs, it returns 0.
  *
- * csum_partial_copy_generic(r3=src, r4=dst, r5=len, r6=sum, r7=src_err, r8=dst_err)
+ * csum_partial_copy_generic(r3=src, r4=dst, r5=len)
  */
 _GLOBAL(csum_partial_copy_generic)
+	li	r6,-1
 	addic	r0,r6,0			/* clear carry */
 
 	srdi.	r6,r5,3			/* less than 8 bytes? */
@@ -401,29 +400,15 @@ dstnr;	stb	r6,0(r4)
 	srdi	r3,r3,32
 	blr
 
-.Lsrc_error:
+.Lerror:
 	ld	r14,STK_REG(R14)(r1)
 	ld	r15,STK_REG(R15)(r1)
 	ld	r16,STK_REG(R16)(r1)
 	addi	r1,r1,STACKFRAMESIZE
-.Lsrc_error_nr:
-	cmpdi	0,r7,0
-	beqlr
-	li	r6,-EFAULT
-	stw	r6,0(r7)
+.Lerror_nr:
+	li	r3,0
 	blr
 
-.Ldest_error:
-	ld	r14,STK_REG(R14)(r1)
-	ld	r15,STK_REG(R15)(r1)
-	ld	r16,STK_REG(R16)(r1)
-	addi	r1,r1,STACKFRAMESIZE
-.Ldest_error_nr:
-	cmpdi	0,r8,0
-	beqlr
-	li	r6,-EFAULT
-	stw	r6,0(r8)
-	blr
 EXPORT_SYMBOL(csum_partial_copy_generic)
 
 /*
diff --git a/arch/powerpc/lib/checksum_wrappers.c b/arch/powerpc/lib/checksum_wrappers.c
index b1faa82dd8af..b895166afc82 100644
--- a/arch/powerpc/lib/checksum_wrappers.c
+++ b/arch/powerpc/lib/checksum_wrappers.c
@@ -14,8 +14,7 @@
 __wsum csum_and_copy_from_user(const void __user *src, void *dst,
 			       int len)
 {
-	unsigned int csum;
-	int err = 0;
+	__wsum csum;
 
 	might_sleep();
 
@@ -24,27 +23,16 @@ __wsum csum_and_copy_from_user(const void __user *src, void *dst,
 
 	allow_read_from_user(src, len);
 
-	csum = csum_partial_copy_generic((void __force *)src, dst,
-					 len, ~0U, &err, NULL);
-
-	if (unlikely(err)) {
-		int missing = __copy_from_user(dst, src, len);
-
-		if (missing)
-			csum = 0;
-		else
-			csum = csum_partial(dst, len, ~0U);
-	}
+	csum = csum_partial_copy_generic((void __force *)src, dst, len);
 
 	prevent_read_from_user(src, len);
-	return (__force __wsum)csum;
+	return csum;
 }
 EXPORT_SYMBOL(csum_and_copy_from_user);
 
 __wsum csum_and_copy_to_user(const void *src, void __user *dst, int len)
 {
-	unsigned int csum;
-	int err = 0;
+	__wsum csum;
 
 	might_sleep();
 	if (unlikely(!access_ok(dst, len)))
@@ -52,17 +40,9 @@ __wsum csum_and_copy_to_user(const void *src, void __user *dst, int len)
 
 	allow_write_to_user(dst, len);
 
-	csum = csum_partial_copy_generic(src, (void __force *)dst,
-					 len, ~0U, NULL, &err);
-
-	if (unlikely(err)) {
-		csum = csum_partial(src, len, ~0U);
-
-		if (copy_to_user(dst, src, len))
-			csum = 0;
-	}
+	csum = csum_partial_copy_generic(src, (void __force *)dst, len);
 
 	prevent_write_to_user(dst, len);
-	return (__force __wsum)csum;
+	return csum;
 }
 EXPORT_SYMBOL(csum_and_copy_to_user);
-- 
2.11.0

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH 04/18] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
  2020-07-21 20:25   ` [PATCH 04/18] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum Al Viro
  2020-07-21 20:25     ` Al Viro
@ 2020-07-21 20:55     ` Linus Torvalds
  2020-07-21 20:58       ` Linus Torvalds
  2020-07-22  9:45       ` David Laight
  2020-07-22  9:27     ` David Laight
  2 siblings, 2 replies; 102+ messages in thread
From: Linus Torvalds @ 2020-07-21 20:55 UTC (permalink / raw)
  To: Al Viro; +Cc: Linux Kernel Mailing List, linux-arch

On Tue, Jul 21, 2020 at 1:25 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Preparation for the change of calling conventions; right now all
> callers pass 0 as initial sum.  Passing 0xffffffff instead yields
> the values comparable mod 0xffff and guarantees that 0 will not
> be returned on success.

This seems dangerous to me.

Maybe some implementation depends on the fact that they actually do
the csum 16 bits at a time, and never see an overflow in "int",
because they keep folding things.

You now break that assumption, and give it an initial value that the
csum code itself would never generate, and wouldn't handle right.

But I didn't check. Maybe we don't have anything that stupid in the kernel.

              Linus

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH 04/18] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
  2020-07-21 20:55     ` Linus Torvalds
@ 2020-07-21 20:58       ` Linus Torvalds
  2020-07-21 21:11         ` Al Viro
  2020-07-22  9:45       ` David Laight
  1 sibling, 1 reply; 102+ messages in thread
From: Linus Torvalds @ 2020-07-21 20:58 UTC (permalink / raw)
  To: Al Viro; +Cc: Linux Kernel Mailing List, linux-arch

On Tue, Jul 21, 2020 at 1:55 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> This seems dangerous to me.
>
> Maybe some implementation depends on the fact that they actually do
> the csum 16 bits at a time, and never see an overflow in "int",
> because they keep folding things.
>
> You now break that assumption, and give it an initial value that the
> csum code itself would never generate, and wouldn't handle right.
>
> But I didn't check. Maybe we don't have anything that stupid in the kernel.

I take it back. The very first place I looked seemed to do exactly that.

See "do_csum()" in the kernel. It doesn't handle carry for any of the
usual cases, exactly because it knows it doesn't need to.

Ok, so do_csum() doesn't take that initial value, but it's very much
an example of the kind of algorithm I was thinking of: it does do
things 32 bits at a time and handles the carry bit in that inner loop,
but internally it knows that the values are limited in other places,
and doesn't need to handle carry everywhere.
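
To make the concern concrete, here is a minimal user-space sketch (not
kernel code; add32_carry(), fold32() and sum_words() are names made up for
this illustration).  It assumes an implementation that feeds the carry back
in on every 32-bit addition.  Under that assumption a seed of 0xffffffff
gives a result comparable mod 0xffff with a seed of 0, and the 32-bit value
can never come out as 0.  An implementation that drops the carry somewhere
would not have that property.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit add with end-around carry: the carry out is added back in. */
static uint32_t add32_carry(uint32_t a, uint32_t b)
{
	uint64_t s = (uint64_t)a + b;

	return (uint32_t)s + (uint32_t)(s >> 32);
}

/* Fold a 32-bit partial sum to 16 bits; two rounds are always enough. */
static uint16_t fold32(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Sum the little-endian 16-bit words of buf, starting from seed. */
static uint32_t sum_words(const uint8_t *buf, size_t len, uint32_t seed)
{
	uint32_t sum = seed;
	size_t i;

	assert((len & 1) == 0);	/* this sketch only handles even lengths */
	for (i = 0; i < len; i += 2)
		sum = add32_carry(sum, buf[i] | ((uint32_t)buf[i + 1] << 8));
	return sum;
}
```

With an all-zero buffer a seed of 0 yields 0, while a seed of 0xffffffff
yields 0xffffffff; those fold to 0 and 0xffff, which are equal mod 0xffff.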

                Linus

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH 04/18] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
  2020-07-21 20:58       ` Linus Torvalds
@ 2020-07-21 21:11         ` Al Viro
  2020-07-21 21:16           ` Linus Torvalds
  2020-07-25 17:54           ` Al Viro
  0 siblings, 2 replies; 102+ messages in thread
From: Al Viro @ 2020-07-21 21:11 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux Kernel Mailing List, linux-arch

On Tue, Jul 21, 2020 at 01:58:47PM -0700, Linus Torvalds wrote:
> On Tue, Jul 21, 2020 at 1:55 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > This seems dangerous to me.
> >
> > Maybe some implementation depends on the fact that they actually do
> > the csum 16 bits at a time, and never see an overflow in "int",
> > because they keep folding things.
> >
> > You now break that assumption, and give it an initial value that the
> > csum code itself would never generate, and wouldn't handle right.
> >
> > But I didn't check. Maybe we don't have anything that stupid in the kernel.

I did.

> I take it back. The very first place I looked seemed to do exactly that.
> 
> See "do_csum()" in the kernel. It doesn't handle carry for any of the
> usual cases, exactly because it knows it doesn't need to.
> 
> Ok, so do_csum() doesn't take that initial value, but it's very much
> an example of the kind of algorithm I was thinking of: it does do
> things 32 bits at a time and handles the carry bit in that inner loop,
> but internally it knows that the values are limited in other places,
> and doesn't need to handle carry everywhere.

Theoretically - sure.  I can post the full analysis of that stuff (starting
with the proof that all instances of csum_partial() are OK in that respect,
which takes care of the default instances, then instance-by-instance
analysis of the rest); will need to collate the pieces, remove the actionable
obscenities, etc., but I have done that analysis.  Made for a rather unpleasant
couple of weeks... ;-/

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH 04/18] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
  2020-07-21 21:11         ` Al Viro
@ 2020-07-21 21:16           ` Linus Torvalds
  2020-07-21 21:16             ` Linus Torvalds
  2020-07-25 17:54           ` Al Viro
  1 sibling, 1 reply; 102+ messages in thread
From: Linus Torvalds @ 2020-07-21 21:16 UTC (permalink / raw)
  To: Al Viro; +Cc: Linux Kernel Mailing List, linux-arch

On Tue, Jul 21, 2020 at 2:11 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> > > But I didn't check. Maybe we don't have anything that stupid in the kernel.
>
> I did.

So then the commit message really should have said so, I feel. That
would have avoided the whole worry, and made it clear that it's not an
issue.

                Linus

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH 11/18] sparc32: propagate the calling conventions change down to __csum_partial_copy_sparc_generic()
  2020-07-21 20:25   ` [PATCH 11/18] sparc32: propagate the calling conventions change down to __csum_partial_copy_sparc_generic() Al Viro
  2020-07-21 20:25     ` Al Viro
@ 2020-07-22  1:20     ` David Miller
  1 sibling, 0 replies; 102+ messages in thread
From: David Miller @ 2020-07-22  1:20 UTC (permalink / raw)
  To: viro; +Cc: torvalds, linux-kernel, linux-arch

From: Al Viro <viro@ZenIV.linux.org.uk>
Date: Tue, 21 Jul 2020 21:25:42 +0100

> From: Al Viro <viro@zeniv.linux.org.uk>
> 
> ... and get rid of zeroing the target, etc. on fault.
> All exception handlers merge into one; moreover, since we are not
> calling lookup_fault() anymore, we don't need the magic with passing
> arguments for it from the page fault handler.
> 
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Acked-by: David S. Miller <davem@davemloft.net>

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH 16/18] sparc64: propagate the calling convention changes down to __csum_partial_copy_...()
  2020-07-21 20:25   ` [PATCH 16/18] sparc64: propagate the calling convention changes down to __csum_partial_copy_...() Al Viro
  2020-07-21 20:25     ` Al Viro
@ 2020-07-22  1:21     ` David Miller
  1 sibling, 0 replies; 102+ messages in thread
From: David Miller @ 2020-07-22  1:21 UTC (permalink / raw)
  To: viro; +Cc: torvalds, linux-kernel, linux-arch

From: Al Viro <viro@ZenIV.linux.org.uk>
Date: Tue, 21 Jul 2020 21:25:47 +0100

> From: Al Viro <viro@zeniv.linux.org.uk>
> 
> ... and rename them into csum_and_copy_...() - the wrappers become pointless.
> 
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Acked-by: David S. Miller <davem@davemloft.net>

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH 15/18] xtensa: propagate the calling conventions change down into csum_partial_copy_generic()
  2020-07-21 20:25   ` [PATCH 15/18] xtensa: propagate the calling conventions change down into csum_partial_copy_generic() Al Viro
@ 2020-07-22  8:56     ` Max Filippov
  0 siblings, 0 replies; 102+ messages in thread
From: Max Filippov @ 2020-07-22  8:56 UTC (permalink / raw)
  To: Al Viro; +Cc: Linus Torvalds, LKML, Linux-Arch

On Tue, Jul 21, 2020 at 1:27 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> From: Al Viro <viro@zeniv.linux.org.uk>
>
> turn the exception handlers into returning 0.
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
>  arch/xtensa/include/asm/checksum.h | 20 +++---------
>  arch/xtensa/lib/checksum.S         | 67 +++++++++-----------------------------
>  2 files changed, 19 insertions(+), 68 deletions(-)

Reviewed-by: Max Filippov <jcmvbkbc@gmail.com>
Tested-by: Max Filippov <jcmvbkbc@gmail.com>

-- 
Thanks.
-- Max

^ permalink raw reply	[flat|nested] 102+ messages in thread

* RE: [PATCH 04/18] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
  2020-07-21 20:25   ` [PATCH 04/18] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum Al Viro
  2020-07-21 20:25     ` Al Viro
  2020-07-21 20:55     ` Linus Torvalds
@ 2020-07-22  9:27     ` David Laight
  2020-07-22 14:42       ` Al Viro
  2 siblings, 1 reply; 102+ messages in thread
From: David Laight @ 2020-07-22  9:27 UTC (permalink / raw)
  To: 'Al Viro', Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro
> Sent: 21 July 2020 21:26
> Preparation for the change of calling conventions; right now all
> callers pass 0 as initial sum.  Passing 0xffffffff instead yields
> the values comparable mod 0xffff and guarantees that 0 will not
> be returned on success.
> 
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
>  lib/iov_iter.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/iov_iter.c b/lib/iov_iter.c
> index 7405922caaec..d5b7e204fea6 100644
> --- a/lib/iov_iter.c
> +++ b/lib/iov_iter.c
> @@ -1451,7 +1451,7 @@ size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
>  		int err = 0;
>  		next = csum_and_copy_from_user(v.iov_base,
>  					       (to += v.iov_len) - v.iov_len,
> -					       v.iov_len, 0, &err);
> +					       v.iov_len, ~0U, &err);
>  		if (!err) {
>  			sum = csum_block_add(sum, next, off);
>  			off += v.iov_len;

Can't you remove the csum_block_add() by passing the
old 'sum' in instead of the ~0U ?
You'll need to keep track of whether the buffer fragment
is odd/even aligned.
After an odd length fragment a bswap32() or 8 bit rotate will
fix things (and maybe one right at the end).
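
A rough user-space sketch of why the odd/even tracking matters (assumptions:
little-endian 16-bit words, buffers under 64k, and helper names csum16(),
swab16_sketch() and add_fragment() invented here, not taken from the
kernel).  A fragment that starts at an odd byte offset has its bytes in the
opposite halves of each 16-bit word, so its folded sum needs a byte swap
before it is added to the running sum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit partial sum to 16 bits. */
static uint16_t fold32(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Swap the bytes of a folded sum; mod 0xffff this multiplies by 256. */
static uint16_t swab16_sketch(uint16_t x)
{
	return (uint16_t)((x << 8) | (x >> 8));
}

/* Folded sum of buf taken as little-endian 16-bit words, zero-padded. */
static uint16_t csum16(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	assert(len <= 0xffff);	/* keeps the 32-bit accumulator from overflowing */
	for (i = 0; i < len; i++)
		sum += (i & 1) ? (uint32_t)buf[i] << 8 : buf[i];
	return fold32(sum);
}

/* Add the sum of a fragment that began at byte 'offset' of the whole. */
static uint16_t add_fragment(uint16_t sum, uint16_t next, size_t offset)
{
	if (offset & 1)
		next = swab16_sketch(next);
	return fold32((uint32_t)sum + next);
}
```

For the bytes 01 02 03 04 05 split after the third byte, the two fragment
sums are 0x0204 and 0x0504; adding them with the swap gives 0x0609, the same
as summing all five bytes in one go, while adding them without it gives
0x0708.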

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 102+ messages in thread

* RE: [PATCH 04/18] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
  2020-07-21 20:55     ` Linus Torvalds
  2020-07-21 20:58       ` Linus Torvalds
@ 2020-07-22  9:45       ` David Laight
  1 sibling, 0 replies; 102+ messages in thread
From: David Laight @ 2020-07-22  9:45 UTC (permalink / raw)
  To: 'Linus Torvalds', Al Viro; +Cc: Linux Kernel Mailing List, linux-arch

From: Linus Torvalds
> Sent: 21 July 2020 21:55
> On Tue, Jul 21, 2020 at 1:25 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > Preparation for the change of calling conventions; right now all
> > callers pass 0 as initial sum.  Passing 0xffffffff instead yields
> > the values comparable mod 0xffff and guarantees that 0 will not
> > be returned on success.
> 
> This seems dangerous to me.
> 
> Maybe some implementation depends on the fact that they actually do
> the csum 16 bits at a time, and never see an overflow in "int",
> because they keep folding things.
> 
> You now break that assumption, and give it an initial value that the
> csum code itself would never generate, and wouldn't handle right.
> 
> But I didn't check. Maybe we don't have anything that stupid in the kernel.

It isn't necessarily stupid :-)
A 64bit sum can be reduced to 16bits using shifts and adds
(as is usually done) or using 'sum % 0xffff'.
Provided the compiler uses 'multiply by reciprocal' the code
isn't that bad - it might even be difficult to say which is faster.
However that makes the output domain 0..fffe not 1..ffff.

The checksum generation code really needs to know which is used.
So it is best never to use the % version.
If the sum is known to be 1..0xffff then after inversion it is
0..fffe but the required domain is 1..ffff.
This can be fixed by adding 1 - provided a compensating 1 is
added in before the inversion.
The easy place to do this is to feed 1 (not 0 or ~0) into the
first checksum block.
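
A minimal sketch of the two reductions, to make the difference in
output domain concrete.  reduce_fold(), reduce_mod() and
check_reductions() are hypothetical helpers, not kernel functions:

```c
#include <assert.h>
#include <stdint.h>

/* Reduce by shifts and adds; a nonzero input yields 1..0xffff. */
static uint16_t reduce_fold(uint64_t sum)
{
	sum = (sum & 0xffffffff) + (sum >> 32);
	sum = (sum & 0xffffffff) + (sum >> 32);
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Reduce with a modulo; the result is always in 0..0xfffe. */
static uint16_t reduce_mod(uint64_t sum)
{
	return (uint16_t)(sum % 0xffff);
}

/* The two agree modulo 0xffff but represent "zero" differently. */
static void check_reductions(void)
{
	assert(reduce_fold(0x12345678) == 0x68ac);
	assert(reduce_mod(0x12345678) == 0x68ac);
	assert(reduce_fold(0xffff) == 0xffff);
	assert(reduce_mod(0xffff) == 0);
	assert(reduce_fold(0x1fffe) == 0xffff);
	assert(reduce_mod(0x1fffe) == 0);
}
```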

	David


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH 04/18] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
  2020-07-22  9:27     ` David Laight
@ 2020-07-22 14:42       ` Al Viro
  2020-07-22 15:22         ` David Laight
  0 siblings, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-22 14:42 UTC (permalink / raw)
  To: David Laight; +Cc: Linus Torvalds, linux-kernel, linux-arch

On Wed, Jul 22, 2020 at 09:27:32AM +0000, David Laight wrote:
> From: Al Viro
> > Sent: 21 July 2020 21:26
> > Preparation for the change of calling conventions; right now all
> > callers pass 0 as initial sum.  Passing 0xffffffff instead yields
> > the values comparable mod 0xffff and guarantees that 0 will not
> > be returned on success.
> > 
> > Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> > ---
> >  lib/iov_iter.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> > 
> > diff --git a/lib/iov_iter.c b/lib/iov_iter.c
> > index 7405922caaec..d5b7e204fea6 100644
> > --- a/lib/iov_iter.c
> > +++ b/lib/iov_iter.c
> > @@ -1451,7 +1451,7 @@ size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
> >  		int err = 0;
> >  		next = csum_and_copy_from_user(v.iov_base,
> >  					       (to += v.iov_len) - v.iov_len,
> > -					       v.iov_len, 0, &err);
> > +					       v.iov_len, ~0U, &err);
> >  		if (!err) {
> >  			sum = csum_block_add(sum, next, off);
> >  			off += v.iov_len;
> 
> Can't you remove the csum_block_add() by passing the
> old 'sum' in instead of the ~0U ?
> You'll need to keep track of whether the buffer fragment
> is odd/even aligned.
> After an odd length fragment a bswap32() or 8 bit rotate will
> fix things (and maybe one right at the end).

And the benefit of that would be...?  It wouldn't be any simpler,
it almost certainly would not even be a valid microoptimization
(nevermind that this is an arch-independent code)...

^ permalink raw reply	[flat|nested] 102+ messages in thread

* RE: [PATCH 04/18] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
  2020-07-22 14:42       ` Al Viro
@ 2020-07-22 15:22         ` David Laight
  2020-07-22 15:54           ` Al Viro
  0 siblings, 1 reply; 102+ messages in thread
From: David Laight @ 2020-07-22 15:22 UTC (permalink / raw)
  To: 'Al Viro'; +Cc: Linus Torvalds, linux-kernel, linux-arch

From: Al Viro
> Sent: 22 July 2020 15:42
> 
> On Wed, Jul 22, 2020 at 09:27:32AM +0000, David Laight wrote:
> > From: Al Viro
> > > Sent: 21 July 2020 21:26
> > > Preparation for the change of calling conventions; right now all
> > > callers pass 0 as initial sum.  Passing 0xffffffff instead yields
> > > the values comparable mod 0xffff and guarantees that 0 will not
> > > be returned on success.
> > >
> > > Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> > > ---
> > >  lib/iov_iter.c | 6 +++---
> > >  1 file changed, 3 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/lib/iov_iter.c b/lib/iov_iter.c
> > > index 7405922caaec..d5b7e204fea6 100644
> > > --- a/lib/iov_iter.c
> > > +++ b/lib/iov_iter.c
> > > @@ -1451,7 +1451,7 @@ size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
> > >  		int err = 0;
> > >  		next = csum_and_copy_from_user(v.iov_base,
> > >  					       (to += v.iov_len) - v.iov_len,
> > > -					       v.iov_len, 0, &err);
> > > +					       v.iov_len, ~0U, &err);
> > >  		if (!err) {
> > >  			sum = csum_block_add(sum, next, off);
> > >  			off += v.iov_len;
> >
> > Can't you remove the csum_block_add() by passing the
> > old 'sum' in instead of the ~0U ?
> > You'll need to keep track of whether the buffer fragment
> > is odd/even aligned.
> > After an odd length fragment a bswap32() or 8 bit rotate will
> > fix things (and maybe one right at the end).
> 
> And the benefit of that would be...?  It wouldn't be any simpler,
> it almost certainly would not even be a valid microoptimization
> (nevermind that this is an arch-independent code)...

It ought to give a minor improvement because it saves the extra
csum_fold() when the checksum from a buffer is added to the
previous total.

On 64bit systems there are even advantages in passing in a 64bit
value - so the caller can add many 32bit values together.
If nothing else it lets you use a '<< 8' if the previous fragment
had an odd length.
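
A minimal sketch of the 64bit accumulation, assuming each fragment is
summed on its own into 32 bits and the caller only folds once at the
end.  frag_sum(), fold64() and csum_frags64() are hypothetical names,
and the byte-oriented fragment sum stands in for the per-arch code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum one fragment as big-endian 16-bit words, end-around carry. */
static uint32_t frag_sum(const unsigned char *p, size_t len)
{
	uint32_t sum = 0, w;
	size_t i;

	for (i = 0; i < len; i += 2) {
		w = (uint32_t)p[i] << 8;
		if (i + 1 < len)
			w |= p[i + 1];
		sum += w;
		sum += sum < w;
	}
	return sum;
}

/* Fold a 64-bit accumulator down to 16 bits. */
static uint16_t fold64(uint64_t sum)
{
	sum = (sum & 0xffffffff) + (sum >> 32);
	sum = (sum & 0xffffffff) + (sum >> 32);
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Add 32-bit fragment sums into 64 bits; '<< 8' at odd offsets. */
static uint64_t csum_frags64(const unsigned char *p, const size_t *lens,
			     size_t nfrags)
{
	uint64_t acc = 0;
	size_t off = 0, i;

	assert(p && lens);
	for (i = 0; i < nfrags; i++) {
		uint64_t part = frag_sum(p + off, lens[i]);

		acc += (off & 1) ? part << 8 : part;
		off += lens[i];
	}
	return acc;
}
```

The shift works because multiplying by 256 is a byte swap modulo
0xffff; the accumulator only overflows after a very large number of
fragments, which this sketch does not guard against.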

	David


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH 04/18] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
  2020-07-22 15:22         ` David Laight
@ 2020-07-22 15:54           ` Al Viro
  2020-07-22 16:17             ` David Laight
  0 siblings, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-22 15:54 UTC (permalink / raw)
  To: David Laight; +Cc: Linus Torvalds, linux-kernel, linux-arch

On Wed, Jul 22, 2020 at 03:22:45PM +0000, David Laight wrote:

> > And the benefit of that would be...?  It wouldn't be any simpler,
> > it almost certainly would not even be a valid microoptimization
> > (nevermind that this is an arch-independent code)...
> 
> It ought to give a minor improvement because it saves the extra
> csum_fold() when the checksum from a buffer is added to the
> previous total.
> 

Sigh...  _WHAT_ csum_fold()?

static inline __wsum
csum_block_add(__wsum csum, __wsum csum2, int offset)
{
        u32 sum = (__force u32)csum2;

        /* rotate sum to align it with a 16b boundary */
        if (offset & 1)
                sum = ror32(sum, 8);

        return csum_add(csum, (__force __wsum)sum);
}

David, do you *ever* bother to RTFS?  I mean, competent supercilious twits
are annoying, but at least with those you can generally assume that what
they say makes sense and has some relation to reality.  You, OTOH, keep
spewing utter bollocks, without ever lowering yourself to checking if your
guesses have anything to do with the reality.  With supercilious twit part
proudly on the display - you do speak with confidence, and the way you
dispense the oh-so-valuable advice to everyone around...

^ permalink raw reply	[flat|nested] 102+ messages in thread

* RE: [PATCH 04/18] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
  2020-07-22 15:54           ` Al Viro
@ 2020-07-22 16:17             ` David Laight
  2020-07-22 17:39               ` Al Viro
  0 siblings, 1 reply; 102+ messages in thread
From: David Laight @ 2020-07-22 16:17 UTC (permalink / raw)
  To: 'Al Viro'; +Cc: Linus Torvalds, linux-kernel, linux-arch

From: Al Viro
> Sent: 22 July 2020 16:55
> To: David Laight <David.Laight@ACULAB.COM>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>; linux-kernel@vger.kernel.org; linux-arch@vger.kernel.org
> Subject: Re: [PATCH 04/18] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
> 
> On Wed, Jul 22, 2020 at 03:22:45PM +0000, David Laight wrote:
> 
> > > And the benefit of that would be...?  It wouldn't be any simpler,
> > > it almost certainly would not even be a valid microoptimization
> > > (nevermind that this is an arch-independent code)...
> >
> > It ought to give a minor improvement because it saves the extra
> > csum_fold() when the checksum from a buffer is added to the
> > previous total.
> >
> 
> Sigh...  _WHAT_ csum_fold()?
> 
> static inline __wsum
> csum_block_add(__wsum csum, __wsum csum2, int offset)
> {
>         u32 sum = (__force u32)csum2;
> 
>         /* rotate sum to align it with a 16b boundary */
>         if (offset & 1)
>                 sum = ror32(sum, 8);
> 
>         return csum_add(csum, (__force __wsum)sum);
> }
> 
> David, do you *ever* bother to RTFS?  I mean, competent supercilious twits
> are annoying, but at least with those you can generally assume that what
> they say makes sense and has some relation to reality.  You, OTOH, keep
> spewing utter bollocks, without ever lowering yourself to checking if your
> guesses have anything to do with the reality.  With supercilious twit part
> proudly on the display - you do speak with confidence, and the way you
> dispense the oh-so-valuable advice to everyone around...

Yes, I do look at the code.
I've actually spent a lot of time looking at the x86 checksum code.
I've posted a patch for a version that is about twice as fast as the
current one on a large range of x86 cpus.

Possibly I meant the 32bit reduction inside csum_add()
rather than what csum_fold() does.

Having worked on the internals of SYSV, NetBSD and Linux I probably
forget the exact names for a few things.
The brain can only hold so much information.

	David


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH 04/18] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
  2020-07-22 16:17             ` David Laight
@ 2020-07-22 17:39               ` Al Viro
  2020-07-23  8:29                 ` David Laight
  2020-07-23 13:54                 ` David Laight
  0 siblings, 2 replies; 102+ messages in thread
From: Al Viro @ 2020-07-22 17:39 UTC (permalink / raw)
  To: David Laight; +Cc: Linus Torvalds, linux-kernel, linux-arch

On Wed, Jul 22, 2020 at 04:17:02PM +0000, David Laight wrote:
> > David, do you *ever* bother to RTFS?  I mean, competent supercilious twits
> > are annoying, but at least with those you can generally assume that what
> > they say makes sense and has some relation to reality.  You, OTOH, keep
> > spewing utter bollocks, without ever lowering yourself to checking if your
> > guesses have anything to do with the reality.  With supercilious twit part
> > proudly on the display - you do speak with confidence, and the way you
> > dispense the oh-so-valuable advice to everyone around...
> 
> Yes, I do look at the code.
> I've actually spent a lot of time looking at the x86 checksum code.
> I've posted a patch for a version that is about twice as fast as the
> current one on a large range of x86 cpus.
> 
> Possibly I meant the 32bit reduction inside csum_add()
> rather than what csum_fold() does.

Really?
static inline unsigned add32_with_carry(unsigned a, unsigned b)
{  
        asm("addl %2,%0\n\t"
            "adcl $0,%0"
            : "=r" (a)
            : "0" (a), "rm" (b));
        return a;
}
static inline __wsum csum_add(__wsum csum, __wsum addend)
{
        return (__force __wsum)add32_with_carry((__force unsigned)csum,
                                                (__force unsigned)addend);
}

I would love to see your patch, anyway, along with the testcases and performance
comparison.

> Having worked on the internals of SYSV, NetBSD and Linux I probably
> forget the exact names for a few things.

That's usually dealt with by a few minutes with grep and vi...

> The brain can only hold so much information.

Bravo.  "I can't be arsed to check anything" spun into the claim of one's
superior experience.

What it means in practice is that your output is so much garbage that _might_
be untangled into something meaningful if the reader manages to guess the
substitutions.  Provided that the reconstruction won't turn out to be
a composite of things applying to different versions of different kernels,
not being valid for any of those, that is...

^ permalink raw reply	[flat|nested] 102+ messages in thread

* RE: [PATCH 04/18] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
  2020-07-22 17:39               ` Al Viro
@ 2020-07-23  8:29                 ` David Laight
  2020-07-23 13:54                 ` David Laight
  1 sibling, 0 replies; 102+ messages in thread
From: David Laight @ 2020-07-23  8:29 UTC (permalink / raw)
  To: 'Al Viro'; +Cc: Linus Torvalds, linux-kernel, linux-arch

From: Al Viro
> Sent: 22 July 2020 18:39
> On Wed, Jul 22, 2020 at 04:17:02PM +0000, David Laight wrote:
> > > David, do you *ever* bother to RTFS?  I mean, competent supercilious twits
> > > are annoying, but at least with those you can generally assume that what
> > > they say makes sense and has some relation to reality.  You, OTOH, keep
> > > spewing utter bollocks, without ever lowering yourself to checking if your
> > > guesses have anything to do with the reality.  With supercilious twit part
> > > proudly on the display - you do speak with confidence, and the way you
> > > dispense the oh-so-valuable advice to everyone around...
> >
> > Yes, I do look at the code.
> > I've actually spent a lot of time looking at the x86 checksum code.
> > I've posted a patch for a version that is about twice as fast as the
> > current one on a large range of x86 cpus.
> >
> > Possibly I meant the 32bit reduction inside csum_add()
> > rather than what csum_fold() does.
> 
> Really?
> static inline unsigned add32_with_carry(unsigned a, unsigned b)
> {
>         asm("addl %2,%0\n\t"
>             "adcl $0,%0"
>             : "=r" (a)
>             : "0" (a), "rm" (b));
>         return a;
> }

I agree it isn't much, but both those instructions almost certainly
get replicated with the initial value fed into the checksum function.

Everything except x86, sparc/64 and powerpc/64 uses the C code
from include/net/checksum.h which is the longer sequence:
	csum += addend;
	csum += csum < addend;
That's three instructions on something like MIPS - not too bad.
I'm not sure about ARM - ARM could probably use adc.
Some architectures may end up with an actual conditional jump.
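
A minimal sketch of that two-line form as a standalone function, with
a 64bit variant used purely as a cross-check.  csum_add_c(),
csum_add_wide() and check_csum_add() are hypothetical names:

```c
#include <assert.h>
#include <stdint.h>

/* Portable 32-bit add with end-around carry (the two-line C form). */
static uint32_t csum_add_c(uint32_t csum, uint32_t addend)
{
	csum += addend;
	return csum + (csum < addend);
}

/* Same result via a 64-bit intermediate; only a cross-check here. */
static uint32_t csum_add_wide(uint32_t csum, uint32_t addend)
{
	uint64_t t = (uint64_t)csum + addend;

	return (uint32_t)(t + (t >> 32));
}

static void check_csum_add(void)
{
	assert(csum_add_c(1, 2) == 3);
	assert(csum_add_c(0xffffffff, 1) == 1);
	assert(csum_add_c(0xffffffff, 0xffffffff) == 0xffffffff);
	assert(csum_add_wide(0xffffffff, 1) == 1);
}
```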

Quite how the instructions get scheduled probably makes more
difference.
The sequence is a register dependency chain, and the checksum
register could easily be limiting the execution speed.
On x86 the 'adc' loop runs at two clocks per adc on a wide
range of Intel cpus.

Actually there is a lot more to be gained in the code that reads
the iovec[] from userspace.
The calling sequences for the two nested functions used are horrid.
Fixing that does make a measurable difference to sendmsg().

	David


^ permalink raw reply	[flat|nested] 102+ messages in thread

* RE: [PATCH 04/18] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
  2020-07-22 17:39               ` Al Viro
  2020-07-23  8:29                 ` David Laight
@ 2020-07-23 13:54                 ` David Laight
  2020-07-23 14:30                   ` David Laight
  2020-07-23 14:53                   ` Al Viro
  1 sibling, 2 replies; 102+ messages in thread
From: David Laight @ 2020-07-23 13:54 UTC (permalink / raw)
  To: 'Al Viro'; +Cc: Linus Torvalds, linux-kernel, linux-arch


[-- Attachment #1: Type: text/plain, Size: 1012 bytes --]

From: Al Viro
> Sent: 22 July 2020 18:39
> I would love to see your patch, anyway, along with the testcases and performance
> comparison.

See attached program.
Compile and run (as root): csum_iov 1

Unpatched (as shipped) 16 vectors of 1 byte take ~430 clocks on my haswell cpu.
With dsl_patch defined they take ~393.

The maximum throughput is ~1.16 clocks/word for 16 vectors of 1k.
For longer vectors the data gets lost from the cache between the iterations.

On an older Ivy Bridge cpu it never goes faster than 2 clocks/word.
(Due to the implementation of ADC.)

The absolute limit is 1 clock/word - limited by the memory write.
I suspect that is achievable on Haswell with much less loop unrolling.

I had to replace the ror32() with __builtin_bswap32().
The kernel objects do contain the 'ror' instruction - even though I
didn't find the asm for it.

	David


[-- Attachment #2: csum_iov.c --]
[-- Type: text/plain, Size: 7804 bytes --]

/* Test program for checksum+copy
 *
 * Executes csum_and_copy_from_iter() in userspace.
 * Uses PERF_COUNT_HW_CPU_CYCLES to see how fast it runs.
 * Always copies 16 copies of the same buffer to the target.
 * Length of each fragment taken from argv[1].
 *
 * It needs linking with a copy of csum-copy_64.o (eg from a kernel build).
 *
 * For large buffers the 'adc' loop dominates.
 * On anything prior to Haswell this is 2 clocks per adc.
 * On Haswell adc is faster and it seems to approach 1.16 clocks/word.
 * It ought to be possible to get to 1 clock/word on Ivy bridge (Sandy?)
 * or later.
 */
// define for my version
// #define dsl_patch

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <errno.h>
#include <unistd.h>

#include <linux/perf_event.h>
#include <sys/mman.h>
#include <sys/syscall.h>

#define likely(x) (x)
#define unlikely(x) (x)

typedef uint32_t __wsum;

struct kvec {
	size_t iov_len;
	void   *iov_base;
};

struct iov_iter {
	unsigned int count;
	unsigned int nr_segs;
	const struct kvec *kvec;
	size_t       iov_offset;
};

#define min(a,b) ((a) < (b) ? (a) : (b))

static unsigned short fold(unsigned int csum)
{
	csum = (csum & 0xffff) + (csum >> 16);
	return csum + (csum >> 16);
}

extern __wsum csum_partial_copy_generic(const void *, void *, size_t, __wsum, void *, void *);


__wsum
csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
{
        return csum_partial_copy_generic(src, dst, len, sum, NULL, NULL);
}

static inline unsigned add32_with_carry(unsigned a, unsigned b)
{
	asm("addl %2,%0\n\t"
	    "adcl $0,%0"
	    : "=r" (a)
	    : "0" (a), "rm" (b));
	return a;
}


static inline __wsum csum_add(__wsum csum, __wsum addend)
{
	return add32_with_carry(csum, addend);
}

static inline __wsum
csum_block_add(__wsum csum, __wsum sum, int offset)
{
        /* rotate sum to align it with a 16b boundary */
        if (offset & 1)
                sum = __builtin_bswap32(sum);

        return csum_add(csum, sum);
}
//////////////////////////////////////////////////////////////////////

/* Necessary bits from iov_iter.c */

#define iterate_kvec(i, n, __v, __p, skip, STEP) {	\
	size_t wanted = n;				\
	__p = i->kvec;					\
	__v.iov_len = min(n, __p->iov_len - skip);	\
	if (likely(__v.iov_len)) {			\
		__v.iov_base = __p->iov_base + skip;	\
		(void)(STEP);				\
		skip += __v.iov_len;			\
		n -= __v.iov_len;			\
	}						\
	while (unlikely(n)) {				\
		__p++;					\
		__v.iov_len = min(n, __p->iov_len);	\
		if (unlikely(!__v.iov_len))		\
			continue;			\
		__v.iov_base = __p->iov_base;		\
		(void)(STEP);				\
		skip = __v.iov_len;			\
		n -= __v.iov_len;			\
	}						\
	n = wanted;					\
}



#define iterate_and_advance(i, n, v, I, B, K) {			\
	if (unlikely(i->count < n))				\
		n = i->count;					\
	if (i->count) {						\
		size_t skip = i->iov_offset;			\
			const struct kvec *kvec;		\
			struct kvec v;				\
			iterate_kvec(i, n, v, kvec, skip, (K))	\
			if (skip == kvec->iov_len) {		\
				kvec++;				\
				skip = 0;			\
			}					\
			i->nr_segs -= kvec - i->kvec;		\
			i->kvec = kvec;				\
		i->count -= n;					\
		i->iov_offset = skip;				\
	}							\
}


static __wsum csum_and_memcpy(void *to, const void *from, size_t len,
			      __wsum sum, size_t off)
{
#ifdef dsl_patch
	return csum_partial_copy_nocheck(from, to, len, sum);
#else
	__wsum next = csum_partial_copy_nocheck(from, to, len, 0);
	return csum_block_add(sum, next, off);
#endif
}



size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
			       struct iov_iter *i)
{
	char *to = addr;
	__wsum sum, next;
	size_t off = 0;
	sum = *csum;
	iterate_and_advance(i, bytes, v, , ,({
		sum = csum_and_memcpy((to += v.iov_len) - v.iov_len,
				      v.iov_base, v.iov_len,
				      sum, off);
		off += v.iov_len;
#ifdef dsl_patch
		if (v.iov_len & 1)
			sum = __builtin_bswap32(sum);
#endif
			
	})
	)
#ifdef dsl_patch
	if (off & 1)
		sum = __builtin_bswap32(sum);
#endif
	*csum = sum;
	return bytes;
}

//////////////////////////////////////////////////////////////////////

void ex_handler_uaccess(void) { }
void ex_handler_default(void) { }

static char data[65536] = {

0x46,0x56,0x20,0x04,0x00,0x02,0x00,0x00,0x72,0x4d,0xc6,0x3d,0x31,0x85,0x2d,0xbd,
0xe2,0xe0,0x9d,0x3e,0x3b,0x7a,0x70,0x3d,0xd2,0xfb,0x8c,0xbf,0x95,0x10,0xa9,0xbe,
0xeb,0xfd,0x29,0x40,0xd5,0x7a,0x61,0x40,0xde,0xcd,0x14,0xbf,0x81,0x1b,0xf6,0x3f,
0xbc,0xff,0x17,0x3f,0x67,0x1c,0x6e,0xbe,0xf4,0xc2,0x05,0x40,0x0b,0x13,0x78,0x3f,
0xfe,0x47,0xa7,0xbd,0x59,0xc2,0x15,0x3f,0x07,0xd0,0xea,0xbf,0x97,0xf1,0x3c,0x3f,
0xcc,0xfa,0x6b,0x40,0x72,0x6a,0x4f,0xbe,0x0b,0xe3,0x75,0x3e,0x3c,0x9b,0x0e,0xbf,
0xa9,0xeb,0xb7,0x3f,0xeb,0x4a,0xec,0x3e,0x33,0x8c,0x0c,0x3f,0x6a,0xf2,0xf3,0x3e,
0x2b,0x45,0x86,0x3f,0x83,0xce,0x8a,0x3f,0xf6,0x01,0x16,0x40,0x9c,0x17,0x47,0x3e,
0x44,0x83,0x61,0x40,0x74,0xc7,0x5c,0x3f,0xec,0xe7,0x95,0x3f,0xee,0x19,0xb5,0xbf,
0xb5,0xf0,0x03,0xbf,0xd1,0x02,0x1c,0x3e,0xa3,0x55,0x90,0xbe,0x1e,0x0b,0xa1,0xbf,
0xa4,0xa8,0xb4,0x3f,0xc6,0x68,0x91,0x3f,0xd1,0xc5,0xab,0x3f,0xb9,0x14,0x62,0x3f,
0x7c,0xe0,0xb9,0xbf,0xc0,0xa4,0xb5,0x3d,0x6f,0xd9,0xa7,0x3f,0x8f,0xc4,0xb0,0x3d,
0x48,0x2c,0x7a,0x3e,0x83,0xb2,0x3c,0x40,0x36,0xd3,0x18,0x40,0xb7,0xa9,0x57,0x40,
0xda,0xd3,0x95,0x3f,0x74,0x95,0xc0,0xbe,0xbb,0xce,0x71,0x3e,0x95,0xec,0x18,0xbf,
0x94,0x17,0xdd,0x3f,0x98,0xa5,0x02,0x3f,0xbb,0xfb,0xbb,0x3e,0xd0,0x5a,0x9c,0x3f,
0xd4,0x00,0x9b,0xbf,0x3b,0x9f,0x20,0xc0,0x84,0x5b,0x0f,0x40,0x5e,0x48,0x2c,0xbf,

};

#if 0
struct kvec {
	size_t iov_len;
	void   *iov_base;
};

struct iov_iter {
	unsigned int count;
	unsigned int nr_segs;
	const struct kvec *kvec;
	size_t       iov_offset;
};
#endif

static inline unsigned int rdpmc(unsigned int counter)
{
	unsigned int low, high;

	asm volatile("rdpmc" : "=a" (low), "=d" (high) : "c" (counter));

	// return low bits, counter might be 32 or 40 bits wide.
	return low;
}

unsigned int read_cpu_cycles(void)
{
	static struct perf_event_attr perf_attr = {
		.type = PERF_TYPE_HARDWARE,
		.config = PERF_COUNT_HW_CPU_CYCLES,
		// .config = PERF_COUNT_HW_INSTRUCTIONS,
		.pinned = 1,
	};
	static struct perf_event_mmap_page *pc;
	unsigned int seq, idx, count;

	if (!pc) {
		int perf_fd;
		perf_fd = syscall(__NR_perf_event_open, &perf_attr, 0, -1, -1, 0);
		if (perf_fd < 0) {
			fprintf(stderr, "perf_event_open failed: errno %d\n", errno);
			exit(1);
		}
		pc = mmap(NULL, 4096, PROT_READ, MAP_SHARED, perf_fd, 0);
		if (pc == MAP_FAILED) {
			fprintf(stderr, "perf_event mmap() failed: errno %d\n", errno);
			exit(1);
		}
	}

	do {
		seq = pc->lock;
		asm volatile("":::"memory");
		idx = pc->index;
		if (!idx) //  || !pc->cap_user_rdpmc)
			return 0;
		count = pc->offset + rdpmc(idx - 1);
		asm volatile("":::"memory");
	} while (pc->lock != seq);

	return count;
}


static int target[16 * sizeof data / 4];

#define PASSES 16
int main(int argc, char **argv)
{
	struct kvec kvec[16];
	struct iov_iter i;
	int len;
	unsigned int clocks[PASSES];
	__wsum csum[PASSES] = {};
	unsigned int pass;
	unsigned int frag_len;

	read_cpu_cycles();
	clocks[0] = read_cpu_cycles();

	frag_len = argv[1] ? atoi(argv[1]) : 0;
	if (!frag_len || frag_len > sizeof data)
		frag_len = sizeof data;

	for (pass = 1; pass < PASSES; pass++) {
		/* Sum the same data 16 times */
		i.count = frag_len * 16;
		i.nr_segs = 16;
		i.kvec = kvec;
		i.iov_offset = 0;

		for (len = 0; len < 16; len++) {
			kvec[len].iov_len = frag_len;
			kvec[len].iov_base = data;
		}
		csum_and_copy_from_iter(target, i.count, csum + pass, &i);
		clocks[pass] = read_cpu_cycles();
	}
	for (pass = 1; pass < PASSES; pass++) {
		unsigned int delta = clocks[pass] - clocks[pass - 1];
		printf("pass %u: length %u, csum %x, clocks %u, clocks/word %5f\n",
			pass, frag_len * 16, fold(csum[pass]), delta, delta / (frag_len * 16/8 + 0.0));
	}

	return 0;
}

^ permalink raw reply	[flat|nested] 102+ messages in thread

* RE: [PATCH 04/18] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
  2020-07-23 13:54                 ` David Laight
@ 2020-07-23 14:30                   ` David Laight
  2020-07-23 14:53                   ` Al Viro
  1 sibling, 0 replies; 102+ messages in thread
From: David Laight @ 2020-07-23 14:30 UTC (permalink / raw)
  To: 'Al Viro'
  Cc: 'Linus Torvalds', 'linux-kernel@vger.kernel.org',
	'linux-arch@vger.kernel.org'

> I had to replace the ror32() with __builtin_bswap32().
> The kernel objects do contain the 'ror' instruction - even though I
> didn't find the asm for it.

Looking at some instruction timings ror32() and bswap32()
seem to need one of the same execution ports.
However on Intel cpus bswap64() takes 2 clocks but the ror64()
instructions only take 1.

AMD cpus are more symmetric and run all variants in 1 clock.

So ror32() is probably preferable in case anyone copies the
code into somewhere with a 64bit checksum value.
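
For the 32bit case the two are interchangeable as far as the folded
checksum goes, since both multiply the value by 256 modulo 0xffff.
A minimal sketch of that check; bswap32() here is a portable stand-in
for __builtin_bswap32(), and fold16() and check_rotate_vs_swap() are
hypothetical helpers:

```c
#include <assert.h>
#include <stdint.h>

/* Same expression as ror32() in include/linux/bitops.h. */
static uint32_t ror32(uint32_t word, unsigned int shift)
{
	return (word >> (shift & 31)) | (word << ((-shift) & 31));
}

/* Portable 32-bit byte swap. */
static uint32_t bswap32(uint32_t x)
{
	return (x >> 24) | ((x >> 8) & 0xff00) |
	       ((x << 8) & 0xff0000) | (x << 24);
}

/* Fold a 32-bit sum down to 16 bits. */
static uint16_t fold16(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)(sum + (sum >> 16));
}

/* Rotate-by-8 and byte swap fold to the same 16-bit value. */
static void check_rotate_vs_swap(void)
{
	uint32_t x = 1;
	int i;

	for (i = 0; i < 1000; i++) {
		assert(fold16(ror32(x, 8)) == fold16(bswap32(x)));
		x = x * 1664525u + 1013904223u;	/* arbitrary test values */
	}
}
```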

	David


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH 04/18] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
  2020-07-23 13:54                 ` David Laight
  2020-07-23 14:30                   ` David Laight
@ 2020-07-23 14:53                   ` Al Viro
  2020-07-23 15:19                     ` David Laight
  2020-07-23 15:21                     ` Al Viro
  1 sibling, 2 replies; 102+ messages in thread
From: Al Viro @ 2020-07-23 14:53 UTC (permalink / raw)
  To: David Laight; +Cc: Linus Torvalds, linux-kernel, linux-arch

On Thu, Jul 23, 2020 at 01:54:47PM +0000, David Laight wrote:
> From: Al Viro
> > Sent: 22 July 2020 18:39
> > I would love to see your patch, anyway, along with the testcases and performance
> > comparison.
> 
> See attached program.
> Compile and run (as root): csum_iov 1
> 
> Unpatched (as shipped) 16 vectors of 1 byte take ~430 clocks on my haswell cpu.
> With dsl_patch defined they take ~393.
> 
> The maximum throughput is ~1.16 clocks/word for 16 vectors of 1k.
> For longer vectors the data gets lost from the cache between the iterations.
> 
> On an older Ivy Bridge cpu it never goes faster than 2 clocks/word.
> (Due to the implementation of ADC.)
> 
> The absolute limit is 1 clock/word - limited by the memory write.
> I suspect that is achievable on Haswell with much less loop unrolling.
> 
> I had to replace the ror32() with __builtin_bswap32().
> The kernel objects do contain the 'ror' instruction - even though I
> didn't find the asm for it.

First of all,
;  git grep -n -w ror32|grep '\.h:'
include/linux/bitops.h:109: * ror32 - rotate a 32-bit value right
include/linux/bitops.h:113:static inline __u32 ror32(__u32 word, unsigned int shift)
include/net/checksum.h:81:              sum = ror32(sum, 8);
; grep -A3 ror32 include/linux/bitops.h 
 * ror32 - rotate a 32-bit value right
 * @word: value to rotate
 * @shift: bits to roll
 */
static inline __u32 ror32(__u32 word, unsigned int shift)
{
        return (word >> (shift & 31)) | (word << ((-shift) & 31));
}
; cat >/tmp/a.c <<'EOF'
unsigned f(unsigned n)
{
        return (n >> 8) | (n << 24);
}
EOF
; gcc -c -O2 /tmp/a.c -o /tmp/a.o
; objdump -d /tmp/a.o
/tmp/a.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <f>:
   0:   89 f8                   mov    %edi,%eax
   2:   c1 c8 08                ror    $0x8,%eax
   5:   c3                      retq   
;
which ought to cover _that_ question.  Takes a couple of minutes, but that's
a trivial side issue.

Said that, what you've printed for 1-byte segments (and that's going to be
seriously affected by the setup costs in csum-copy.S, sensitive to calling
convention changes) is time to run the 16-iteration loop divided by 1 * 16 / 8;
IOW, your difference for 16 iterations here is 37*2 = 74 cycles.  With
per-iteration diff being a bit under 5 cycles.  Which is not implausible,
but
	1) extrapolating to other compiler versions, flags, etc. is not obvious
	2) the effects of calling convention changes need to be taken into account
	3) for copying to/from userland the effects of calling convention changes
are even larger, and kernel is certainly not going to issue kvec iters of _that_
sort, TYVM.
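
The arithmetic above, spelled out as a small sketch.  It takes the
~430 and ~393 figures as the printed clocks/word values, which is the
reading used above; words_per_pass(), cycles_diff(),
cycles_diff_per_vector() and check_arith() are hypothetical helpers:

```c
#include <assert.h>

/* Divisor used by the test program's printf: frag_len * 16 / 8. */
static unsigned int words_per_pass(unsigned int frag_len)
{
	return frag_len * 16 / 8;
}

/* Cycle difference per pass from two clocks/word figures. */
static double cycles_diff(double cpw_old, double cpw_new,
			  unsigned int frag_len)
{
	return (cpw_old - cpw_new) * words_per_pass(frag_len);
}

/* The same difference spread over the 16 vectors of one pass. */
static double cycles_diff_per_vector(double cpw_old, double cpw_new,
				     unsigned int frag_len)
{
	return cycles_diff(cpw_old, cpw_new, frag_len) / 16;
}

static void check_arith(void)
{
	assert(words_per_pass(1) == 2);
	assert(cycles_diff(430.0, 393.0, 1) == 74.0);
	assert(cycles_diff_per_vector(430.0, 393.0, 1) == 4.625);
}
```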

^ permalink raw reply	[flat|nested] 102+ messages in thread

* RE: [PATCH 04/18] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
  2020-07-23 14:53                   ` Al Viro
@ 2020-07-23 15:19                     ` David Laight
  2020-07-23 15:21                     ` Al Viro
  1 sibling, 0 replies; 102+ messages in thread
From: David Laight @ 2020-07-23 15:19 UTC (permalink / raw)
  To: 'Al Viro'; +Cc: Linus Torvalds, linux-kernel, linux-arch

From: Al Viro
> Sent: 23 July 2020 15:54
> On Thu, Jul 23, 2020 at 01:54:47PM +0000, David Laight wrote:
> > From: Al Viro
> > > Sent: 22 July 2020 18:39
> > > I would love to see your patch, anyway, along with the testcases and performance
> > > comparison.
> >
> > See attached program.
> > Compile and run (as root): csum_iov 1
> >
> > Unpatched (as shipped) 16 vectors of 1 byte take ~430 clocks on my haswell cpu.
> > With dsl_patch defined they take ~393.
> >
> > The maximum throughput is ~1.16 clocks/word for 16 vectors of 1k.
> > For longer vectors the data gets lost from the cache between the iterations.
> >
> > On an older Ivy Bridge cpu it never goes faster than 2 clocks/word.
> > (Due to the implementation of ADC.)
> >
> > The absolute limit is 1 clock/word - limited by the memory write.
> > I suspect that is achievable on Haswell with much less loop unrolling.
> >
> > I had to replace the ror32() with __builtin_bswap32().
> > The kernel objects do contain the 'ror' instruction - even though I
> > didn't find the asm for it.
> 
> First of all,
...
> static inline __u32 ror32(__u32 word, unsigned int shift)
> {
>         return (word >> (shift & 31)) | (word << ((-shift) & 31));
> }
> ; cat >/tmp/a.c <<'EOF'
...
> which ought to cover _that_ question.  Takes a couple of minutes, but that's
> a trivial side issue.

I did find that function. Typing __builtin_bswap32() only took seconds.

> Said that, what you've printed for 1-byte segments (and that's going to be
> seriously affected by the setup costs in csum-copy.S, sensitive to calling
> convention changes) is time to run the 16-iteration loop divided by 1 * 16 / 8;
> IOW, your difference for 16 iterations here is 37*2 = 74 cycles.  With
> per-iteration diff being a bit under 5 cycles.  Which is not implausible,
> but
> 	1) extrapolating to other compiler versions, flags, etc. is not obvious
> 	2) the effects of calling convention changes need to be taken into account
> 	3) for copying to/from userland the effects of calling convention changes
> are even larger, and kernel is certainly not going to issue kvec iters of _that_
> sort, TYVM.

Agreed, I used 1 byte fragments to make changes to that particular
code fragment stand out.
Running the program with different sizes shows just how much the
code around the inner loop costs.
It isn't as though buffers are a nice multiple of 64 bytes.

For x86_64 the user/kernel calling conventions are much the same.
Most modern ones pass a few arguments in registers so passing
the old 'csum' in is probably ok.
It may even save a register spill to stack.
The extra two arguments for saving the fail address are horrid.
As is passing the csum by address.

For efficiency you do want:
	csum = csum_copy(dest, src, length, csum);

And it does make sense to use 0 for 'error'.

	David


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: [PATCH 04/18] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
  2020-07-23 14:53                   ` Al Viro
  2020-07-23 15:19                     ` David Laight
@ 2020-07-23 15:21                     ` Al Viro
  2020-07-23 15:36                       ` David Laight
  1 sibling, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-23 15:21 UTC (permalink / raw)
  To: David Laight; +Cc: Linus Torvalds, linux-kernel, linux-arch

On Thu, Jul 23, 2020 at 03:53:42PM +0100, Al Viro wrote:

> Said that, what you've printed for 1-byte segments (and that's going to be
> seriously affected by the setup costs in csum-copy.S, sensitive to calling
> convention changes) is time to run the 16-iteration loop divided by 1 * 16 / 8;
> IOW, your difference for 16 iterations here is 37*2 = 74 cycles.  With
> per-iteration diff being a bit under 5 cycles.  Which is not implausible,
> but
> 	1) extrapolating to other compiler versions, flags, etc. is not obvious
> 	2) the effects of calling convention changes need to be taken into account
> 	3) for copying to/from userland the effects of calling convention changes
> are even larger, and kernel is certainly not going to issue kvec iters of _that_
> sort, TYVM.

To clarify it a bit: the effects of calling conventions change are mostly due
to not passing (and saving) those error pointers, and that could be had with
"pass the initial sum in" - just start these iov_iter.c loops with sum = ~0U
and we get the same warranties re not getting 0 in absence of faults.

The point is, your "~4.5 cycles per vector" is pretty much noise and the
difference between the 3-argument and 4-argument variants could easily be
in the same range.  It might be a valid microoptimization, it might be not.
3-argument variant is simpler and IMO in absence of strong data we ought
to go with that.
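
A minimal user-space sketch of the "start with sum = ~0U" point, not the kernel code.  The names add32_with_carry(), seeded_csum_and_copy() and fold16() are hypothetical, the 'fault' flag stands in for a real userland fault, and the byte pairing is little-endian purely for illustration.  It relies on two facts: end-around-carry addition starting from a non-zero value never yields 0, and 0xffffffff = 65535 * 65537, so the seed does not change the result mod 65535.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit ones' complement addition: add, then fold the carry back in. */
static inline uint32_t add32_with_carry(uint32_t a, uint32_t b)
{
	uint32_t s = a + b;

	return s + (s < a);	/* s < a exactly when the addition wrapped */
}

/*
 * Hypothetical 3-argument-style helper: copy len bytes and return their
 * sum seeded with ~0U.  A return value of 0 is reserved for "fault",
 * which the seed makes unreachable on the success path.
 */
static inline uint32_t seeded_csum_and_copy(const void *src, void *dst,
					    size_t len, int fault)
{
	const unsigned char *p = src;
	uint32_t sum = ~0U;
	size_t i;

	if (fault)		/* simulated fault: nothing copied */
		return 0;

	memcpy(dst, src, len);
	for (i = 0; i + 1 < len; i += 2)
		sum = add32_with_carry(sum, (uint32_t)p[i] |
					    ((uint32_t)p[i + 1] << 8));
	if (i < len)		/* trailing odd byte */
		sum = add32_with_carry(sum, p[i]);
	return sum;
}

/* Fold a 32-bit sum to 16 bits (no final complement in this sketch). */
static inline uint16_t fold16(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}
```

With this shape the caller needs no error pointer at all: a zero return can only mean that the copy failed.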

^ permalink raw reply	[flat|nested] 102+ messages in thread

* RE: [PATCH 04/18] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
  2020-07-23 15:21                     ` Al Viro
@ 2020-07-23 15:36                       ` David Laight
  0 siblings, 0 replies; 102+ messages in thread
From: David Laight @ 2020-07-23 15:36 UTC (permalink / raw)
  To: 'Al Viro'; +Cc: Linus Torvalds, linux-kernel, linux-arch

From: Al Viro
> Sent: 23 July 2020 16:21
...
> The point is, your "~4.5 cycles per vector" is pretty much noise and the
> difference between the 3-argument and 4-argument variants could easily be
> in the same range.  It might be a valid microoptimization, it might be not.
> 3-argument variant is simpler and IMO in absence of strong data we ought
> to go with that.

There is definitely more to be gained by rewriting the x86-64 asm.

	David

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [RFC][CFT][PATCHSET v2] saner calling conventions for csum-and-copy primitives
  2020-07-21 20:24 [RFC][CFT][PATCHSET] saner calling conventions for csum-and-copy primitives Al Viro
  2020-07-21 20:25 ` [PATCH 01/18] skb_copy_and_csum_bits(): don't bother with the last argument Al Viro
@ 2020-07-24  1:25 ` Al Viro
  2020-07-24  1:25   ` [PATCH v2 01/20] xtensa: fix access check in csum_and_copy_from_user Al Viro
  1 sibling, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-24  1:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-arch, linux-kernel

Updated version pushed.  Changes since the first variant:

	* xtensa fix from Max Filippov (regression from last cycle, will
send to Linus tonight)
	* csum_partial_copy_nocheck() got its default variants consolidated
	* sparc64 idiotic typo fixed (along with the broken build script
I'd been using)
	* commit messages updated (mostly in "saner calling conventions for
csum_and_copy_..._user()")

The branch is still based at 5.8-rc1 and can be found in
git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git work.csum_and_copy
Individual patches will go in followups.
 
Shortlog:
Al Viro (19):
      skb_copy_and_csum_bits(): don't bother with the last argument
      icmp_push_reply(): reorder adding the checksum up
      unify generic instances of csum_partial_copy_nocheck()
      csum_partial_copy_nocheck(): drop the last argument
      csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
      saner calling conventions for csum_and_copy_..._user()
      alpha: propagate the calling convention changes down to csum_partial_copy.c helpers
      arm: propagate the calling convention changes down to csum_partial_copy_from_user()
      m68k: get rid of zeroing destination on error in csum_and_copy_from_user()
      sh: propage the calling conventions change down to csum_partial_copy_generic()
      i386: propagate the calling conventions change down to csum_partial_copy_generic()
      sparc32: propagate the calling conventions change down to __csum_partial_copy_sparc_generic()
      mips: csum_and_copy_{to,from}_user() are never called under KERNEL_DS
      mips: __csum_partial_copy_kernel() has no users left
      mips: propagate the calling convention change down into __csum_partial_copy_..._user()
      xtensa: propagate the calling conventions change down into csum_partial_copy_generic()
      sparc64: propagate the calling convention changes down to __csum_partial_copy_...()
      amd64: switch csum_partial_copy_generic() to new calling conventions
      ppc: propagate the calling conventions change down to csum_partial_copy_generic()

Max Filippov (1):
      xtensa: fix access check in csum_and_copy_from_user

Diffstat:
 arch/alpha/include/asm/checksum.h         |   5 +-
 arch/alpha/lib/csum_partial_copy.c        | 164 ++++++++-----------
 arch/arm/include/asm/checksum.h           |  17 +-
 arch/arm/lib/csumpartialcopy.S            |   4 +-
 arch/arm/lib/csumpartialcopygeneric.S     |   1 +
 arch/arm/lib/csumpartialcopyuser.S        |  26 +--
 arch/c6x/include/asm/checksum.h           |   6 +
 arch/hexagon/include/asm/checksum.h       |  11 --
 arch/hexagon/lib/checksum.c               |  11 --
 arch/ia64/include/asm/checksum.h          |   3 -
 arch/ia64/lib/csum_partial_copy.c         |  15 --
 arch/m68k/include/asm/checksum.h          |   7 +-
 arch/m68k/lib/checksum.c                  |  88 +++-------
 arch/mips/include/asm/checksum.h          |  68 ++------
 arch/mips/lib/csum_partial.S              | 261 ++++++++++--------------------
 arch/nios2/include/asm/checksum.h         |   5 -
 arch/parisc/include/asm/checksum.h        |  28 ----
 arch/parisc/lib/checksum.c                |  17 --
 arch/powerpc/include/asm/checksum.h       |  13 +-
 arch/powerpc/lib/checksum_32.S            |  74 ++++-----
 arch/powerpc/lib/checksum_64.S            |  37 ++---
 arch/powerpc/lib/checksum_wrappers.c      |  74 ++-------
 arch/s390/include/asm/checksum.h          |   7 -
 arch/sh/include/asm/checksum_32.h         |  36 ++---
 arch/sh/lib/checksum.S                    | 119 ++++----------
 arch/sparc/include/asm/checksum.h         |   2 +
 arch/sparc/include/asm/checksum_32.h      |  70 ++------
 arch/sparc/include/asm/checksum_64.h      |  39 +----
 arch/sparc/lib/checksum_32.S              | 202 +++++------------------
 arch/sparc/lib/csum_copy.S                |   3 +-
 arch/sparc/lib/csum_copy_from_user.S      |   4 +-
 arch/sparc/lib/csum_copy_to_user.S        |   4 +-
 arch/sparc/mm/fault_32.c                  |   6 +-
 arch/x86/include/asm/checksum.h           |   1 +
 arch/x86/include/asm/checksum_32.h        |  40 ++---
 arch/x86/include/asm/checksum_64.h        |  14 +-
 arch/x86/lib/checksum_32.S                | 117 +++++---------
 arch/x86/lib/csum-copy_64.S               | 140 +++++++++-------
 arch/x86/lib/csum-wrappers_64.c           |  86 ++--------
 arch/x86/um/asm/checksum.h                |  16 --
 arch/x86/um/asm/checksum_32.h             |  23 ---
 arch/xtensa/include/asm/checksum.h        |  34 ++--
 arch/xtensa/lib/checksum.S                |  67 ++------
 drivers/net/ethernet/3com/typhoon.c       |   3 +-
 drivers/net/ethernet/sun/sunvnet_common.c |   2 +-
 include/asm-generic/checksum.h            |  14 --
 include/linux/skbuff.h                    |   2 +-
 include/net/checksum.h                    |  22 ++-
 lib/checksum.c                            |  11 --
 lib/iov_iter.c                            |  21 ++-
 net/core/skbuff.c                         |  13 +-
 net/ipv4/icmp.c                           |  10 +-
 net/ipv4/ip_output.c                      |   6 +-
 net/ipv4/raw.c                            |   2 +-
 net/ipv6/icmp.c                           |   4 +-
 net/ipv6/ip6_output.c                     |   2 +-
 net/ipv6/raw.c                            |   2 +-
 net/sunrpc/socklib.c                      |   2 +-
 58 files changed, 615 insertions(+), 1466 deletions(-)

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v2 01/20] xtensa: fix access check in csum_and_copy_from_user
  2020-07-24  1:25 ` [RFC][CFT][PATCHSET v2] saner calling conventions for csum-and-copy primitives Al Viro
@ 2020-07-24  1:25   ` Al Viro
  2020-07-24  1:25     ` [PATCH v2 02/20] skb_copy_and_csum_bits(): don't bother with the last argument Al Viro
                       ` (18 more replies)
  0 siblings, 19 replies; 102+ messages in thread
From: Al Viro @ 2020-07-24  1:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Max Filippov <jcmvbkbc@gmail.com>

Commit d341659f470b ("xtensa: switch to providing
csum_and_copy_from_user()") introduced access check, but incorrectly
tested dst instead of src.
Fix access_ok argument in csum_and_copy_from_user.
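
A rough user-space model of the difference, under loud assumptions: sketch_access_ok() and SKETCH_USER_LIMIT are hypothetical stand-ins for access_ok() and the architecture's user address limit, and addresses are plain integers.  It only illustrates that the range check must be applied to the userland pointer (src for a copy from user), not to the kernel destination.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical boundary: addresses below it count as "user" here. */
#define SKETCH_USER_LIMIT UINT64_C(0x80000000)

/* Stand-in for access_ok(): is [addr, addr + len) entirely below the limit? */
static inline bool sketch_access_ok(uint64_t addr, uint64_t len)
{
	return len <= SKETCH_USER_LIMIT && addr <= SKETCH_USER_LIMIT - len;
}

/* Check as it was before the fix: validates the kernel-side destination. */
static inline bool from_user_check_buggy(uint64_t src, uint64_t dst,
					 uint64_t len)
{
	(void)src;
	return sketch_access_ok(dst, len);
}

/* Check after the fix: validates the user-side source. */
static inline bool from_user_check_fixed(uint64_t src, uint64_t dst,
					 uint64_t len)
{
	(void)dst;
	return sketch_access_ok(src, len);
}
```

In this model a legitimate call (user src, kernel dst) is rejected by the old check and accepted by the new one, while a bogus kernel address passed as src is caught only by the new one.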

Cc: Al Viro <viro@zeniv.linux.org.uk>
Fixes: d341659f470b ("xtensa: switch to providing csum_and_copy_from_user()")
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/xtensa/include/asm/checksum.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/xtensa/include/asm/checksum.h b/arch/xtensa/include/asm/checksum.h
index d8292cc9ebdf..243a5fe79d3c 100644
--- a/arch/xtensa/include/asm/checksum.h
+++ b/arch/xtensa/include/asm/checksum.h
@@ -57,7 +57,7 @@ static inline
 __wsum csum_and_copy_from_user(const void __user *src, void *dst,
 				   int len, __wsum sum, int *err_ptr)
 {
-	if (access_ok(dst, len))
+	if (access_ok(src, len))
 		return csum_partial_copy_generic((__force const void *)src, dst,
 					len, sum, err_ptr, NULL);
 	if (len)
-- 
2.11.0

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v2 02/20] skb_copy_and_csum_bits(): don't bother with the last argument
  2020-07-24  1:25   ` [PATCH v2 01/20] xtensa: fix access check in csum_and_copy_from_user Al Viro
@ 2020-07-24  1:25     ` Al Viro
  2020-07-24  1:25     ` [PATCH v2 03/20] icmp_push_reply(): reorder adding the checksum up Al Viro
                       ` (17 subsequent siblings)
  18 siblings, 0 replies; 102+ messages in thread
From: Al Viro @ 2020-07-24  1:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

it's always 0

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 drivers/net/ethernet/sun/sunvnet_common.c |  2 +-
 include/linux/skbuff.h                    |  2 +-
 net/core/skbuff.c                         | 11 ++++++-----
 net/ipv4/icmp.c                           |  2 +-
 net/ipv4/ip_output.c                      |  4 ++--
 net/ipv6/icmp.c                           |  4 ++--
 net/ipv6/ip6_output.c                     |  2 +-
 net/sunrpc/socklib.c                      |  2 +-
 8 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/sun/sunvnet_common.c b/drivers/net/ethernet/sun/sunvnet_common.c
index 8dc6c9ff22e1..80fde5f06fce 100644
--- a/drivers/net/ethernet/sun/sunvnet_common.c
+++ b/drivers/net/ethernet/sun/sunvnet_common.c
@@ -1168,7 +1168,7 @@ static inline struct sk_buff *vnet_skb_shape(struct sk_buff *skb, int ncookies)
 			*(__sum16 *)(skb->data + offset) = 0;
 			csum = skb_copy_and_csum_bits(skb, start,
 						      nskb->data + start,
-						      skb->len - start, 0);
+						      skb->len - start);
 
 			/* add in the header checksums */
 			if (skb->protocol == htons(ETH_P_IP)) {
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 0c0377fc00c2..1dcd255c9a03 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3529,7 +3529,7 @@ int skb_kill_datagram(struct sock *sk, struct sk_buff *skb, unsigned int flags);
 int skb_copy_bits(const struct sk_buff *skb, int offset, void *to, int len);
 int skb_store_bits(struct sk_buff *skb, int offset, const void *from, int len);
 __wsum skb_copy_and_csum_bits(const struct sk_buff *skb, int offset, u8 *to,
-			      int len, __wsum csum);
+			      int len);
 int skb_splice_bits(struct sk_buff *skb, struct sock *sk, unsigned int offset,
 		    struct pipe_inode_info *pipe, unsigned int len,
 		    unsigned int flags);
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index b8afefe6f6b6..9c0918651445 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2723,19 +2723,20 @@ EXPORT_SYMBOL(skb_checksum);
 /* Both of above in one bottle. */
 
 __wsum skb_copy_and_csum_bits(const struct sk_buff *skb, int offset,
-				    u8 *to, int len, __wsum csum)
+				    u8 *to, int len)
 {
 	int start = skb_headlen(skb);
 	int i, copy = start - offset;
 	struct sk_buff *frag_iter;
 	int pos = 0;
+	__wsum csum = 0;
 
 	/* Copy header. */
 	if (copy > 0) {
 		if (copy > len)
 			copy = len;
 		csum = csum_partial_copy_nocheck(skb->data + offset, to,
-						 copy, csum);
+						 copy, 0);
 		if ((len -= copy) == 0)
 			return csum;
 		offset += copy;
@@ -2791,7 +2792,7 @@ __wsum skb_copy_and_csum_bits(const struct sk_buff *skb, int offset,
 				copy = len;
 			csum2 = skb_copy_and_csum_bits(frag_iter,
 						       offset - start,
-						       to, copy, 0);
+						       to, copy);
 			csum = csum_block_add(csum, csum2, pos);
 			if ((len -= copy) == 0)
 				return csum;
@@ -3011,7 +3012,7 @@ void skb_copy_and_csum_dev(const struct sk_buff *skb, u8 *to)
 	csum = 0;
 	if (csstart != skb->len)
 		csum = skb_copy_and_csum_bits(skb, csstart, to + csstart,
-					      skb->len - csstart, 0);
+					      skb->len - csstart);
 
 	if (skb->ip_summed == CHECKSUM_PARTIAL) {
 		long csstuff = csstart + skb->csum_offset;
@@ -3933,7 +3934,7 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
 					skb_copy_and_csum_bits(head_skb, offset,
 							       skb_put(nskb,
 								       len),
-							       len, 0);
+							       len);
 				SKB_GSO_CB(nskb)->csum_start =
 					skb_headroom(nskb) + doffset;
 			} else {
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index 956a806649f7..62d7a2bfc9a3 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -352,7 +352,7 @@ static int icmp_glue_bits(void *from, char *to, int offset, int len, int odd,
 
 	csum = skb_copy_and_csum_bits(icmp_param->skb,
 				      icmp_param->offset + offset,
-				      to, len, 0);
+				      to, len);
 
 	skb->csum = csum_block_add(skb->csum, csum, odd);
 	if (icmp_pointers[icmp_param->data.icmph.type].error)
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 090d3097ee15..7fd164754519 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1120,7 +1120,7 @@ static int __ip_append_data(struct sock *sk,
 			if (fraggap) {
 				skb->csum = skb_copy_and_csum_bits(
 					skb_prev, maxfraglen,
-					data + transhdrlen, fraggap, 0);
+					data + transhdrlen, fraggap);
 				skb_prev->csum = csum_sub(skb_prev->csum,
 							  skb->csum);
 				data += fraggap;
@@ -1405,7 +1405,7 @@ ssize_t	ip_append_page(struct sock *sk, struct flowi4 *fl4, struct page *page,
 				skb->csum = skb_copy_and_csum_bits(skb_prev,
 								   maxfraglen,
 						    skb_transport_header(skb),
-								   fraggap, 0);
+								   fraggap);
 				skb_prev->csum = csum_sub(skb_prev->csum,
 							  skb->csum);
 				pskb_trim_unique(skb_prev, maxfraglen);
diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c
index fc5000370030..2ae42b4e0c1a 100644
--- a/net/ipv6/icmp.c
+++ b/net/ipv6/icmp.c
@@ -314,10 +314,10 @@ static int icmpv6_getfrag(void *from, char *to, int offset, int len, int odd, st
 {
 	struct icmpv6_msg *msg = (struct icmpv6_msg *) from;
 	struct sk_buff *org_skb = msg->skb;
-	__wsum csum = 0;
+	__wsum csum;
 
 	csum = skb_copy_and_csum_bits(org_skb, msg->offset + offset,
-				      to, len, csum);
+				      to, len);
 	skb->csum = csum_block_add(skb->csum, csum, odd);
 	if (!(msg->type & ICMPV6_INFOMSG_MASK))
 		nf_ct_attach(skb, org_skb);
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 8a8c2d0cfcc8..bf9367c2504b 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1613,7 +1613,7 @@ static int __ip6_append_data(struct sock *sk,
 			if (fraggap) {
 				skb->csum = skb_copy_and_csum_bits(
 					skb_prev, maxfraglen,
-					data + transhdrlen, fraggap, 0);
+					data + transhdrlen, fraggap);
 				skb_prev->csum = csum_sub(skb_prev->csum,
 							  skb->csum);
 				data += fraggap;
diff --git a/net/sunrpc/socklib.c b/net/sunrpc/socklib.c
index 3fc8af8bb961..d52313af82bc 100644
--- a/net/sunrpc/socklib.c
+++ b/net/sunrpc/socklib.c
@@ -70,7 +70,7 @@ static size_t xdr_skb_read_and_csum_bits(struct xdr_skb_reader *desc, void *to,
 	if (len > desc->count)
 		len = desc->count;
 	pos = desc->offset;
-	csum2 = skb_copy_and_csum_bits(desc->skb, pos, to, len, 0);
+	csum2 = skb_copy_and_csum_bits(desc->skb, pos, to, len);
 	desc->csum = csum_block_add(desc->csum, csum2, pos);
 	desc->count -= len;
 	desc->offset += len;
-- 
2.11.0

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v2 03/20] icmp_push_reply(): reorder adding the checksum up
  2020-07-24  1:25   ` [PATCH v2 01/20] xtensa: fix access check in csum_and_copy_from_user Al Viro
  2020-07-24  1:25     ` [PATCH v2 02/20] skb_copy_and_csum_bits(): don't bother with the last argument Al Viro
@ 2020-07-24  1:25     ` Al Viro
  2020-07-24  1:25       ` Al Viro
  2020-07-24  1:25     ` [PATCH v2 04/20] unify generic instances of csum_partial_copy_nocheck() Al Viro
                       ` (16 subsequent siblings)
  18 siblings, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-24  1:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

do csum_partial_copy_nocheck() on the first fragment, then
add the rest to it.  Equivalent transformation.

That was the only caller of csum_partial_copy_nocheck() that
might pass it non-zero as the last argument.
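
Why the reordering is equivalent, as a user-space sketch rather than the kernel helpers: ones' complement addition is commutative and associative, so seeding the header sum with the fragments' total gives the same folded result as summing the header from 0 and adding the fragments afterwards.  sketch_csum_add(), sketch_csum_partial(), sketch_fold(), old_order() and new_order() are hypothetical names, and the byte pairing is little-endian for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit ones' complement addition with end-around carry. */
static inline uint32_t sketch_csum_add(uint32_t a, uint32_t b)
{
	uint32_t s = a + b;

	return s + (s < a);
}

/* Sum len bytes as 16-bit words on top of an initial sum. */
static inline uint32_t sketch_csum_partial(const unsigned char *p, size_t len,
					   uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum = sketch_csum_add(sum, (uint32_t)p[i] |
					   ((uint32_t)p[i + 1] << 8));
	if (i < len)
		sum = sketch_csum_add(sum, p[i]);
	return sum;
}

/* Fold to 16 bits (no final complement in this sketch). */
static inline uint16_t sketch_fold(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Old order: add up the queued fragment sums, then use that as the seed. */
static inline uint16_t old_order(const unsigned char *hdr, size_t len,
				 const uint32_t *frag, size_t n)
{
	uint32_t csum = 0;
	size_t i;

	for (i = 0; i < n; i++)
		csum = sketch_csum_add(csum, frag[i]);
	return sketch_fold(sketch_csum_partial(hdr, len, csum));
}

/* New order: sum the header from 0, then add the fragment sums. */
static inline uint16_t new_order(const unsigned char *hdr, size_t len,
				 const uint32_t *frag, size_t n)
{
	uint32_t csum = sketch_csum_partial(hdr, len, 0);
	size_t i;

	for (i = 0; i < n; i++)
		csum = sketch_csum_add(csum, frag[i]);
	return sketch_fold(csum);
}
```

The 32-bit intermediate values may differ between the two orders; only the folded 16-bit results are expected to agree.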

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/ipv4/icmp.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index 62d7a2bfc9a3..f93317157549 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -376,15 +376,15 @@ static void icmp_push_reply(struct icmp_bxm *icmp_param,
 		ip_flush_pending_frames(sk);
 	} else if ((skb = skb_peek(&sk->sk_write_queue)) != NULL) {
 		struct icmphdr *icmph = icmp_hdr(skb);
-		__wsum csum = 0;
+		__wsum csum;
 		struct sk_buff *skb1;
 
+		csum = csum_partial_copy_nocheck((void *)&icmp_param->data,
+						 (char *)icmph,
+						 icmp_param->head_len, 0);
 		skb_queue_walk(&sk->sk_write_queue, skb1) {
 			csum = csum_add(csum, skb1->csum);
 		}
-		csum = csum_partial_copy_nocheck((void *)&icmp_param->data,
-						 (char *)icmph,
-						 icmp_param->head_len, csum);
 		icmph->checksum = csum_fold(csum);
 		skb->ip_summed = CHECKSUM_NONE;
 		ip_push_pending_frames(sk, fl4);
-- 
2.11.0

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v2 04/20] unify generic instances of csum_partial_copy_nocheck()
  2020-07-24  1:25   ` [PATCH v2 01/20] xtensa: fix access check in csum_and_copy_from_user Al Viro
  2020-07-24  1:25     ` [PATCH v2 02/20] skb_copy_and_csum_bits(): don't bother with the last argument Al Viro
  2020-07-24  1:25     ` [PATCH v2 03/20] icmp_push_reply(): reorder adding the checksum up Al Viro
@ 2020-07-24  1:25     ` Al Viro
  2020-07-24  6:41       ` Christoph Hellwig
  2020-07-24  1:25     ` [PATCH v2 05/20] csum_partial_copy_nocheck(): drop the last argument Al Viro
                       ` (15 subsequent siblings)
  18 siblings, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-24  1:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

quite a few architectures have the same csum_partial_copy_nocheck() -
simply memcpy() the data and then return the csum of the copy.

hexagon, parisc, ia64, s390, um: explicitly spelled out that way.

arc, arm64, csky, h8300, m68k/nommu, microblaze, mips/GENERIC_CSUM, nds32,
nios2, openrisc, riscv, unicore32: end up picking the same thing spelled
out in lib/checksum.c (with varying amounts of perversions along the way).

everybody else (alpha, arm, c6x, m68k/mmu, mips/!GENERIC_CSUM, powerpc,
sh, sparc, x86, xtensa) has non-generic variants.  For all except c6x
the declaration is in their asm/checksum.h.  c6x uses the wrapper
from asm-generic/checksum.h that would normally lead to the lib/checksum.c
instance, but in case of c6x we end up using an asm function from arch/c6x
instead.

Screw that mess - have architectures with private instances define
_HAVE_ARCH_CSUM_AND_COPY in their asm/checksum.h and have the default
one right in net/checksum.h conditional on _HAVE_ARCH_CSUM_AND_COPY
*not* defined.
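
The shape of that default, sketched in user space under assumptions: demo_wsum stands in for __wsum, demo_csum_partial() is a simplified stand-in for csum_partial(), and demo_csum_partial_copy_nocheck() mirrors the "memcpy() the data and then return the csum of the copy" description above.  It is not the text added to net/checksum.h.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t demo_wsum;	/* hypothetical stand-in for __wsum */

/* Simplified checksum of a buffer: 16-bit words, end-around carry. */
static inline demo_wsum demo_csum_partial(const void *buf, int len,
					  demo_wsum sum)
{
	const unsigned char *p = buf;
	int i;

	for (i = 0; i + 1 < len; i += 2) {
		uint32_t w = (uint32_t)p[i] | ((uint32_t)p[i + 1] << 8);
		uint32_t s = sum + w;

		sum = s + (s < w);
	}
	if (i < len) {		/* trailing odd byte */
		uint32_t s = sum + p[i];

		sum = s + (s < p[i]);
	}
	return sum;
}

/*
 * An architecture with its own implementation would define
 * _HAVE_ARCH_CSUM_AND_COPY before this point and supply the function
 * itself; everyone else gets the copy-then-checksum default.
 */
#ifndef _HAVE_ARCH_CSUM_AND_COPY
static inline demo_wsum
demo_csum_partial_copy_nocheck(const void *src, void *dst, int len,
			       demo_wsum sum)
{
	memcpy(dst, src, len);
	return demo_csum_partial(dst, len, sum);
}
#endif
```

Keeping the default behind a single macro means an architecture opts out in one place instead of each generic header guessing which variant it should pick.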

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/alpha/include/asm/checksum.h   |  1 +
 arch/arm/include/asm/checksum.h     |  1 +
 arch/c6x/include/asm/checksum.h     |  6 ++++++
 arch/hexagon/include/asm/checksum.h | 11 -----------
 arch/hexagon/lib/checksum.c         | 11 -----------
 arch/ia64/include/asm/checksum.h    |  3 ---
 arch/ia64/lib/csum_partial_copy.c   | 15 ---------------
 arch/m68k/include/asm/checksum.h    |  1 +
 arch/mips/include/asm/checksum.h    |  2 +-
 arch/nios2/include/asm/checksum.h   |  5 -----
 arch/parisc/include/asm/checksum.h  |  8 --------
 arch/parisc/lib/checksum.c          | 17 -----------------
 arch/powerpc/include/asm/checksum.h |  1 +
 arch/s390/include/asm/checksum.h    |  7 -------
 arch/sh/include/asm/checksum_32.h   |  1 +
 arch/sparc/include/asm/checksum.h   |  1 +
 arch/x86/include/asm/checksum.h     |  1 +
 arch/x86/um/asm/checksum.h          | 16 ----------------
 arch/xtensa/include/asm/checksum.h  |  1 +
 include/asm-generic/checksum.h      | 14 --------------
 include/net/checksum.h              |  9 +++++++++
 lib/checksum.c                      | 11 -----------
 22 files changed, 24 insertions(+), 119 deletions(-)

diff --git a/arch/alpha/include/asm/checksum.h b/arch/alpha/include/asm/checksum.h
index 0eac81624d01..7e8e4fa4362d 100644
--- a/arch/alpha/include/asm/checksum.h
+++ b/arch/alpha/include/asm/checksum.h
@@ -42,6 +42,7 @@ extern __wsum csum_partial(const void *buff, int len, __wsum sum);
  * better 64-bit) boundary
  */
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
+#define _HAVE_ARCH_CSUM_AND_COPY
 __wsum csum_and_copy_from_user(const void __user *src, void *dst, int len, __wsum sum, int *errp);
 
 __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum);
diff --git a/arch/arm/include/asm/checksum.h b/arch/arm/include/asm/checksum.h
index ed6073fee338..53f769508540 100644
--- a/arch/arm/include/asm/checksum.h
+++ b/arch/arm/include/asm/checksum.h
@@ -41,6 +41,7 @@ __wsum
 csum_partial_copy_from_user(const void __user *src, void *dst, int len, __wsum sum, int *err_ptr);
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
+#define _HAVE_ARCH_CSUM_AND_COPY
 static inline
 __wsum csum_and_copy_from_user (const void __user *src, void *dst,
 				      int len, __wsum sum, int *err_ptr)
diff --git a/arch/c6x/include/asm/checksum.h b/arch/c6x/include/asm/checksum.h
index 36770b8308d9..facdd636af85 100644
--- a/arch/c6x/include/asm/checksum.h
+++ b/arch/c6x/include/asm/checksum.h
@@ -26,6 +26,12 @@ csum_tcpudp_nofold(__be32 saddr, __be32 daddr, __u32 len,
 }
 #define csum_tcpudp_nofold csum_tcpudp_nofold
 
+extern __wsum csum_partial_copy(const void *src, void *dst, int len, __wsum sum);
+
+#define _HAVE_ARCH_CSUM_AND_COPY
+#define csum_partial_copy_nocheck(src, dst, len, sum)  \
+	csum_partial_copy((src), (dst), (len), (sum))
+
 #include <asm-generic/checksum.h>
 
 #endif /* _ASM_C6X_CHECKSUM_H */
diff --git a/arch/hexagon/include/asm/checksum.h b/arch/hexagon/include/asm/checksum.h
index a5c42f4614c1..4bc6ad96c4c5 100644
--- a/arch/hexagon/include/asm/checksum.h
+++ b/arch/hexagon/include/asm/checksum.h
@@ -10,17 +10,6 @@
 unsigned int do_csum(const void *voidptr, int len);
 
 /*
- * the same as csum_partial, but copies from src while it
- * checksums
- *
- * here even more important to align src and dst on a 32-bit (or even
- * better 64-bit) boundary
- */
-#define csum_partial_copy_nocheck csum_partial_copy_nocheck
-__wsum csum_partial_copy_nocheck(const void *src, void *dst,
-					int len, __wsum sum);
-
-/*
  * computes the checksum of the TCP/UDP pseudo-header
  * returns a 16-bit checksum, already complemented
  */
diff --git a/arch/hexagon/lib/checksum.c b/arch/hexagon/lib/checksum.c
index c4a6b72d97de..ba50822a0800 100644
--- a/arch/hexagon/lib/checksum.c
+++ b/arch/hexagon/lib/checksum.c
@@ -176,14 +176,3 @@ unsigned int do_csum(const void *voidptr, int len)
 
 	return 0xFFFF & sum0;
 }
-
-/*
- * copy from ds while checksumming, otherwise like csum_partial
- */
-__wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
-{
-	memcpy(dst, src, len);
-	return csum_partial(dst, len, sum);
-}
-EXPORT_SYMBOL(csum_partial_copy_nocheck);
diff --git a/arch/ia64/include/asm/checksum.h b/arch/ia64/include/asm/checksum.h
index 2a1c64629cdc..f3026213aa32 100644
--- a/arch/ia64/include/asm/checksum.h
+++ b/arch/ia64/include/asm/checksum.h
@@ -37,9 +37,6 @@ extern __wsum csum_tcpudp_nofold(__be32 saddr, __be32 daddr,
  */
 extern __wsum csum_partial(const void *buff, int len, __wsum sum);
 
-extern __wsum csum_partial_copy_nocheck(const void *src, void *dst,
-					       int len, __wsum sum);
-
 /*
  * This routine is used for miscellaneous IP-like checksums, mainly in
  * icmp.c
diff --git a/arch/ia64/lib/csum_partial_copy.c b/arch/ia64/lib/csum_partial_copy.c
index 6e82e0be8040..917e3138b277 100644
--- a/arch/ia64/lib/csum_partial_copy.c
+++ b/arch/ia64/lib/csum_partial_copy.c
@@ -96,18 +96,3 @@ unsigned long do_csum_c(const unsigned char * buff, int len, unsigned int psum)
 out:
 	return result;
 }
-
-/*
- * XXX Fixme
- *
- * This is very ugly but temporary. THIS NEEDS SERIOUS ENHANCEMENTS.
- * But it's very tricky to get right even in C.
- */
-__wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
-{
-	memcpy(dst, src, len);
-	return csum_partial(dst, len, sum);
-}
-
-EXPORT_SYMBOL(csum_partial_copy_nocheck);
diff --git a/arch/m68k/include/asm/checksum.h b/arch/m68k/include/asm/checksum.h
index 3f2c15d6f18c..ab16881d84cb 100644
--- a/arch/m68k/include/asm/checksum.h
+++ b/arch/m68k/include/asm/checksum.h
@@ -31,6 +31,7 @@ __wsum csum_partial(const void *buff, int len, __wsum sum);
  */
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
+#define _HAVE_ARCH_CSUM_AND_COPY
 extern __wsum csum_and_copy_from_user(const void __user *src,
 						void *dst,
 						int len, __wsum sum,
diff --git a/arch/mips/include/asm/checksum.h b/arch/mips/include/asm/checksum.h
index dcebaaf8c862..b771621ec3c5 100644
--- a/arch/mips/include/asm/checksum.h
+++ b/arch/mips/include/asm/checksum.h
@@ -101,9 +101,9 @@ __wsum csum_and_copy_to_user(const void *src, void __user *dst, int len,
  * the same as csum_partial, but copies from user space (but on MIPS
  * we have just one address space, so this is identical to the above)
  */
+#define _HAVE_ARCH_CSUM_AND_COPY
 __wsum csum_partial_copy_nocheck(const void *src, void *dst,
 				       int len, __wsum sum);
-#define csum_partial_copy_nocheck csum_partial_copy_nocheck
 
 /*
  *	Fold a partial checksum without adding pseudo headers
diff --git a/arch/nios2/include/asm/checksum.h b/arch/nios2/include/asm/checksum.h
index ec39698d3bea..69004e07a1ba 100644
--- a/arch/nios2/include/asm/checksum.h
+++ b/arch/nios2/include/asm/checksum.h
@@ -12,11 +12,6 @@
 
 /* Take these from lib/checksum.c */
 extern __wsum csum_partial(const void *buff, int len, __wsum sum);
-extern __wsum csum_partial_copy(const void *src, void *dst, int len,
-				__wsum sum);
-#define csum_partial_copy_nocheck(src, dst, len, sum)	\
-	csum_partial_copy((src), (dst), (len), (sum))
-
 extern __sum16 ip_fast_csum(const void *iph, unsigned int ihl);
 extern __sum16 ip_compute_csum(const void *buff, int len);
 
diff --git a/arch/parisc/include/asm/checksum.h b/arch/parisc/include/asm/checksum.h
index fe8c63b2d2c3..522cd574c068 100644
--- a/arch/parisc/include/asm/checksum.h
+++ b/arch/parisc/include/asm/checksum.h
@@ -19,14 +19,6 @@
 extern __wsum csum_partial(const void *, int, __wsum);
 
 /*
- * The same as csum_partial, but copies from src while it checksums.
- *
- * Here even more important to align src and dst on a 32-bit (or even
- * better 64-bit) boundary
- */
-extern __wsum csum_partial_copy_nocheck(const void *, void *, int, __wsum);
-
-/*
  *	Optimized for IP headers, which always checksum on 4 octet boundaries.
  *
  *	Written by Randolph Chung <tausq@debian.org>, and then mucked with by
diff --git a/arch/parisc/lib/checksum.c b/arch/parisc/lib/checksum.c
index c6f161583549..4818f3db84a5 100644
--- a/arch/parisc/lib/checksum.c
+++ b/arch/parisc/lib/checksum.c
@@ -106,20 +106,3 @@ __wsum csum_partial(const void *buff, int len, __wsum sum)
 }
 
 EXPORT_SYMBOL(csum_partial);
-
-/*
- * copy while checksumming, otherwise like csum_partial
- */
-__wsum csum_partial_copy_nocheck(const void *src, void *dst,
-				       int len, __wsum sum)
-{
-	/*
-	 * It's 2:30 am and I don't feel like doing it real ...
-	 * This is lots slower than the real thing (tm)
-	 */
-	sum = csum_partial(src, len, sum);
-	memcpy(dst, src, len);
-
-	return sum;
-}
-EXPORT_SYMBOL(csum_partial_copy_nocheck);
diff --git a/arch/powerpc/include/asm/checksum.h b/arch/powerpc/include/asm/checksum.h
index 9cce06194dcc..d75fc5bf8f37 100644
--- a/arch/powerpc/include/asm/checksum.h
+++ b/arch/powerpc/include/asm/checksum.h
@@ -29,6 +29,7 @@ extern __wsum csum_and_copy_from_user(const void __user *src, void *dst,
 extern __wsum csum_and_copy_to_user(const void *src, void __user *dst,
 				    int len, __wsum sum, int *err_ptr);
 
+#define _HAVE_ARCH_CSUM_AND_COPY
 #define csum_partial_copy_nocheck(src, dst, len, sum)   \
         csum_partial_copy_generic((src), (dst), (len), (sum), NULL, NULL)
 
diff --git a/arch/s390/include/asm/checksum.h b/arch/s390/include/asm/checksum.h
index 6d01c96aeb5c..6813bfa1eeb7 100644
--- a/arch/s390/include/asm/checksum.h
+++ b/arch/s390/include/asm/checksum.h
@@ -39,13 +39,6 @@ csum_partial(const void *buff, int len, __wsum sum)
 	return sum;
 }
 
-static inline __wsum
-csum_partial_copy_nocheck (const void *src, void *dst, int len, __wsum sum)
-{
-        memcpy(dst,src,len);
-	return csum_partial(dst, len, sum);
-}
-
 /*
  *      Fold a partial checksum without adding pseudo headers
  */
diff --git a/arch/sh/include/asm/checksum_32.h b/arch/sh/include/asm/checksum_32.h
index 91571a42e44e..87269642d78d 100644
--- a/arch/sh/include/asm/checksum_32.h
+++ b/arch/sh/include/asm/checksum_32.h
@@ -34,6 +34,7 @@ asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst,
 					    int len, __wsum sum,
 					    int *src_err_ptr, int *dst_err_ptr);
 
+#define _HAVE_ARCH_CSUM_AND_COPY
 /*
  *	Note: when you get a NULL pointer exception here this means someone
  *	passed in an incorrect kernel address to one of these functions.
diff --git a/arch/sparc/include/asm/checksum.h b/arch/sparc/include/asm/checksum.h
index a6256cb6fc5c..deb4fe5aeafd 100644
--- a/arch/sparc/include/asm/checksum.h
+++ b/arch/sparc/include/asm/checksum.h
@@ -1,6 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 #ifndef ___ASM_SPARC_CHECKSUM_H
 #define ___ASM_SPARC_CHECKSUM_H
+#define _HAVE_ARCH_CSUM_AND_COPY
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 #if defined(__sparc__) && defined(__arch64__)
 #include <asm/checksum_64.h>
diff --git a/arch/x86/include/asm/checksum.h b/arch/x86/include/asm/checksum.h
index 0ada98d5d09f..bca625a60186 100644
--- a/arch/x86/include/asm/checksum.h
+++ b/arch/x86/include/asm/checksum.h
@@ -1,6 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 #define  _HAVE_ARCH_COPY_AND_CSUM_FROM_USER 1
 #define HAVE_CSUM_COPY_USER
+#define _HAVE_ARCH_CSUM_AND_COPY
 #ifdef CONFIG_X86_32
 # include <asm/checksum_32.h>
 #else
diff --git a/arch/x86/um/asm/checksum.h b/arch/x86/um/asm/checksum.h
index ff6bba2c8ab6..b07824500363 100644
--- a/arch/x86/um/asm/checksum.h
+++ b/arch/x86/um/asm/checksum.h
@@ -20,22 +20,6 @@
  */
 extern __wsum csum_partial(const void *buff, int len, __wsum sum);
 
-/*
- *	Note: when you get a NULL pointer exception here this means someone
- *	passed in an incorrect kernel address to one of these functions.
- *
- *	If you use these functions directly please don't forget the
- *	access_ok().
- */
-
-static __inline__
-__wsum csum_partial_copy_nocheck(const void *src, void *dst,
-				       int len, __wsum sum)
-{
-	memcpy(dst, src, len);
-	return csum_partial(dst, len, sum);
-}
-
 /**
  * csum_fold - Fold and invert a 32bit checksum.
  * sum: 32bit unfolded sum
diff --git a/arch/xtensa/include/asm/checksum.h b/arch/xtensa/include/asm/checksum.h
index 243a5fe79d3c..0c879099977f 100644
--- a/arch/xtensa/include/asm/checksum.h
+++ b/arch/xtensa/include/asm/checksum.h
@@ -41,6 +41,7 @@ asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst,
 					    int len, __wsum sum,
 					    int *src_err_ptr, int *dst_err_ptr);
 
+#define _HAVE_ARCH_CSUM_AND_COPY
 /*
  *	Note: when you get a NULL pointer exception here this means someone
  *	passed in an incorrect kernel address to one of these functions.
diff --git a/include/asm-generic/checksum.h b/include/asm-generic/checksum.h
index 5a80f8e54300..43e18db89c14 100644
--- a/include/asm-generic/checksum.h
+++ b/include/asm-generic/checksum.h
@@ -16,20 +16,6 @@
  */
 extern __wsum csum_partial(const void *buff, int len, __wsum sum);
 
-/*
- * the same as csum_partial, but copies from src while it
- * checksums
- *
- * here even more important to align src and dst on a 32-bit (or even
- * better 64-bit) boundary
- */
-extern __wsum csum_partial_copy(const void *src, void *dst, int len, __wsum sum);
-
-#ifndef csum_partial_copy_nocheck
-#define csum_partial_copy_nocheck(src, dst, len, sum)	\
-	csum_partial_copy((src), (dst), (len), (sum))
-#endif
-
 #ifndef ip_fast_csum
 /*
  * This is a version of ip_compute_csum() optimized for IP headers,
diff --git a/include/net/checksum.h b/include/net/checksum.h
index 46754ba9d7b7..db9d02b5f88a 100644
--- a/include/net/checksum.h
+++ b/include/net/checksum.h
@@ -47,6 +47,15 @@ static __inline__ __wsum csum_and_copy_to_user
 }
 #endif
 
+#ifndef _HAVE_ARCH_CSUM_AND_COPY
+static inline __wsum
+csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
+{
+	memcpy(dst, src, len);
+	return csum_partial(dst, len, sum);
+}
+#endif
+
 #ifndef HAVE_ARCH_CSUM_ADD
 static inline __wsum csum_add(__wsum csum, __wsum addend)
 {
diff --git a/lib/checksum.c b/lib/checksum.c
index 7ac65a0000ff..6860d6b05a17 100644
--- a/lib/checksum.c
+++ b/lib/checksum.c
@@ -145,17 +145,6 @@ __sum16 ip_compute_csum(const void *buff, int len)
 }
 EXPORT_SYMBOL(ip_compute_csum);
 
-/*
- * copy from ds while checksumming, otherwise like csum_partial
- */
-__wsum
-csum_partial_copy(const void *src, void *dst, int len, __wsum sum)
-{
-	memcpy(dst, src, len);
-	return csum_partial(dst, len, sum);
-}
-EXPORT_SYMBOL(csum_partial_copy);
-
 #ifndef csum_tcpudp_nofold
 static inline u32 from64to32(u64 x)
 {
-- 
2.11.0

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v2 05/20] csum_partial_copy_nocheck(): drop the last argument
  2020-07-24  1:25   ` [PATCH v2 01/20] xtensa: fix access check in csum_and_copy_from_user Al Viro
                       ` (2 preceding siblings ...)
  2020-07-24  1:25     ` [PATCH v2 04/20] unify generic instances of csum_partial_copy_nocheck() Al Viro
@ 2020-07-24  1:25     ` Al Viro
  2020-07-24  1:25       ` Al Viro
  2020-07-24  1:25     ` [PATCH v2 06/20] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum Al Viro
                       ` (14 subsequent siblings)
  18 siblings, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-24  1:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

Every caller passes 0 as the initial sum, so drop the argument.  Note
that we could use ~0U as well - the result would be the same modulo
0xffff; later in the series we'll make use of that wherever convenient.
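
To make the "same modulo 0xffff" remark concrete, here is a minimal
userspace sketch.  It is not the kernel code: toy_csum_partial() and
toy_check_initial_sum() are hypothetical names, and the big-endian word
order is an assumption made only for illustration.  The point is that
0xffffffff == 65537 * 0xffff, so starting from ~0U adds a multiple of
0xffff to the sum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for csum_partial(); not the kernel code.
 * Adds the buffer as 16-bit big-endian words to a 32-bit running sum,
 * wrapping carries around (ones' complement addition).
 */
static uint32_t toy_csum_partial(const uint8_t *buf, size_t len, uint32_t sum)
{
	uint64_t acc = sum;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		acc += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (len & 1)
		acc += (uint32_t)buf[len - 1] << 8;	/* pad odd tail with 0 */

	/* fold the 64-bit accumulator back to 32 bits, end-around carry */
	acc = (acc & 0xffffffffu) + (acc >> 32);
	acc = (acc & 0xffffffffu) + (acc >> 32);
	return (uint32_t)acc;
}

/* Initial sums 0 and ~0U must agree modulo 0xffff for any buffer. */
static void toy_check_initial_sum(const uint8_t *buf, size_t len)
{
	uint32_t from_zero = toy_csum_partial(buf, len, 0);
	uint32_t from_ones = toy_csum_partial(buf, len, ~0U);

	assert(from_zero % 0xffff == from_ones % 0xffff);
}
```

The two 32-bit values need not be bit-identical (an all-zero buffer
gives 0 in one case and 0xffffffff in the other); only the residue
modulo 0xffff is meaningful, as with any __wsum.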

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/alpha/include/asm/checksum.h    | 2 +-
 arch/alpha/lib/csum_partial_copy.c   | 4 ++--
 arch/arm/include/asm/checksum.h      | 2 +-
 arch/arm/lib/csumpartialcopy.S       | 5 +++--
 arch/m68k/include/asm/checksum.h     | 3 +--
 arch/m68k/lib/checksum.c             | 3 ++-
 arch/mips/include/asm/checksum.h     | 7 +++++--
 arch/mips/lib/csum_partial.S         | 4 ++--
 arch/powerpc/include/asm/checksum.h  | 4 ++--
 arch/sh/include/asm/checksum_32.h    | 5 ++---
 arch/sparc/include/asm/checksum_32.h | 4 ++--
 arch/sparc/include/asm/checksum_64.h | 8 ++++++--
 arch/sparc/lib/csum_copy.S           | 2 +-
 arch/x86/include/asm/checksum_32.h   | 5 ++---
 arch/x86/include/asm/checksum_64.h   | 3 +--
 arch/x86/lib/csum-wrappers_64.c      | 4 ++--
 arch/xtensa/include/asm/checksum.h   | 5 ++---
 drivers/net/ethernet/3com/typhoon.c  | 3 +--
 include/net/checksum.h               | 4 ++--
 lib/iov_iter.c                       | 2 +-
 net/core/skbuff.c                    | 4 ++--
 net/ipv4/icmp.c                      | 2 +-
 net/ipv4/ip_output.c                 | 2 +-
 net/ipv4/raw.c                       | 2 +-
 net/ipv6/raw.c                       | 2 +-
 25 files changed, 47 insertions(+), 44 deletions(-)

diff --git a/arch/alpha/include/asm/checksum.h b/arch/alpha/include/asm/checksum.h
index 7e8e4fa4362d..84f9faea864a 100644
--- a/arch/alpha/include/asm/checksum.h
+++ b/arch/alpha/include/asm/checksum.h
@@ -45,7 +45,7 @@ extern __wsum csum_partial(const void *buff, int len, __wsum sum);
 #define _HAVE_ARCH_CSUM_AND_COPY
 __wsum csum_and_copy_from_user(const void __user *src, void *dst, int len, __wsum sum, int *errp);
 
-__wsum csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum);
+__wsum csum_partial_copy_nocheck(const void *src, void *dst, int len);
 
 
 /*
diff --git a/arch/alpha/lib/csum_partial_copy.c b/arch/alpha/lib/csum_partial_copy.c
index af1dad74e933..f363dc89fcbe 100644
--- a/arch/alpha/lib/csum_partial_copy.c
+++ b/arch/alpha/lib/csum_partial_copy.c
@@ -372,13 +372,13 @@ csum_and_copy_from_user(const void __user *src, void *dst, int len,
 EXPORT_SYMBOL(csum_and_copy_from_user);
 
 __wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
+csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
 	__wsum checksum;
 	mm_segment_t oldfs = get_fs();
 	set_fs(KERNEL_DS);
 	checksum = csum_and_copy_from_user((__force const void __user *)src,
-						dst, len, sum, NULL);
+						dst, len, 0, NULL);
 	set_fs(oldfs);
 	return checksum;
 }
diff --git a/arch/arm/include/asm/checksum.h b/arch/arm/include/asm/checksum.h
index 53f769508540..7612b2bd4e9b 100644
--- a/arch/arm/include/asm/checksum.h
+++ b/arch/arm/include/asm/checksum.h
@@ -35,7 +35,7 @@ __wsum csum_partial(const void *buff, int len, __wsum sum);
  */
 
 __wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum);
+csum_partial_copy_nocheck(const void *src, void *dst, int len);
 
 __wsum
 csum_partial_copy_from_user(const void __user *src, void *dst, int len, __wsum sum, int *err_ptr);
diff --git a/arch/arm/lib/csumpartialcopy.S b/arch/arm/lib/csumpartialcopy.S
index 184d97254a7a..aab914fbc86b 100644
--- a/arch/arm/lib/csumpartialcopy.S
+++ b/arch/arm/lib/csumpartialcopy.S
@@ -9,13 +9,14 @@
 
 		.text
 
-/* Function: __u32 csum_partial_copy_nocheck(const char *src, char *dst, int len, __u32 sum)
- * Params  : r0 = src, r1 = dst, r2 = len, r3 = checksum
+/* Function: __u32 csum_partial_copy_nocheck(const char *src, char *dst, int len)
+ * Params  : r0 = src, r1 = dst, r2 = len
  * Returns : r0 = new checksum
  */
 
 		.macro	save_regs
 		stmfd	sp!, {r1, r4 - r8, lr}
+		mov	r3, #0
 		.endm
 
 		.macro	load_regs
diff --git a/arch/m68k/include/asm/checksum.h b/arch/m68k/include/asm/checksum.h
index ab16881d84cb..d5e74c64b6cd 100644
--- a/arch/m68k/include/asm/checksum.h
+++ b/arch/m68k/include/asm/checksum.h
@@ -38,8 +38,7 @@ extern __wsum csum_and_copy_from_user(const void __user *src,
 						int *csum_err);
 
 extern __wsum csum_partial_copy_nocheck(const void *src,
-					      void *dst, int len,
-					      __wsum sum);
+					      void *dst, int len);
 
 /*
  *	This is a version of ip_fast_csum() optimized for IP headers,
diff --git a/arch/m68k/lib/checksum.c b/arch/m68k/lib/checksum.c
index 31797be9a3dc..86ddd2ee187d 100644
--- a/arch/m68k/lib/checksum.c
+++ b/arch/m68k/lib/checksum.c
@@ -324,9 +324,10 @@ EXPORT_SYMBOL(csum_and_copy_from_user);
  */
 
 __wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
+csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
 	unsigned long tmp1, tmp2;
+	__wsum sum = 0;
 	__asm__("movel %2,%4\n\t"
 		"btst #1,%4\n\t"	/* Check alignment */
 		"jeq 2f\n\t"
diff --git a/arch/mips/include/asm/checksum.h b/arch/mips/include/asm/checksum.h
index b771621ec3c5..63dfe08262b1 100644
--- a/arch/mips/include/asm/checksum.h
+++ b/arch/mips/include/asm/checksum.h
@@ -102,8 +102,11 @@ __wsum csum_and_copy_to_user(const void *src, void __user *dst, int len,
  * we have just one address space, so this is identical to the above)
  */
 #define _HAVE_ARCH_CSUM_AND_COPY
-__wsum csum_partial_copy_nocheck(const void *src, void *dst,
-				       int len, __wsum sum);
+__wsum __csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum);
+static inline __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
+{
+	return __csum_partial_copy_nocheck(src, dst, len, 0);
+}
 
 /*
  *	Fold a partial checksum without adding pseudo headers
diff --git a/arch/mips/lib/csum_partial.S b/arch/mips/lib/csum_partial.S
index 87fda0713b84..8d70855b0914 100644
--- a/arch/mips/lib/csum_partial.S
+++ b/arch/mips/lib/csum_partial.S
@@ -462,8 +462,8 @@ EXPORT_SYMBOL(csum_partial)
 	lw	errptr, 16(sp)
 #endif
 	.if \__nocheck == 1
-	FEXPORT(csum_partial_copy_nocheck)
-	EXPORT_SYMBOL(csum_partial_copy_nocheck)
+	FEXPORT(__csum_partial_copy_nocheck)
+	EXPORT_SYMBOL(__csum_partial_copy_nocheck)
 	.endif
 	move	sum, zero
 	move	odd, zero
diff --git a/arch/powerpc/include/asm/checksum.h b/arch/powerpc/include/asm/checksum.h
index d75fc5bf8f37..64299785f639 100644
--- a/arch/powerpc/include/asm/checksum.h
+++ b/arch/powerpc/include/asm/checksum.h
@@ -30,8 +30,8 @@ extern __wsum csum_and_copy_to_user(const void *src, void __user *dst,
 				    int len, __wsum sum, int *err_ptr);
 
 #define _HAVE_ARCH_CSUM_AND_COPY
-#define csum_partial_copy_nocheck(src, dst, len, sum)   \
-        csum_partial_copy_generic((src), (dst), (len), (sum), NULL, NULL)
+#define csum_partial_copy_nocheck(src, dst, len)   \
+        csum_partial_copy_generic((src), (dst), (len), 0, NULL, NULL)
 
 
 /*
diff --git a/arch/sh/include/asm/checksum_32.h b/arch/sh/include/asm/checksum_32.h
index 87269642d78d..e8bf84d3b843 100644
--- a/arch/sh/include/asm/checksum_32.h
+++ b/arch/sh/include/asm/checksum_32.h
@@ -43,10 +43,9 @@ asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst,
  *	access_ok().
  */
 static inline
-__wsum csum_partial_copy_nocheck(const void *src, void *dst,
-				 int len, __wsum sum)
+__wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	return csum_partial_copy_generic(src, dst, len, sum, NULL, NULL);
+	return csum_partial_copy_generic(src, dst, len, 0, NULL, NULL);
 }
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
diff --git a/arch/sparc/include/asm/checksum_32.h b/arch/sparc/include/asm/checksum_32.h
index 479a0b812af5..d21d114436ba 100644
--- a/arch/sparc/include/asm/checksum_32.h
+++ b/arch/sparc/include/asm/checksum_32.h
@@ -42,7 +42,7 @@ __wsum csum_partial(const void *buff, int len, __wsum sum);
 unsigned int __csum_partial_copy_sparc_generic (const unsigned char *, unsigned char *);
 
 static inline __wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
+csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
 	register unsigned int ret asm("o0") = (unsigned int)src;
 	register char *d asm("o1") = dst;
@@ -52,7 +52,7 @@ csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
 		"call __csum_partial_copy_sparc_generic\n\t"
 		" mov %6, %%g7\n"
 	: "=&r" (ret), "=&r" (d), "=&r" (l)
-	: "0" (ret), "1" (d), "2" (l), "r" (sum)
+	: "0" (ret), "1" (d), "2" (l), "r" (0)
 	: "o2", "o3", "o4", "o5", "o7",
 	  "g2", "g3", "g4", "g5", "g7",
 	  "memory", "cc");
diff --git a/arch/sparc/include/asm/checksum_64.h b/arch/sparc/include/asm/checksum_64.h
index 0fa4433f5662..7aebdbe3ac96 100644
--- a/arch/sparc/include/asm/checksum_64.h
+++ b/arch/sparc/include/asm/checksum_64.h
@@ -38,8 +38,12 @@ __wsum csum_partial(const void * buff, int len, __wsum sum);
  * here even more important to align src and dst on a 32-bit (or even
  * better 64-bit) boundary
  */
-__wsum csum_partial_copy_nocheck(const void *src, void *dst,
-				 int len, __wsum sum);
+__wsum __csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum);
+
+static inline __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
+{
+	return __csum_partial_copy_nocheck(src, dst, len, 0);
+}
 
 long __csum_partial_copy_from_user(const void __user *src,
 				   void *dst, int len,
diff --git a/arch/sparc/lib/csum_copy.S b/arch/sparc/lib/csum_copy.S
index 26c644ba3ecb..72c900d21b12 100644
--- a/arch/sparc/lib/csum_copy.S
+++ b/arch/sparc/lib/csum_copy.S
@@ -33,7 +33,7 @@
 #endif
 
 #ifndef FUNC_NAME
-#define FUNC_NAME	csum_partial_copy_nocheck
+#define FUNC_NAME	__csum_partial_copy_nocheck
 #endif
 
 	.register	%g2, #scratch
diff --git a/arch/x86/include/asm/checksum_32.h b/arch/x86/include/asm/checksum_32.h
index 11624c8a9d8d..137a3033edcc 100644
--- a/arch/x86/include/asm/checksum_32.h
+++ b/arch/x86/include/asm/checksum_32.h
@@ -38,10 +38,9 @@ asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst,
  *	If you use these functions directly please don't forget the
  *	access_ok().
  */
-static inline __wsum csum_partial_copy_nocheck(const void *src, void *dst,
-					       int len, __wsum sum)
+static inline __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	return csum_partial_copy_generic(src, dst, len, sum, NULL, NULL);
+	return csum_partial_copy_generic(src, dst, len, 0, NULL, NULL);
 }
 
 static inline __wsum csum_and_copy_from_user(const void __user *src,
diff --git a/arch/x86/include/asm/checksum_64.h b/arch/x86/include/asm/checksum_64.h
index 0a289b87e872..5339f5dfc776 100644
--- a/arch/x86/include/asm/checksum_64.h
+++ b/arch/x86/include/asm/checksum_64.h
@@ -139,8 +139,7 @@ extern __wsum csum_and_copy_from_user(const void __user *src, void *dst,
 					  int len, __wsum isum, int *errp);
 extern __wsum csum_and_copy_to_user(const void *src, void __user *dst,
 					int len, __wsum isum, int *errp);
-extern __wsum csum_partial_copy_nocheck(const void *src, void *dst,
-					int len, __wsum sum);
+extern __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len);
 
 /**
  * ip_compute_csum - Compute an 16bit IP checksum.
diff --git a/arch/x86/lib/csum-wrappers_64.c b/arch/x86/lib/csum-wrappers_64.c
index ee63d7576fd2..245f929a1c2c 100644
--- a/arch/x86/lib/csum-wrappers_64.c
+++ b/arch/x86/lib/csum-wrappers_64.c
@@ -129,9 +129,9 @@ EXPORT_SYMBOL(csum_and_copy_to_user);
  * Returns an 32bit unfolded checksum of the buffer.
  */
 __wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
+csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	return csum_partial_copy_generic(src, dst, len, sum, NULL, NULL);
+	return csum_partial_copy_generic(src, dst, len, 0, NULL, NULL);
 }
 EXPORT_SYMBOL(csum_partial_copy_nocheck);
 
diff --git a/arch/xtensa/include/asm/checksum.h b/arch/xtensa/include/asm/checksum.h
index 0c879099977f..dc09448935bf 100644
--- a/arch/xtensa/include/asm/checksum.h
+++ b/arch/xtensa/include/asm/checksum.h
@@ -47,10 +47,9 @@ asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst,
  *	passed in an incorrect kernel address to one of these functions.
  */
 static inline
-__wsum csum_partial_copy_nocheck(const void *src, void *dst,
-					int len, __wsum sum)
+__wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	return csum_partial_copy_generic(src, dst, len, sum, NULL, NULL);
+	return csum_partial_copy_generic(src, dst, len, 0, NULL, NULL);
 }
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
diff --git a/drivers/net/ethernet/3com/typhoon.c b/drivers/net/ethernet/3com/typhoon.c
index 5ed33c2c4742..00c2e7143555 100644
--- a/drivers/net/ethernet/3com/typhoon.c
+++ b/drivers/net/ethernet/3com/typhoon.c
@@ -1419,8 +1419,7 @@ typhoon_download_firmware(struct typhoon *tp)
 			 * the checksum, we can do this once, at the end.
 			 */
 			csum = csum_fold(csum_partial_copy_nocheck(image_data,
-								   dpage, len,
-								   0));
+								   dpage, len));
 
 			iowrite32(len, ioaddr + TYPHOON_REG_BOOT_LENGTH);
 			iowrite32(le16_to_cpu((__force __le16)csum),
diff --git a/include/net/checksum.h b/include/net/checksum.h
index db9d02b5f88a..1029191986e3 100644
--- a/include/net/checksum.h
+++ b/include/net/checksum.h
@@ -49,10 +49,10 @@ static __inline__ __wsum csum_and_copy_to_user
 
 #ifndef _HAVE_ARCH_CSUM_AND_COPY
 static inline __wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
+csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
 	memcpy(dst, src, len);
-	return csum_partial(dst, len, sum);
+	return csum_partial(dst, len, 0);
 }
 #endif
 
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index bf538c2bec77..7405922caaec 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -580,7 +580,7 @@ static size_t copy_pipe_to_iter(const void *addr, size_t bytes,
 static __wsum csum_and_memcpy(void *to, const void *from, size_t len,
 			      __wsum sum, size_t off)
 {
-	__wsum next = csum_partial_copy_nocheck(from, to, len, 0);
+	__wsum next = csum_partial_copy_nocheck(from, to, len);
 	return csum_block_add(sum, next, off);
 }
 
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 9c0918651445..6d51fb4312cd 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2736,7 +2736,7 @@ __wsum skb_copy_and_csum_bits(const struct sk_buff *skb, int offset,
 		if (copy > len)
 			copy = len;
 		csum = csum_partial_copy_nocheck(skb->data + offset, to,
-						 copy, 0);
+						 copy);
 		if ((len -= copy) == 0)
 			return csum;
 		offset += copy;
@@ -2766,7 +2766,7 @@ __wsum skb_copy_and_csum_bits(const struct sk_buff *skb, int offset,
 				vaddr = kmap_atomic(p);
 				csum2 = csum_partial_copy_nocheck(vaddr + p_off,
 								  to + copied,
-								  p_len, 0);
+								  p_len);
 				kunmap_atomic(vaddr);
 				csum = csum_block_add(csum, csum2, pos);
 				pos += p_len;
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index f93317157549..47a46279ae4c 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -381,7 +381,7 @@ static void icmp_push_reply(struct icmp_bxm *icmp_param,
 
 		csum = csum_partial_copy_nocheck((void *)&icmp_param->data,
 						 (char *)icmph,
-						 icmp_param->head_len, 0);
+						 icmp_param->head_len);
 		skb_queue_walk(&sk->sk_write_queue, skb1) {
 			csum = csum_add(csum, skb1->csum);
 		}
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 7fd164754519..f835136b8727 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1642,7 +1642,7 @@ static int ip_reply_glue_bits(void *dptr, char *to, int offset,
 {
 	__wsum csum;
 
-	csum = csum_partial_copy_nocheck(dptr+offset, to, len, 0);
+	csum = csum_partial_copy_nocheck(dptr+offset, to, len);
 	skb->csum = csum_block_add(skb->csum, csum, odd);
 	return 0;
 }
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 47665919048f..112f983f85fa 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -478,7 +478,7 @@ static int raw_getfrag(void *from, char *to, int offset, int len, int odd,
 			skb->csum = csum_block_add(
 				skb->csum,
 				csum_partial_copy_nocheck(rfv->hdr.c + offset,
-							  to, copy, 0),
+							  to, copy),
 				odd);
 
 		odd = 0;
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 8ef5a7b30524..b1df7e5fb0a8 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -746,7 +746,7 @@ static int raw6_getfrag(void *from, char *to, int offset, int len, int odd,
 			skb->csum = csum_block_add(
 				skb->csum,
 				csum_partial_copy_nocheck(rfv->c + offset,
-							  to, copy, 0),
+							  to, copy),
 				odd);
 
 		odd = 0;
-- 
2.11.0
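
The call sites converted above never relied on the initial-sum argument;
they combine the per-chunk result with the running sum themselves via
csum_block_add().  A rough userspace sketch of that pattern follows.
All sketch_* names are hypothetical, the helpers are only modelled on
csum_add()/csum_block_add(), and the big-endian word order is an
assumption for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the 3-argument csum_partial_copy_nocheck():
 * copy, then sum the data as 16-bit big-endian words starting from 0.
 */
static uint32_t sketch_copy_nocheck(const uint8_t *src, uint8_t *dst, size_t len)
{
	uint64_t acc = 0;
	size_t i;

	memcpy(dst, src, len);
	for (i = 0; i + 1 < len; i += 2)
		acc += ((uint32_t)dst[i] << 8) | dst[i + 1];
	if (len & 1)
		acc += (uint32_t)dst[len - 1] << 8;
	acc = (acc & 0xffffffffu) + (acc >> 32);
	acc = (acc & 0xffffffffu) + (acc >> 32);
	return (uint32_t)acc;
}

/* Modelled on csum_add(): 32-bit add with end-around carry. */
static uint32_t sketch_csum_add(uint32_t csum, uint32_t addend)
{
	uint32_t res = csum + addend;

	return res + (res < addend);
}

/*
 * Modelled on csum_block_add(): a block that starts at an odd offset has
 * its bytes in the opposite lane of each 16-bit word, so rotate by 8 bits.
 */
static uint32_t sketch_block_add(uint32_t csum, uint32_t csum2, size_t offset)
{
	if (offset & 1)
		csum2 = (csum2 >> 8) | (csum2 << 24);
	return sketch_csum_add(csum, csum2);
}

/* Copy src to dst in two pieces split at cut, accumulating as callers do. */
static uint32_t sketch_copy_in_two(const uint8_t *src, uint8_t *dst,
				   size_t len, size_t cut)
{
	uint32_t sum = 0;

	assert(cut <= len);
	sum = sketch_block_add(sum, sketch_copy_nocheck(src, dst, cut), 0);
	sum = sketch_block_add(sum, sketch_copy_nocheck(src + cut, dst + cut,
							len - cut), cut);
	return sum;
}
```

Splitting at an odd or an even offset gives a result congruent modulo
0xffff to summing the whole buffer in one call, which is why a primitive
without an initial-sum parameter is enough for these callers.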

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v2 05/20] csum_partial_copy_nocheck(): drop the last argument
  2020-07-24  1:25     ` [PATCH v2 05/20] csum_partial_copy_nocheck(): drop the last argument Al Viro
@ 2020-07-24  1:25       ` Al Viro
  0 siblings, 0 replies; 102+ messages in thread
From: Al Viro @ 2020-07-24  1:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

Every caller passes 0 as the initial sum, so drop the argument.  Note
that we could use ~0U as well - the result would be the same modulo
0xffff; later in the series we'll make use of that wherever convenient.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/alpha/include/asm/checksum.h    | 2 +-
 arch/alpha/lib/csum_partial_copy.c   | 4 ++--
 arch/arm/include/asm/checksum.h      | 2 +-
 arch/arm/lib/csumpartialcopy.S       | 5 +++--
 arch/m68k/include/asm/checksum.h     | 3 +--
 arch/m68k/lib/checksum.c             | 3 ++-
 arch/mips/include/asm/checksum.h     | 7 +++++--
 arch/mips/lib/csum_partial.S         | 4 ++--
 arch/powerpc/include/asm/checksum.h  | 4 ++--
 arch/sh/include/asm/checksum_32.h    | 5 ++---
 arch/sparc/include/asm/checksum_32.h | 4 ++--
 arch/sparc/include/asm/checksum_64.h | 8 ++++++--
 arch/sparc/lib/csum_copy.S           | 2 +-
 arch/x86/include/asm/checksum_32.h   | 5 ++---
 arch/x86/include/asm/checksum_64.h   | 3 +--
 arch/x86/lib/csum-wrappers_64.c      | 4 ++--
 arch/xtensa/include/asm/checksum.h   | 5 ++---
 drivers/net/ethernet/3com/typhoon.c  | 3 +--
 include/net/checksum.h               | 4 ++--
 lib/iov_iter.c                       | 2 +-
 net/core/skbuff.c                    | 4 ++--
 net/ipv4/icmp.c                      | 2 +-
 net/ipv4/ip_output.c                 | 2 +-
 net/ipv4/raw.c                       | 2 +-
 net/ipv6/raw.c                       | 2 +-
 25 files changed, 47 insertions(+), 44 deletions(-)

diff --git a/arch/alpha/include/asm/checksum.h b/arch/alpha/include/asm/checksum.h
index 7e8e4fa4362d..84f9faea864a 100644
--- a/arch/alpha/include/asm/checksum.h
+++ b/arch/alpha/include/asm/checksum.h
@@ -45,7 +45,7 @@ extern __wsum csum_partial(const void *buff, int len, __wsum sum);
 #define _HAVE_ARCH_CSUM_AND_COPY
 __wsum csum_and_copy_from_user(const void __user *src, void *dst, int len, __wsum sum, int *errp);
 
-__wsum csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum);
+__wsum csum_partial_copy_nocheck(const void *src, void *dst, int len);
 
 
 /*
diff --git a/arch/alpha/lib/csum_partial_copy.c b/arch/alpha/lib/csum_partial_copy.c
index af1dad74e933..f363dc89fcbe 100644
--- a/arch/alpha/lib/csum_partial_copy.c
+++ b/arch/alpha/lib/csum_partial_copy.c
@@ -372,13 +372,13 @@ csum_and_copy_from_user(const void __user *src, void *dst, int len,
 EXPORT_SYMBOL(csum_and_copy_from_user);
 
 __wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
+csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
 	__wsum checksum;
 	mm_segment_t oldfs = get_fs();
 	set_fs(KERNEL_DS);
 	checksum = csum_and_copy_from_user((__force const void __user *)src,
-						dst, len, sum, NULL);
+						dst, len, 0, NULL);
 	set_fs(oldfs);
 	return checksum;
 }
diff --git a/arch/arm/include/asm/checksum.h b/arch/arm/include/asm/checksum.h
index 53f769508540..7612b2bd4e9b 100644
--- a/arch/arm/include/asm/checksum.h
+++ b/arch/arm/include/asm/checksum.h
@@ -35,7 +35,7 @@ __wsum csum_partial(const void *buff, int len, __wsum sum);
  */
 
 __wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum);
+csum_partial_copy_nocheck(const void *src, void *dst, int len);
 
 __wsum
 csum_partial_copy_from_user(const void __user *src, void *dst, int len, __wsum sum, int *err_ptr);
diff --git a/arch/arm/lib/csumpartialcopy.S b/arch/arm/lib/csumpartialcopy.S
index 184d97254a7a..aab914fbc86b 100644
--- a/arch/arm/lib/csumpartialcopy.S
+++ b/arch/arm/lib/csumpartialcopy.S
@@ -9,13 +9,14 @@
 
 		.text
 
-/* Function: __u32 csum_partial_copy_nocheck(const char *src, char *dst, int len, __u32 sum)
- * Params  : r0 = src, r1 = dst, r2 = len, r3 = checksum
+/* Function: __u32 csum_partial_copy_nocheck(const char *src, char *dst, int len)
+ * Params  : r0 = src, r1 = dst, r2 = len
  * Returns : r0 = new checksum
  */
 
 		.macro	save_regs
 		stmfd	sp!, {r1, r4 - r8, lr}
+		mov	r3, #0
 		.endm
 
 		.macro	load_regs
diff --git a/arch/m68k/include/asm/checksum.h b/arch/m68k/include/asm/checksum.h
index ab16881d84cb..d5e74c64b6cd 100644
--- a/arch/m68k/include/asm/checksum.h
+++ b/arch/m68k/include/asm/checksum.h
@@ -38,8 +38,7 @@ extern __wsum csum_and_copy_from_user(const void __user *src,
 						int *csum_err);
 
 extern __wsum csum_partial_copy_nocheck(const void *src,
-					      void *dst, int len,
-					      __wsum sum);
+					      void *dst, int len);
 
 /*
  *	This is a version of ip_fast_csum() optimized for IP headers,
diff --git a/arch/m68k/lib/checksum.c b/arch/m68k/lib/checksum.c
index 31797be9a3dc..86ddd2ee187d 100644
--- a/arch/m68k/lib/checksum.c
+++ b/arch/m68k/lib/checksum.c
@@ -324,9 +324,10 @@ EXPORT_SYMBOL(csum_and_copy_from_user);
  */
 
 __wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
+csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
 	unsigned long tmp1, tmp2;
+	__wsum sum = 0;
 	__asm__("movel %2,%4\n\t"
 		"btst #1,%4\n\t"	/* Check alignment */
 		"jeq 2f\n\t"
diff --git a/arch/mips/include/asm/checksum.h b/arch/mips/include/asm/checksum.h
index b771621ec3c5..63dfe08262b1 100644
--- a/arch/mips/include/asm/checksum.h
+++ b/arch/mips/include/asm/checksum.h
@@ -102,8 +102,11 @@ __wsum csum_and_copy_to_user(const void *src, void __user *dst, int len,
  * we have just one address space, so this is identical to the above)
  */
 #define _HAVE_ARCH_CSUM_AND_COPY
-__wsum csum_partial_copy_nocheck(const void *src, void *dst,
-				       int len, __wsum sum);
+__wsum __csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum);
+static inline __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
+{
+	return __csum_partial_copy_nocheck(src, dst, len, 0);
+}
 
 /*
  *	Fold a partial checksum without adding pseudo headers
diff --git a/arch/mips/lib/csum_partial.S b/arch/mips/lib/csum_partial.S
index 87fda0713b84..8d70855b0914 100644
--- a/arch/mips/lib/csum_partial.S
+++ b/arch/mips/lib/csum_partial.S
@@ -462,8 +462,8 @@ EXPORT_SYMBOL(csum_partial)
 	lw	errptr, 16(sp)
 #endif
 	.if \__nocheck == 1
-	FEXPORT(csum_partial_copy_nocheck)
-	EXPORT_SYMBOL(csum_partial_copy_nocheck)
+	FEXPORT(__csum_partial_copy_nocheck)
+	EXPORT_SYMBOL(__csum_partial_copy_nocheck)
 	.endif
 	move	sum, zero
 	move	odd, zero
diff --git a/arch/powerpc/include/asm/checksum.h b/arch/powerpc/include/asm/checksum.h
index d75fc5bf8f37..64299785f639 100644
--- a/arch/powerpc/include/asm/checksum.h
+++ b/arch/powerpc/include/asm/checksum.h
@@ -30,8 +30,8 @@ extern __wsum csum_and_copy_to_user(const void *src, void __user *dst,
 				    int len, __wsum sum, int *err_ptr);
 
 #define _HAVE_ARCH_CSUM_AND_COPY
-#define csum_partial_copy_nocheck(src, dst, len, sum)   \
-        csum_partial_copy_generic((src), (dst), (len), (sum), NULL, NULL)
+#define csum_partial_copy_nocheck(src, dst, len)   \
+        csum_partial_copy_generic((src), (dst), (len), 0, NULL, NULL)
 
 
 /*
diff --git a/arch/sh/include/asm/checksum_32.h b/arch/sh/include/asm/checksum_32.h
index 87269642d78d..e8bf84d3b843 100644
--- a/arch/sh/include/asm/checksum_32.h
+++ b/arch/sh/include/asm/checksum_32.h
@@ -43,10 +43,9 @@ asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst,
  *	access_ok().
  */
 static inline
-__wsum csum_partial_copy_nocheck(const void *src, void *dst,
-				 int len, __wsum sum)
+__wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	return csum_partial_copy_generic(src, dst, len, sum, NULL, NULL);
+	return csum_partial_copy_generic(src, dst, len, 0, NULL, NULL);
 }
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
diff --git a/arch/sparc/include/asm/checksum_32.h b/arch/sparc/include/asm/checksum_32.h
index 479a0b812af5..d21d114436ba 100644
--- a/arch/sparc/include/asm/checksum_32.h
+++ b/arch/sparc/include/asm/checksum_32.h
@@ -42,7 +42,7 @@ __wsum csum_partial(const void *buff, int len, __wsum sum);
 unsigned int __csum_partial_copy_sparc_generic (const unsigned char *, unsigned char *);
 
 static inline __wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
+csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
 	register unsigned int ret asm("o0") = (unsigned int)src;
 	register char *d asm("o1") = dst;
@@ -52,7 +52,7 @@ csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
 		"call __csum_partial_copy_sparc_generic\n\t"
 		" mov %6, %%g7\n"
 	: "=&r" (ret), "=&r" (d), "=&r" (l)
-	: "0" (ret), "1" (d), "2" (l), "r" (sum)
+	: "0" (ret), "1" (d), "2" (l), "r" (0)
 	: "o2", "o3", "o4", "o5", "o7",
 	  "g2", "g3", "g4", "g5", "g7",
 	  "memory", "cc");
diff --git a/arch/sparc/include/asm/checksum_64.h b/arch/sparc/include/asm/checksum_64.h
index 0fa4433f5662..7aebdbe3ac96 100644
--- a/arch/sparc/include/asm/checksum_64.h
+++ b/arch/sparc/include/asm/checksum_64.h
@@ -38,8 +38,12 @@ __wsum csum_partial(const void * buff, int len, __wsum sum);
  * here even more important to align src and dst on a 32-bit (or even
  * better 64-bit) boundary
  */
-__wsum csum_partial_copy_nocheck(const void *src, void *dst,
-				 int len, __wsum sum);
+__wsum __csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum);
+
+static inline __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
+{
+	return __csum_partial_copy_nocheck(src, dst, len, 0);
+}
 
 long __csum_partial_copy_from_user(const void __user *src,
 				   void *dst, int len,
diff --git a/arch/sparc/lib/csum_copy.S b/arch/sparc/lib/csum_copy.S
index 26c644ba3ecb..72c900d21b12 100644
--- a/arch/sparc/lib/csum_copy.S
+++ b/arch/sparc/lib/csum_copy.S
@@ -33,7 +33,7 @@
 #endif
 
 #ifndef FUNC_NAME
-#define FUNC_NAME	csum_partial_copy_nocheck
+#define FUNC_NAME	__csum_partial_copy_nocheck
 #endif
 
 	.register	%g2, #scratch
diff --git a/arch/x86/include/asm/checksum_32.h b/arch/x86/include/asm/checksum_32.h
index 11624c8a9d8d..137a3033edcc 100644
--- a/arch/x86/include/asm/checksum_32.h
+++ b/arch/x86/include/asm/checksum_32.h
@@ -38,10 +38,9 @@ asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst,
  *	If you use these functions directly please don't forget the
  *	access_ok().
  */
-static inline __wsum csum_partial_copy_nocheck(const void *src, void *dst,
-					       int len, __wsum sum)
+static inline __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	return csum_partial_copy_generic(src, dst, len, sum, NULL, NULL);
+	return csum_partial_copy_generic(src, dst, len, 0, NULL, NULL);
 }
 
 static inline __wsum csum_and_copy_from_user(const void __user *src,
diff --git a/arch/x86/include/asm/checksum_64.h b/arch/x86/include/asm/checksum_64.h
index 0a289b87e872..5339f5dfc776 100644
--- a/arch/x86/include/asm/checksum_64.h
+++ b/arch/x86/include/asm/checksum_64.h
@@ -139,8 +139,7 @@ extern __wsum csum_and_copy_from_user(const void __user *src, void *dst,
 					  int len, __wsum isum, int *errp);
 extern __wsum csum_and_copy_to_user(const void *src, void __user *dst,
 					int len, __wsum isum, int *errp);
-extern __wsum csum_partial_copy_nocheck(const void *src, void *dst,
-					int len, __wsum sum);
+extern __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len);
 
 /**
  * ip_compute_csum - Compute an 16bit IP checksum.
diff --git a/arch/x86/lib/csum-wrappers_64.c b/arch/x86/lib/csum-wrappers_64.c
index ee63d7576fd2..245f929a1c2c 100644
--- a/arch/x86/lib/csum-wrappers_64.c
+++ b/arch/x86/lib/csum-wrappers_64.c
@@ -129,9 +129,9 @@ EXPORT_SYMBOL(csum_and_copy_to_user);
  * Returns an 32bit unfolded checksum of the buffer.
  */
 __wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
+csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	return csum_partial_copy_generic(src, dst, len, sum, NULL, NULL);
+	return csum_partial_copy_generic(src, dst, len, 0, NULL, NULL);
 }
 EXPORT_SYMBOL(csum_partial_copy_nocheck);
 
diff --git a/arch/xtensa/include/asm/checksum.h b/arch/xtensa/include/asm/checksum.h
index 0c879099977f..dc09448935bf 100644
--- a/arch/xtensa/include/asm/checksum.h
+++ b/arch/xtensa/include/asm/checksum.h
@@ -47,10 +47,9 @@ asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst,
  *	passed in an incorrect kernel address to one of these functions.
  */
 static inline
-__wsum csum_partial_copy_nocheck(const void *src, void *dst,
-					int len, __wsum sum)
+__wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	return csum_partial_copy_generic(src, dst, len, sum, NULL, NULL);
+	return csum_partial_copy_generic(src, dst, len, 0, NULL, NULL);
 }
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
diff --git a/drivers/net/ethernet/3com/typhoon.c b/drivers/net/ethernet/3com/typhoon.c
index 5ed33c2c4742..00c2e7143555 100644
--- a/drivers/net/ethernet/3com/typhoon.c
+++ b/drivers/net/ethernet/3com/typhoon.c
@@ -1419,8 +1419,7 @@ typhoon_download_firmware(struct typhoon *tp)
 			 * the checksum, we can do this once, at the end.
 			 */
 			csum = csum_fold(csum_partial_copy_nocheck(image_data,
-								   dpage, len,
-								   0));
+								   dpage, len));
 
 			iowrite32(len, ioaddr + TYPHOON_REG_BOOT_LENGTH);
 			iowrite32(le16_to_cpu((__force __le16)csum),
diff --git a/include/net/checksum.h b/include/net/checksum.h
index db9d02b5f88a..1029191986e3 100644
--- a/include/net/checksum.h
+++ b/include/net/checksum.h
@@ -49,10 +49,10 @@ static __inline__ __wsum csum_and_copy_to_user
 
 #ifndef _HAVE_ARCH_CSUM_AND_COPY
 static inline __wsum
-csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum)
+csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
 	memcpy(dst, src, len);
-	return csum_partial(dst, len, sum);
+	return csum_partial(dst, len, 0);
 }
 #endif
 
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index bf538c2bec77..7405922caaec 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -580,7 +580,7 @@ static size_t copy_pipe_to_iter(const void *addr, size_t bytes,
 static __wsum csum_and_memcpy(void *to, const void *from, size_t len,
 			      __wsum sum, size_t off)
 {
-	__wsum next = csum_partial_copy_nocheck(from, to, len, 0);
+	__wsum next = csum_partial_copy_nocheck(from, to, len);
 	return csum_block_add(sum, next, off);
 }
 
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 9c0918651445..6d51fb4312cd 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2736,7 +2736,7 @@ __wsum skb_copy_and_csum_bits(const struct sk_buff *skb, int offset,
 		if (copy > len)
 			copy = len;
 		csum = csum_partial_copy_nocheck(skb->data + offset, to,
-						 copy, 0);
+						 copy);
 		if ((len -= copy) == 0)
 			return csum;
 		offset += copy;
@@ -2766,7 +2766,7 @@ __wsum skb_copy_and_csum_bits(const struct sk_buff *skb, int offset,
 				vaddr = kmap_atomic(p);
 				csum2 = csum_partial_copy_nocheck(vaddr + p_off,
 								  to + copied,
-								  p_len, 0);
+								  p_len);
 				kunmap_atomic(vaddr);
 				csum = csum_block_add(csum, csum2, pos);
 				pos += p_len;
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index f93317157549..47a46279ae4c 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -381,7 +381,7 @@ static void icmp_push_reply(struct icmp_bxm *icmp_param,
 
 		csum = csum_partial_copy_nocheck((void *)&icmp_param->data,
 						 (char *)icmph,
-						 icmp_param->head_len, 0);
+						 icmp_param->head_len);
 		skb_queue_walk(&sk->sk_write_queue, skb1) {
 			csum = csum_add(csum, skb1->csum);
 		}
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 7fd164754519..f835136b8727 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1642,7 +1642,7 @@ static int ip_reply_glue_bits(void *dptr, char *to, int offset,
 {
 	__wsum csum;
 
-	csum = csum_partial_copy_nocheck(dptr+offset, to, len, 0);
+	csum = csum_partial_copy_nocheck(dptr+offset, to, len);
 	skb->csum = csum_block_add(skb->csum, csum, odd);
 	return 0;
 }
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 47665919048f..112f983f85fa 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -478,7 +478,7 @@ static int raw_getfrag(void *from, char *to, int offset, int len, int odd,
 			skb->csum = csum_block_add(
 				skb->csum,
 				csum_partial_copy_nocheck(rfv->hdr.c + offset,
-							  to, copy, 0),
+							  to, copy),
 				odd);
 
 		odd = 0;
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 8ef5a7b30524..b1df7e5fb0a8 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -746,7 +746,7 @@ static int raw6_getfrag(void *from, char *to, int offset, int len, int odd,
 			skb->csum = csum_block_add(
 				skb->csum,
 				csum_partial_copy_nocheck(rfv->c + offset,
-							  to, copy, 0),
+							  to, copy),
 				odd);
 
 		odd = 0;
-- 
2.11.0


* [PATCH v2 06/20] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
  2020-07-24  1:25   ` [PATCH v2 01/20] xtensa: fix access check in csum_and_copy_from_user Al Viro
                       ` (3 preceding siblings ...)
  2020-07-24  1:25     ` [PATCH v2 05/20] csum_partial_copy_nocheck(): drop the last argument Al Viro
@ 2020-07-24  1:25     ` Al Viro
  2020-07-24  1:25     ` [PATCH v2 07/20] saner calling conventions for csum_and_copy_..._user() Al Viro
                       ` (13 subsequent siblings)
  18 siblings, 0 replies; 102+ messages in thread
From: Al Viro @ 2020-07-24  1:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

Preparation for the change of calling conventions; right now all
callers pass 0 as initial sum.  Passing 0xffffffff instead yields
values comparable mod 0xffff and guarantees that 0 will not
be returned on success.

Note that this relies upon the correct behaviour with arbitrary
initial sum and the above pretty much says "it's been untested
with anything other than 0".  Analysis is unpleasant, to put it
mildly, but the suckers _are_ handling it correctly, surprisingly
enough.  Perhaps not too surprisingly, since most of the instances
share the code with csum_partial_copy_nocheck(), which used to get
some testing due to icmp_push_reply()...
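
The claim that 0 and 0xffffffff are interchangeable as initial sums can be
checked in userspace.  What follows is only a sketch, not any of the kernel
implementations: csum_sketch() and initial_sum_check() are hypothetical names,
and the summation is a plain C ones' complement sum over little-endian 16-bit
words.  It relies on 0xffffffff being a multiple of 0xffff (65535 * 65537), so
starting from it does not change the residue mod 0xffff, while a nonzero
starting value can never be folded down to 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 32-bit ones' complement sum of little-endian words. */
static uint32_t csum_sketch(const uint8_t *buf, size_t len, uint32_t init)
{
	uint64_t acc = init;
	size_t i;

	assert(buf != NULL || len == 0);

	for (i = 0; i + 1 < len; i += 2)
		acc += (uint32_t)buf[i] | ((uint32_t)buf[i + 1] << 8);
	if (i < len)
		acc += buf[i];		/* trailing odd byte */

	/* Fold carries back in (end-around carry); two rounds suffice. */
	acc = (acc & 0xffffffffu) + (acc >> 32);
	acc = (acc & 0xffffffffu) + (acc >> 32);
	return (uint32_t)acc;
}

/*
 * Returns 1 if starting from 0xffffffff gives a nonzero sum with the
 * same residue mod 0xffff as starting from 0.
 */
static int initial_sum_check(const uint8_t *buf, size_t len)
{
	uint32_t from_zero = csum_sketch(buf, len, 0);
	uint32_t from_ones = csum_sketch(buf, len, 0xffffffffu);

	return from_ones != 0 && from_zero % 0xffffu == from_ones % 0xffffu;
}
```

For an all-zero buffer the two results differ as 32-bit values (0 versus
0xffffffff) but both are 0 mod 0xffff, which is the only sense in which
__wsum values are comparable.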

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 lib/iov_iter.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 7405922caaec..d5b7e204fea6 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1451,7 +1451,7 @@ size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
 		int err = 0;
 		next = csum_and_copy_from_user(v.iov_base,
 					       (to += v.iov_len) - v.iov_len,
-					       v.iov_len, 0, &err);
+					       v.iov_len, ~0U, &err);
 		if (!err) {
 			sum = csum_block_add(sum, next, off);
 			off += v.iov_len;
@@ -1493,7 +1493,7 @@ bool csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum,
 		int err = 0;
 		next = csum_and_copy_from_user(v.iov_base,
 					       (to += v.iov_len) - v.iov_len,
-					       v.iov_len, 0, &err);
+					       v.iov_len, ~0U, &err);
 		if (err)
 			return false;
 		sum = csum_block_add(sum, next, off);
@@ -1539,7 +1539,7 @@ size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *csump,
 		int err = 0;
 		next = csum_and_copy_to_user((from += v.iov_len) - v.iov_len,
 					     v.iov_base,
-					     v.iov_len, 0, &err);
+					     v.iov_len, ~0U, &err);
 		if (!err) {
 			sum = csum_block_add(sum, next, off);
 			off += v.iov_len;
-- 
2.11.0


* [PATCH v2 07/20] saner calling conventions for csum_and_copy_..._user()
  2020-07-24  1:25   ` [PATCH v2 01/20] xtensa: fix access check in csum_and_copy_from_user Al Viro
                       ` (4 preceding siblings ...)
  2020-07-24  1:25     ` [PATCH v2 06/20] csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum Al Viro
@ 2020-07-24  1:25     ` Al Viro
  2020-07-24  1:25       ` Al Viro
  2020-07-24  1:25     ` [PATCH v2 08/20] alpha: propagate the calling convention changes down to csum_partial_copy.c helpers Al Viro
                       ` (12 subsequent siblings)
  18 siblings, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-24  1:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

All callers of these primitives will
	* discard anything we might've copied in case of error
	* ignore the csum value in case of error
	* always pass 0xffffffff as the initial sum, so the
resulting csum value (in case of success, that is) will never be 0.
	* always pass a positive length.

That suggests the following calling conventions:
	* don't pass err_ptr - just return 0 on error.
	* don't bother with zeroing destination, etc. in case of error
	* don't pass the initial sum - just use 0xffffffff.
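
The shape of the minimal conversion can be sketched in userspace as below.
Everything here is a mock under stated assumptions: old_csum_and_copy() and
new_csum_and_copy() are hypothetical names, a NULL source pointer stands in
for a faulting userspace address, and the checksum loop is a simple C
stand-in for the per-architecture code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Old-style primitive (mock): takes an initial sum and reports failure
 * through *errp.  A NULL src simulates a fault.
 */
static uint32_t old_csum_and_copy(const void *src, void *dst, size_t len,
				  uint32_t sum, int *errp)
{
	const uint8_t *p = src;
	uint64_t acc = sum;
	size_t i;

	if (!src) {
		*errp = -EFAULT;
		return sum;
	}
	memcpy(dst, src, len);
	for (i = 0; i + 1 < len; i += 2)
		acc += (uint32_t)p[i] | ((uint32_t)p[i + 1] << 8);
	if (i < len)
		acc += p[i];
	/* End-around carry fold from 64 to 32 bits. */
	acc = (acc & 0xffffffffu) + (acc >> 32);
	acc = (acc & 0xffffffffu) + (acc >> 32);
	return (uint32_t)acc;
}

/* New-style wrapper: no initial sum, no errp; 0 means failure. */
static uint32_t new_csum_and_copy(const void *src, void *dst, size_t len)
{
	int err = 0;
	uint32_t sum;

	assert(len > 0);	/* callers always pass a positive length */
	sum = old_csum_and_copy(src, dst, len, 0xffffffffu, &err);
	return err ? 0 : sum;
}
```

A caller then only needs to test the return value against 0; the nonzero
initial sum is what keeps 0 from being returned for a successful copy.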

This commit does the minimal conversion in the instances of csum_and_copy_...();
the changes of actual asm code behind them are done later in the series.
Note that this asm code is often shared with csum_partial_copy_nocheck();
the difference is that csum_partial_copy_nocheck() passes 0 for initial
sum while csum_and_copy_..._user() pass 0xffffffff.  Fortunately, we are
free to pass 0xffffffff in all cases and subsequent patches will use that
freedom without any special comments.
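
One way to see why callers are indifferent to the initial sum: they combine
per-segment results with csum_block_add(), and only the residue mod 0xffff of
the total matters.  The userspace sketch below uses hypothetical *_sketch
helpers modelled on that pattern (ones' complement add with end-around carry,
plus an 8-bit rotate for segments that start at an odd offset); it is an
illustration, not the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 32-bit ones' complement sum of little-endian words. */
static uint32_t csum_sketch(const uint8_t *buf, size_t len, uint32_t init)
{
	uint64_t acc = init;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		acc += (uint32_t)buf[i] | ((uint32_t)buf[i + 1] << 8);
	if (i < len)
		acc += buf[i];
	acc = (acc & 0xffffffffu) + (acc >> 32);
	acc = (acc & 0xffffffffu) + (acc >> 32);
	return (uint32_t)acc;
}

/* Ones' complement addition of two 32-bit partial sums. */
static uint32_t csum_add_sketch(uint32_t a, uint32_t b)
{
	uint32_t res = a + b;

	return res + (res < b);		/* end-around carry */
}

/* Add the sum of a block that starts `offset` bytes into the stream. */
static uint32_t csum_block_add_sketch(uint32_t sum, uint32_t next,
				      size_t offset)
{
	if (offset & 1)			/* odd offset: byte lanes swap */
		next = (next >> 8) | (next << 24);
	return csum_add_sketch(sum, next);
}

/* Checksum buf as two pieces split at `cut`, each started from `init`. */
static uint32_t csum_two_pieces(const uint8_t *buf, size_t len, size_t cut,
				uint32_t init)
{
	uint32_t sum = 0;

	assert(cut <= len);
	sum = csum_block_add_sketch(sum, csum_sketch(buf, cut, init), 0);
	sum = csum_block_add_sketch(sum,
				    csum_sketch(buf + cut, len - cut, init),
				    cut);
	return sum;
}
```

Whether each piece starts from 0 or from 0xffffffff, and whether the split
falls on an odd or even offset, the combined value has the same residue
mod 0xffff as the sum over the whole buffer.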

A part that could be split off: parisc and uml/i386 claimed to have
csum_and_copy_to_user() instances of their own, but those were identical
to the generic one, so we simply drop them.  Not sure if it's worth
a separate commit...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/alpha/include/asm/checksum.h    |  2 +-
 arch/alpha/lib/csum_partial_copy.c   | 25 ++++++-------
 arch/arm/include/asm/checksum.h      | 13 ++++---
 arch/m68k/include/asm/checksum.h     |  3 +-
 arch/m68k/lib/checksum.c             |  8 ++---
 arch/mips/include/asm/checksum.h     | 46 ++++++++++++------------
 arch/parisc/include/asm/checksum.h   | 20 -----------
 arch/powerpc/include/asm/checksum.h  |  4 +--
 arch/powerpc/lib/checksum_wrappers.c | 68 +++++++++++-------------------------
 arch/sh/include/asm/checksum_32.h    | 36 +++++++++----------
 arch/sparc/include/asm/checksum_32.h | 65 ++++++++++++++++------------------
 arch/sparc/include/asm/checksum_64.h | 14 ++++----
 arch/x86/include/asm/checksum_32.h   | 35 ++++++++-----------
 arch/x86/include/asm/checksum_64.h   |  6 ++--
 arch/x86/lib/csum-wrappers_64.c      | 38 +++++++++-----------
 arch/x86/um/asm/checksum_32.h        | 23 ------------
 arch/xtensa/include/asm/checksum.h   | 30 ++++++++--------
 include/net/checksum.h               | 15 ++++----
 lib/iov_iter.c                       | 19 +++++-----
 19 files changed, 183 insertions(+), 287 deletions(-)

diff --git a/arch/alpha/include/asm/checksum.h b/arch/alpha/include/asm/checksum.h
index 84f9faea864a..99d631e146b2 100644
--- a/arch/alpha/include/asm/checksum.h
+++ b/arch/alpha/include/asm/checksum.h
@@ -43,7 +43,7 @@ extern __wsum csum_partial(const void *buff, int len, __wsum sum);
  */
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 #define _HAVE_ARCH_CSUM_AND_COPY
-__wsum csum_and_copy_from_user(const void __user *src, void *dst, int len, __wsum sum, int *errp);
+__wsum csum_and_copy_from_user(const void __user *src, void *dst, int len);
 
 __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len);
 
diff --git a/arch/alpha/lib/csum_partial_copy.c b/arch/alpha/lib/csum_partial_copy.c
index f363dc89fcbe..3c0e89c39ddb 100644
--- a/arch/alpha/lib/csum_partial_copy.c
+++ b/arch/alpha/lib/csum_partial_copy.c
@@ -325,30 +325,27 @@ csum_partial_cfu_unaligned(const unsigned long __user * src,
 }
 
 __wsum
-csum_and_copy_from_user(const void __user *src, void *dst, int len,
-			       __wsum sum, int *errp)
+csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
-	unsigned long checksum = (__force u32) sum;
+	unsigned long checksum = ~0U;
 	unsigned long soff = 7 & (unsigned long) src;
 	unsigned long doff = 7 & (unsigned long) dst;
+	int err = 0;
 
 	if (len) {
-		if (!access_ok(src, len)) {
-			if (errp) *errp = -EFAULT;
-			memset(dst, 0, len);
-			return sum;
-		}
+		if (!access_ok(src, len))
+			return 0;
 		if (!doff) {
 			if (!soff)
 				checksum = csum_partial_cfu_aligned(
 					(const unsigned long __user *) src,
 					(unsigned long *) dst,
-					len-8, checksum, errp);
+					len-8, checksum, &err);
 			else
 				checksum = csum_partial_cfu_dest_aligned(
 					(const unsigned long __user *) src,
 					(unsigned long *) dst,
-					soff, len-8, checksum, errp);
+					soff, len-8, checksum, &err);
 		} else {
 			unsigned long partial_dest;
 			ldq_u(partial_dest, dst);
@@ -357,15 +354,15 @@ csum_and_copy_from_user(const void __user *src, void *dst, int len,
 					(const unsigned long __user *) src,
 					(unsigned long *) dst,
 					doff, len-8, checksum,
-					partial_dest, errp);
+					partial_dest, &err);
 			else
 				checksum = csum_partial_cfu_unaligned(
 					(const unsigned long __user *) src,
 					(unsigned long *) dst,
 					soff, doff, len-8, checksum,
-					partial_dest, errp);
+					partial_dest, &err);
 		}
-		checksum = from64to16 (checksum);
+		checksum = err ? 0 : from64to16 (checksum);
 	}
 	return (__force __wsum)checksum;
 }
@@ -378,7 +375,7 @@ csum_partial_copy_nocheck(const void *src, void *dst, int len)
 	mm_segment_t oldfs = get_fs();
 	set_fs(KERNEL_DS);
 	checksum = csum_and_copy_from_user((__force const void __user *)src,
-						dst, len, 0, NULL);
+						dst, len);
 	set_fs(oldfs);
 	return checksum;
 }
diff --git a/arch/arm/include/asm/checksum.h b/arch/arm/include/asm/checksum.h
index 7612b2bd4e9b..1601c132b064 100644
--- a/arch/arm/include/asm/checksum.h
+++ b/arch/arm/include/asm/checksum.h
@@ -43,16 +43,16 @@ csum_partial_copy_from_user(const void __user *src, void *dst, int len, __wsum s
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 #define _HAVE_ARCH_CSUM_AND_COPY
 static inline
-__wsum csum_and_copy_from_user (const void __user *src, void *dst,
-				      int len, __wsum sum, int *err_ptr)
+__wsum csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
-	if (access_ok(src, len))
-		return csum_partial_copy_from_user(src, dst, len, sum, err_ptr);
+	__wsum sum;
+	int err = 0;
 
-	if (len)
-		*err_ptr = -EFAULT;
+	if (!access_ok(src, len))
+		return 0;
 
-	return sum;
+	sum = csum_partial_copy_from_user(src, dst, len, ~0U, &err);
+	return err ? 0 : sum;
 }
 
 /*
diff --git a/arch/m68k/include/asm/checksum.h b/arch/m68k/include/asm/checksum.h
index d5e74c64b6cd..692e7b6cc042 100644
--- a/arch/m68k/include/asm/checksum.h
+++ b/arch/m68k/include/asm/checksum.h
@@ -34,8 +34,7 @@ __wsum csum_partial(const void *buff, int len, __wsum sum);
 #define _HAVE_ARCH_CSUM_AND_COPY
 extern __wsum csum_and_copy_from_user(const void __user *src,
 						void *dst,
-						int len, __wsum sum,
-						int *csum_err);
+						int len);
 
 extern __wsum csum_partial_copy_nocheck(const void *src,
 					      void *dst, int len);
diff --git a/arch/m68k/lib/checksum.c b/arch/m68k/lib/checksum.c
index 86ddd2ee187d..3aeca261f622 100644
--- a/arch/m68k/lib/checksum.c
+++ b/arch/m68k/lib/checksum.c
@@ -129,8 +129,7 @@ EXPORT_SYMBOL(csum_partial);
  */
 
 __wsum
-csum_and_copy_from_user(const void __user *src, void *dst,
-			    int len, __wsum sum, int *csum_err)
+csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
 	/*
 	 * GCC doesn't like more than 10 operands for the asm
@@ -138,6 +137,7 @@ csum_and_copy_from_user(const void __user *src, void *dst,
 	 * code.
 	 */
 	unsigned long tmp1, tmp2;
+	__wsum sum = ~0U;
 
 	__asm__("movel %2,%4\n\t"
 		"btst #1,%4\n\t"	/* Check alignment */
@@ -311,9 +311,7 @@ csum_and_copy_from_user(const void __user *src, void *dst,
 		: "0" (sum), "1" (len), "2" (src), "3" (dst)
 	    );
 
-	*csum_err = tmp2;
-
-	return(sum);
+	return tmp2 ? 0 : sum;
 }
 
 EXPORT_SYMBOL(csum_and_copy_from_user);
diff --git a/arch/mips/include/asm/checksum.h b/arch/mips/include/asm/checksum.h
index 63dfe08262b1..b882cacea3ee 100644
--- a/arch/mips/include/asm/checksum.h
+++ b/arch/mips/include/asm/checksum.h
@@ -60,16 +60,15 @@ __wsum csum_partial_copy_from_user(const void __user *src, void *dst, int len,
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 static inline
-__wsum csum_and_copy_from_user(const void __user *src, void *dst,
-			       int len, __wsum sum, int *err_ptr)
+__wsum csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
-	if (access_ok(src, len))
-		return csum_partial_copy_from_user(src, dst, len, sum,
-						   err_ptr);
-	if (len)
-		*err_ptr = -EFAULT;
+	__wsum sum = ~0U;
+	int err = 0;
 
-	return sum;
+	if (!access_ok(src, len))
+		return 0;
+	sum = csum_partial_copy_from_user(src, dst, len, sum, &err);
+	return err ? 0 : sum;
 }
 
 /*
@@ -77,24 +76,23 @@ __wsum csum_and_copy_from_user(const void __user *src, void *dst,
  */
 #define HAVE_CSUM_COPY_USER
 static inline
-__wsum csum_and_copy_to_user(const void *src, void __user *dst, int len,
-			     __wsum sum, int *err_ptr)
+__wsum csum_and_copy_to_user(const void *src, void __user *dst, int len)
 {
+	int err = 0;
+	__wsum sum = ~0U;
+
 	might_fault();
-	if (access_ok(dst, len)) {
-		if (uaccess_kernel())
-			return __csum_partial_copy_kernel(src,
-							  (__force void *)dst,
-							  len, sum, err_ptr);
-		else
-			return __csum_partial_copy_to_user(src,
-							   (__force void *)dst,
-							   len, sum, err_ptr);
-	}
-	if (len)
-		*err_ptr = -EFAULT;
-
-	return (__force __wsum)-1; /* invalid checksum */
+	if (!access_ok(dst, len))
+		return 0;
+	if (uaccess_kernel())
+		sum = __csum_partial_copy_kernel(src,
+						  (__force void *)dst,
+						  len, sum, &err);
+	else
+		sum = __csum_partial_copy_to_user(src,
+						   (__force void *)dst,
+						   len, sum, &err);
+	return err ? 0 : sum;
 }
 
 /*
diff --git a/arch/parisc/include/asm/checksum.h b/arch/parisc/include/asm/checksum.h
index 522cd574c068..3c43baca7b39 100644
--- a/arch/parisc/include/asm/checksum.h
+++ b/arch/parisc/include/asm/checksum.h
@@ -173,25 +173,5 @@ static __inline__ __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
 	return csum_fold(sum);
 }
 
-/* 
- *	Copy and checksum to user
- */
-#define HAVE_CSUM_COPY_USER
-static __inline__ __wsum csum_and_copy_to_user(const void *src,
-						      void __user *dst,
-						      int len, __wsum sum,
-						      int *err_ptr)
-{
-	/* code stolen from include/asm-mips64 */
-	sum = csum_partial(src, len, sum);
-	 
-	if (copy_to_user(dst, src, len)) {
-		*err_ptr = -EFAULT;
-		return (__force __wsum)-1;
-	}
-
-	return sum;
-}
-
 #endif
 
diff --git a/arch/powerpc/include/asm/checksum.h b/arch/powerpc/include/asm/checksum.h
index 64299785f639..dba685d984c0 100644
--- a/arch/powerpc/include/asm/checksum.h
+++ b/arch/powerpc/include/asm/checksum.h
@@ -24,10 +24,10 @@ extern __wsum csum_partial_copy_generic(const void *src, void *dst,
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 extern __wsum csum_and_copy_from_user(const void __user *src, void *dst,
-				      int len, __wsum sum, int *err_ptr);
+				      int len);
 #define HAVE_CSUM_COPY_USER
 extern __wsum csum_and_copy_to_user(const void *src, void __user *dst,
-				    int len, __wsum sum, int *err_ptr);
+				    int len);
 
 #define _HAVE_ARCH_CSUM_AND_COPY
 #define csum_partial_copy_nocheck(src, dst, len)   \
diff --git a/arch/powerpc/lib/checksum_wrappers.c b/arch/powerpc/lib/checksum_wrappers.c
index fabe4db28726..b1faa82dd8af 100644
--- a/arch/powerpc/lib/checksum_wrappers.c
+++ b/arch/powerpc/lib/checksum_wrappers.c
@@ -12,82 +12,56 @@
 #include <linux/uaccess.h>
 
 __wsum csum_and_copy_from_user(const void __user *src, void *dst,
-			       int len, __wsum sum, int *err_ptr)
+			       int len)
 {
 	unsigned int csum;
+	int err = 0;
 
 	might_sleep();
-	allow_read_from_user(src, len);
-
-	*err_ptr = 0;
 
-	if (!len) {
-		csum = 0;
-		goto out;
-	}
+	if (unlikely(!access_ok(src, len)))
+		return 0;
 
-	if (unlikely((len < 0) || !access_ok(src, len))) {
-		*err_ptr = -EFAULT;
-		csum = (__force unsigned int)sum;
-		goto out;
-	}
+	allow_read_from_user(src, len);
 
 	csum = csum_partial_copy_generic((void __force *)src, dst,
-					 len, sum, err_ptr, NULL);
+					 len, ~0U, &err, NULL);
 
-	if (unlikely(*err_ptr)) {
+	if (unlikely(err)) {
 		int missing = __copy_from_user(dst, src, len);
 
-		if (missing) {
-			memset(dst + len - missing, 0, missing);
-			*err_ptr = -EFAULT;
-		} else {
-			*err_ptr = 0;
-		}
-
-		csum = csum_partial(dst, len, sum);
+		if (missing)
+			csum = 0;
+		else
+			csum = csum_partial(dst, len, ~0U);
 	}
 
-out:
 	prevent_read_from_user(src, len);
 	return (__force __wsum)csum;
 }
 EXPORT_SYMBOL(csum_and_copy_from_user);
 
-__wsum csum_and_copy_to_user(const void *src, void __user *dst, int len,
-			     __wsum sum, int *err_ptr)
+__wsum csum_and_copy_to_user(const void *src, void __user *dst, int len)
 {
 	unsigned int csum;
+	int err = 0;
 
 	might_sleep();
-	allow_write_to_user(dst, len);
-
-	*err_ptr = 0;
-
-	if (!len) {
-		csum = 0;
-		goto out;
-	}
+	if (unlikely(!access_ok(dst, len)))
+		return 0;
 
-	if (unlikely((len < 0) || !access_ok(dst, len))) {
-		*err_ptr = -EFAULT;
-		csum = -1; /* invalid checksum */
-		goto out;
-	}
+	allow_write_to_user(dst, len);
 
 	csum = csum_partial_copy_generic(src, (void __force *)dst,
-					 len, sum, NULL, err_ptr);
+					 len, ~0U, NULL, &err);
 
-	if (unlikely(*err_ptr)) {
-		csum = csum_partial(src, len, sum);
+	if (unlikely(err)) {
+		csum = csum_partial(src, len, ~0U);
 
-		if (copy_to_user(dst, src, len)) {
-			*err_ptr = -EFAULT;
-			csum = -1; /* invalid checksum */
-		}
+		if (copy_to_user(dst, src, len))
+			csum = 0;
 	}
 
-out:
 	prevent_write_to_user(dst, len);
 	return (__force __wsum)csum;
 }
diff --git a/arch/sh/include/asm/checksum_32.h b/arch/sh/include/asm/checksum_32.h
index e8bf84d3b843..a08e8eb924ed 100644
--- a/arch/sh/include/asm/checksum_32.h
+++ b/arch/sh/include/asm/checksum_32.h
@@ -50,15 +50,16 @@ __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 static inline
-__wsum csum_and_copy_from_user(const void __user *src, void *dst,
-				   int len, __wsum sum, int *err_ptr)
+__wsum csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
-	if (access_ok(src, len))
-		return csum_partial_copy_generic((__force const void *)src, dst,
-					len, sum, err_ptr, NULL);
-	if (len)
-		*err_ptr = -EFAULT;
-	return sum;
+	int err = 0;
+	__wsum sum = ~0U;
+
+	if (!access_ok(src, len))
+		return 0;
+	sum = csum_partial_copy_generic((__force const void *)src, dst,
+					len, sum, &err, NULL);
+	return err ? 0 : sum;
 }
 
 /*
@@ -199,16 +200,15 @@ static inline __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
 #define HAVE_CSUM_COPY_USER
 static inline __wsum csum_and_copy_to_user(const void *src,
 					   void __user *dst,
-					   int len, __wsum sum,
-					   int *err_ptr)
+					   int len)
 {
-	if (access_ok(dst, len))
-		return csum_partial_copy_generic((__force const void *)src,
-						dst, len, sum, NULL, err_ptr);
-
-	if (len)
-		*err_ptr = -EFAULT;
-
-	return (__force __wsum)-1; /* invalid checksum */
+	int err = 0;
+	__wsum sum = ~0U;
+
+	if (!access_ok(dst, len))
+		return 0;
+	sum = csum_partial_copy_generic((__force const void *)src,
+						dst, len, sum, NULL, &err);
+	return err ? 0 : sum;
 }
 #endif /* __ASM_SH_CHECKSUM_H */
diff --git a/arch/sparc/include/asm/checksum_32.h b/arch/sparc/include/asm/checksum_32.h
index d21d114436ba..b5873b7b7bf0 100644
--- a/arch/sparc/include/asm/checksum_32.h
+++ b/arch/sparc/include/asm/checksum_32.h
@@ -60,19 +60,16 @@ csum_partial_copy_nocheck(const void *src, void *dst, int len)
 }
 
 static inline __wsum
-csum_and_copy_from_user(const void __user *src, void *dst, int len,
-			    __wsum sum, int *err)
+csum_and_copy_from_user(const void __user *src, void *dst, int len)
   {
 	register unsigned long ret asm("o0") = (unsigned long)src;
 	register char *d asm("o1") = dst;
 	register int l asm("g1") = len;
-	register __wsum s asm("g7") = sum;
+	register __wsum s asm("g7") = ~0U;
+	int err = 0;
 
-	if (unlikely(!access_ok(src, len))) {
-		if (len)
-			*err = -EFAULT;
-		return sum;
-	}
+	if (unlikely(!access_ok(src, len)))
+		return 0;
 
 	__asm__ __volatile__ (
 	".section __ex_table,#alloc\n\t"
@@ -83,42 +80,40 @@ csum_and_copy_from_user(const void __user *src, void *dst, int len,
 	"call __csum_partial_copy_sparc_generic\n\t"
 	" st %8, [%%sp + 64]\n"
 	: "=&r" (ret), "=&r" (d), "=&r" (l), "=&r" (s)
-	: "0" (ret), "1" (d), "2" (l), "3" (s), "r" (err)
+	: "0" (ret), "1" (d), "2" (l), "3" (s), "r" (&err)
 	: "o2", "o3", "o4", "o5", "o7", "g2", "g3", "g4", "g5",
 	  "cc", "memory");
-	return (__force __wsum)ret;
+	return err ? 0 : (__force __wsum)ret;
 }
 
 #define HAVE_CSUM_COPY_USER
 
 static inline __wsum
-csum_and_copy_to_user(const void *src, void __user *dst, int len,
-			  __wsum sum, int *err)
+csum_and_copy_to_user(const void *src, void __user *dst, int len)
 {
-	if (!access_ok(dst, len)) {
-		*err = -EFAULT;
-		return sum;
-	} else {
-		register unsigned long ret asm("o0") = (unsigned long)src;
-		register char __user *d asm("o1") = dst;
-		register int l asm("g1") = len;
-		register __wsum s asm("g7") = sum;
+	register unsigned long ret asm("o0") = (unsigned long)src;
+	register char __user *d asm("o1") = dst;
+	register int l asm("g1") = len;
+	register __wsum s asm("g7") = ~0U;
+	int err = 0;
 
-		__asm__ __volatile__ (
-		".section __ex_table,#alloc\n\t"
-		".align 4\n\t"
-		".word 1f,1\n\t"
-		".previous\n"
-		"1:\n\t"
-		"call __csum_partial_copy_sparc_generic\n\t"
-		" st %8, [%%sp + 64]\n"
-		: "=&r" (ret), "=&r" (d), "=&r" (l), "=&r" (s)
-		: "0" (ret), "1" (d), "2" (l), "3" (s), "r" (err)
-		: "o2", "o3", "o4", "o5", "o7",
-		  "g2", "g3", "g4", "g5",
-		  "cc", "memory");
-		return (__force __wsum)ret;
-	}
+	if (!access_ok(dst, len))
+		return 0;
+
+	__asm__ __volatile__ (
+	".section __ex_table,#alloc\n\t"
+	".align 4\n\t"
+	".word 1f,1\n\t"
+	".previous\n"
+	"1:\n\t"
+	"call __csum_partial_copy_sparc_generic\n\t"
+	" st %8, [%%sp + 64]\n"
+	: "=&r" (ret), "=&r" (d), "=&r" (l), "=&r" (s)
+	: "0" (ret), "1" (d), "2" (l), "3" (s), "r" (&err)
+	: "o2", "o3", "o4", "o5", "o7",
+	  "g2", "g3", "g4", "g5",
+	  "cc", "memory");
+	return err ? 0 : (__force __wsum)ret;
 }
 
 /* ihl is always 5 or greater, almost always is 5, and iph is word aligned
diff --git a/arch/sparc/include/asm/checksum_64.h b/arch/sparc/include/asm/checksum_64.h
index 7aebdbe3ac96..4d0bbff43e62 100644
--- a/arch/sparc/include/asm/checksum_64.h
+++ b/arch/sparc/include/asm/checksum_64.h
@@ -51,12 +51,11 @@ long __csum_partial_copy_from_user(const void __user *src,
 
 static inline __wsum
 csum_and_copy_from_user(const void __user *src,
-			    void *dst, int len,
-			    __wsum sum, int *err)
+			    void *dst, int len)
 {
-	long ret = __csum_partial_copy_from_user(src, dst, len, sum);
+	long ret = __csum_partial_copy_from_user(src, dst, len, ~0U);
 	if (ret < 0)
-		*err = -EFAULT;
+		return 0;
 	return (__force __wsum) ret;
 }
 
@@ -70,12 +69,11 @@ long __csum_partial_copy_to_user(const void *src,
 
 static inline __wsum
 csum_and_copy_to_user(const void *src,
-		      void __user *dst, int len,
-		      __wsum sum, int *err)
+		      void __user *dst, int len)
 {
-	long ret = __csum_partial_copy_to_user(src, dst, len, sum);
+	long ret = __csum_partial_copy_to_user(src, dst, len, ~0U);
 	if (ret < 0)
-		*err = -EFAULT;
+		return 0;
 	return (__force __wsum) ret;
 }
 
diff --git a/arch/x86/include/asm/checksum_32.h b/arch/x86/include/asm/checksum_32.h
index 137a3033edcc..5948cde9e4ad 100644
--- a/arch/x86/include/asm/checksum_32.h
+++ b/arch/x86/include/asm/checksum_32.h
@@ -44,22 +44,19 @@ static inline __wsum csum_partial_copy_nocheck(const void *src, void *dst, int l
 }
 
 static inline __wsum csum_and_copy_from_user(const void __user *src,
-					     void *dst, int len,
-					     __wsum sum, int *err_ptr)
+					     void *dst, int len)
 {
 	__wsum ret;
+	int err = 0;
 
 	might_sleep();
-	if (!user_access_begin(src, len)) {
-		if (len)
-			*err_ptr = -EFAULT;
-		return sum;
-	}
+	if (!user_access_begin(src, len))
+		return 0;
 	ret = csum_partial_copy_generic((__force void *)src, dst,
-					len, sum, err_ptr, NULL);
+					len, ~0U, &err, NULL);
 	user_access_end();
 
-	return ret;
+	return err ? 0 : ret;
 }
 
 /*
@@ -177,23 +174,19 @@ static inline __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
  */
 static inline __wsum csum_and_copy_to_user(const void *src,
 					   void __user *dst,
-					   int len, __wsum sum,
-					   int *err_ptr)
+					   int len)
 {
 	__wsum ret;
+	int err = 0;
 
 	might_sleep();
-	if (user_access_begin(dst, len)) {
-		ret = csum_partial_copy_generic(src, (__force void *)dst,
-						len, sum, NULL, err_ptr);
-		user_access_end();
-		return ret;
-	}
+	if (!user_access_begin(dst, len))
+		return 0;
 
-	if (len)
-		*err_ptr = -EFAULT;
-
-	return (__force __wsum)-1; /* invalid checksum */
+	ret = csum_partial_copy_generic(src, (__force void *)dst,
+					len, ~0U, NULL, &err);
+	user_access_end();
+	return err ? 0 : ret;
 }
 
 #endif /* _ASM_X86_CHECKSUM_32_H */
diff --git a/arch/x86/include/asm/checksum_64.h b/arch/x86/include/asm/checksum_64.h
index 5339f5dfc776..9af3aed54c6b 100644
--- a/arch/x86/include/asm/checksum_64.h
+++ b/arch/x86/include/asm/checksum_64.h
@@ -135,10 +135,8 @@ extern __visible __wsum csum_partial_copy_generic(const void *src, const void *d
 					int *src_err_ptr, int *dst_err_ptr);
 
 
-extern __wsum csum_and_copy_from_user(const void __user *src, void *dst,
-					  int len, __wsum isum, int *errp);
-extern __wsum csum_and_copy_to_user(const void *src, void __user *dst,
-					int len, __wsum isum, int *errp);
+extern __wsum csum_and_copy_from_user(const void __user *src, void *dst, int len);
+extern __wsum csum_and_copy_to_user(const void *src, void __user *dst, int len);
 extern __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len);
 
 /**
diff --git a/arch/x86/lib/csum-wrappers_64.c b/arch/x86/lib/csum-wrappers_64.c
index 245f929a1c2c..ae2fb87e2274 100644
--- a/arch/x86/lib/csum-wrappers_64.c
+++ b/arch/x86/lib/csum-wrappers_64.c
@@ -22,13 +22,15 @@
  */
 __wsum
 csum_and_copy_from_user(const void __user *src, void *dst,
-			    int len, __wsum isum, int *errp)
+			    int len)
 {
+	int err = 0;
+	__wsum isum = ~0U;
+
 	might_sleep();
-	*errp = 0;
 
 	if (!user_access_begin(src, len))
-		goto out_err;
+		return 0;
 
 	/*
 	 * Why 6, not 7? To handle odd addresses aligned we
@@ -53,20 +55,15 @@ csum_and_copy_from_user(const void __user *src, void *dst,
 		}
 	}
 	isum = csum_partial_copy_generic((__force const void *)src,
-				dst, len, isum, errp, NULL);
+				dst, len, isum, &err, NULL);
 	user_access_end();
-	if (unlikely(*errp))
-		goto out_err;
-
+	if (unlikely(err))
+		isum = 0;
 	return isum;
 
 out:
 	user_access_end();
-out_err:
-	*errp = -EFAULT;
-	memset(dst, 0, len);
-
-	return isum;
+	return 0;
 }
 EXPORT_SYMBOL(csum_and_copy_from_user);
 
@@ -83,16 +80,15 @@ EXPORT_SYMBOL(csum_and_copy_from_user);
  */
 __wsum
 csum_and_copy_to_user(const void *src, void __user *dst,
-			  int len, __wsum isum, int *errp)
+			  int len)
 {
-	__wsum ret;
+	__wsum ret, isum = ~0U;
+	int err = 0;
 
 	might_sleep();
 
-	if (!user_access_begin(dst, len)) {
-		*errp = -EFAULT;
+	if (!user_access_begin(dst, len))
 		return 0;
-	}
 
 	if (unlikely((unsigned long)dst & 6)) {
 		while (((unsigned long)dst & 6) && len >= 2) {
@@ -107,15 +103,13 @@ csum_and_copy_to_user(const void *src, void __user *dst,
 		}
 	}
 
-	*errp = 0;
 	ret = csum_partial_copy_generic(src, (void __force *)dst,
-					len, isum, NULL, errp);
+					len, isum, NULL, &err);
 	user_access_end();
-	return ret;
+	return err ? 0 : ret;
 out:
 	user_access_end();
-	*errp = -EFAULT;
-	return isum;
+	return 0;
 }
 EXPORT_SYMBOL(csum_and_copy_to_user);
 
diff --git a/arch/x86/um/asm/checksum_32.h b/arch/x86/um/asm/checksum_32.h
index b9ac7c9eb72c..0b13c2947ad1 100644
--- a/arch/x86/um/asm/checksum_32.h
+++ b/arch/x86/um/asm/checksum_32.h
@@ -35,27 +35,4 @@ static __inline__ __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
 	return csum_fold(sum);
 }
 
-/*
- *	Copy and checksum to user
- */
-#define HAVE_CSUM_COPY_USER
-static __inline__ __wsum csum_and_copy_to_user(const void *src,
-						     void __user *dst,
-						     int len, __wsum sum, int *err_ptr)
-{
-	if (access_ok(dst, len)) {
-		if (copy_to_user(dst, src, len)) {
-			*err_ptr = -EFAULT;
-			return (__force __wsum)-1;
-		}
-
-		return csum_partial(src, len, sum);
-	}
-
-	if (len)
-		*err_ptr = -EFAULT;
-
-	return (__force __wsum)-1; /* invalid checksum */
-}
-
 #endif
diff --git a/arch/xtensa/include/asm/checksum.h b/arch/xtensa/include/asm/checksum.h
index dc09448935bf..fe78fba7bd64 100644
--- a/arch/xtensa/include/asm/checksum.h
+++ b/arch/xtensa/include/asm/checksum.h
@@ -55,14 +55,17 @@ __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 static inline
 __wsum csum_and_copy_from_user(const void __user *src, void *dst,
-				   int len, __wsum sum, int *err_ptr)
+				   int len)
 {
-	if (access_ok(src, len))
-		return csum_partial_copy_generic((__force const void *)src, dst,
-					len, sum, err_ptr, NULL);
-	if (len)
-		*err_ptr = -EFAULT;
-	return sum;
+	int err = 0;
+	__wsum sum;
+
+	if (!access_ok(src, len))
+		return 0;
+
+	sum = csum_partial_copy_generic((__force const void *)src, dst,
+					len, ~0U, &err, NULL);
+	return err ? 0 : sum;
 }
 
 /*
@@ -243,15 +245,15 @@ static __inline__ __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
  */
 #define HAVE_CSUM_COPY_USER
 static __inline__ __wsum csum_and_copy_to_user(const void *src,
-					       void __user *dst, int len,
-					       __wsum sum, int *err_ptr)
+					       void __user *dst, int len)
 {
-	if (access_ok(dst, len))
-		return csum_partial_copy_generic(src,dst,len,sum,NULL,err_ptr);
+	int err = 0;
+	__wsum sum = ~0U;
 
-	if (len)
-		*err_ptr = -EFAULT;
+	if (!access_ok(dst, len))
+		return 0;
 
-	return (__force __wsum)-1; /* invalid checksum */
+	sum = csum_partial_copy_generic(src,dst,len,sum,NULL,&err);
+	return err ? 0 : sum;
 }
 #endif
diff --git a/include/net/checksum.h b/include/net/checksum.h
index 1029191986e3..0d05b9e8690b 100644
--- a/include/net/checksum.h
+++ b/include/net/checksum.h
@@ -24,26 +24,23 @@
 #ifndef _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 static inline
 __wsum csum_and_copy_from_user (const void __user *src, void *dst,
-				      int len, __wsum sum, int *err_ptr)
+				      int len)
 {
 	if (copy_from_user(dst, src, len))
-		*err_ptr = -EFAULT;
-	return csum_partial(dst, len, sum);
+		return 0;
+	return csum_partial(dst, len, ~0U);
 }
 #endif
 
 #ifndef HAVE_CSUM_COPY_USER
 static __inline__ __wsum csum_and_copy_to_user
-(const void *src, void __user *dst, int len, __wsum sum, int *err_ptr)
+(const void *src, void __user *dst, int len)
 {
-	sum = csum_partial(src, len, sum);
+	__wsum sum = csum_partial(src, len, ~0U);
 
 	if (copy_to_user(dst, src, len) == 0)
 		return sum;
-	if (len)
-		*err_ptr = -EFAULT;
-
-	return (__force __wsum)-1; /* invalid checksum */
+	return 0;
 }
 #endif
 
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index d5b7e204fea6..eccb0fe5a498 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1448,15 +1448,14 @@ size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
 		return 0;
 	}
 	iterate_and_advance(i, bytes, v, ({
-		int err = 0;
 		next = csum_and_copy_from_user(v.iov_base,
 					       (to += v.iov_len) - v.iov_len,
-					       v.iov_len, ~0U, &err);
-		if (!err) {
+					       v.iov_len);
+		if (next) {
 			sum = csum_block_add(sum, next, off);
 			off += v.iov_len;
 		}
-		err ? v.iov_len : 0;
+		next ? 0 : v.iov_len;
 	}), ({
 		char *p = kmap_atomic(v.bv_page);
 		sum = csum_and_memcpy((to += v.bv_len) - v.bv_len,
@@ -1490,11 +1489,10 @@ bool csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum,
 	if (unlikely(i->count < bytes))
 		return false;
 	iterate_all_kinds(i, bytes, v, ({
-		int err = 0;
 		next = csum_and_copy_from_user(v.iov_base,
 					       (to += v.iov_len) - v.iov_len,
-					       v.iov_len, ~0U, &err);
-		if (err)
+					       v.iov_len);
+		if (!next)
 			return false;
 		sum = csum_block_add(sum, next, off);
 		off += v.iov_len;
@@ -1536,15 +1534,14 @@ size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *csump,
 		return 0;
 	}
 	iterate_and_advance(i, bytes, v, ({
-		int err = 0;
 		next = csum_and_copy_to_user((from += v.iov_len) - v.iov_len,
 					     v.iov_base,
-					     v.iov_len, ~0U, &err);
-		if (!err) {
+					     v.iov_len);
+		if (next) {
 			sum = csum_block_add(sum, next, off);
 			off += v.iov_len;
 		}
-		err ? v.iov_len : 0;
+		next ? 0 : v.iov_len;
 	}), ({
 		char *p = kmap_atomic(v.bv_page);
 		sum = csum_and_memcpy(p + v.bv_offset,
-- 
2.11.0

* [PATCH v2 07/20] saner calling conventions for csum_and_copy_..._user()
  2020-07-24  1:25     ` [PATCH v2 07/20] saner calling conventions for csum_and_copy_..._user() Al Viro
@ 2020-07-24  1:25       ` Al Viro
  0 siblings, 0 replies; 102+ messages in thread
From: Al Viro @ 2020-07-24  1:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

All callers of these primitives will
	* discard anything we might've copied in case of error
	* ignore the csum value in case of error
	* always pass 0xffffffff as the initial sum, so the
resulting csum value (in case of success, that is) will never be 0.
	* always pass a positive length.

That suggests the following calling conventions:
	* don't pass err_ptr - just return 0 on error.
	* don't bother with zeroing destination, etc. in case of error
	* don't pass the initial sum - just use 0xffffffff.
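
A userspace sketch of the resulting convention, for illustration only (this is
not kernel code): the model_* names are made up for the example, a NULL source
stands in for a faulting user pointer, and the checksum is a plain 32-bit ones'
complement sum over big-endian 16-bit words.  Adding with end-around carry to a
non-zero value never yields 0, so starting from 0xffffffff leaves 0 free to
mean "failed".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit ones' complement addition: the carry out is added back in. */
static uint32_t model_add32(uint32_t a, uint32_t b)
{
	uint64_t s = (uint64_t)a + b;

	return (uint32_t)s + (uint32_t)(s >> 32);
}

/* Hypothetical stand-in for csum_partial(): sums big-endian 16-bit words. */
static uint32_t model_csum_partial(const void *buf, size_t len, uint32_t sum)
{
	const unsigned char *p = buf;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum = model_add32(sum, ((uint32_t)p[i] << 8) | p[i + 1]);
	if (i < len)	/* odd trailing byte, padded with zero */
		sum = model_add32(sum, (uint32_t)p[i] << 8);
	return sum;
}

/*
 * Hypothetical stand-in for copy_from_user(): returns the number of
 * bytes NOT copied.  A NULL source models a faulting user pointer.
 */
static size_t model_copy_from_user(void *dst, const void *src, size_t len)
{
	if (!src)
		return len;
	memcpy(dst, src, len);
	return 0;
}

/* New convention: no initial sum, no error pointer, 0 means failure. */
static uint32_t model_csum_and_copy_from_user(const void *src, void *dst,
					       size_t len)
{
	if (model_copy_from_user(dst, src, len))
		return 0;
	return model_csum_partial(dst, len, 0xffffffffu);
}

/* Small self-check showing how a caller tells success from failure. */
void model_selfcheck(void)
{
	unsigned char in[4] = { 1, 2, 3, 4 }, out[4] = { 0 };
	uint32_t next;

	next = model_csum_and_copy_from_user(in, out, sizeof(in));
	assert(next != 0);			/* success is never 0 */
	assert(memcmp(in, out, sizeof(in)) == 0);

	next = model_csum_and_copy_from_user(NULL, out, sizeof(out));
	assert(next == 0);			/* fault: result is 0 */
}
```

A caller only needs "if (!next)" to detect failure; there is no separate error
variable to initialize or inspect.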

This commit does the minimal conversion in the instances of csum_and_copy_...();
the changes to the actual asm code behind them are done later in the series.
Note that this asm code is often shared with csum_partial_copy_nocheck();
the difference is that csum_partial_copy_nocheck() passes 0 for initial
sum while csum_and_copy_..._user() pass 0xffffffff.  Fortunately, we are
free to pass 0xffffffff in all cases and subsequent patches will use that
freedom without any special comments.
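
Why that freedom exists can be checked with a small userspace sketch (the
model_* names are hypothetical, not kernel helpers, and the summing loop is a
simplified model rather than any architecture's implementation): 0xffffffff is
65535 * 65537, hence congruent to 0 mod 65535, so sums started from 0 and from
0xffffffff agree mod 65535 even when the 32-bit values differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit ones' complement addition: the carry out is added back in. */
static uint32_t model_add32(uint32_t a, uint32_t b)
{
	uint64_t s = (uint64_t)a + b;

	return (uint32_t)s + (uint32_t)(s >> 32);
}

/* Simplified model of a checksum loop over big-endian 16-bit words. */
static uint32_t model_csum(const unsigned char *p, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum = model_add32(sum, ((uint32_t)p[i] << 8) | p[i + 1]);
	if (i < len)	/* odd trailing byte, padded with zero */
		sum = model_add32(sum, (uint32_t)p[i] << 8);
	return sum;
}

/* Fold a 32-bit sum into 16 bits, adding the carries back in. */
static uint16_t model_fold(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Non-zero if initial sums 0 and 0xffffffff agree mod 65535 on this data. */
int model_same_mod_65535(const unsigned char *p, size_t len)
{
	uint32_t a = model_csum(p, len, 0);
	uint32_t b = model_csum(p, len, 0xffffffffu);

	return model_fold(a) % 65535u == model_fold(b) % 65535u;
}
```

For an all-zero buffer the two 32-bit results are 0 and 0xffffffff; they
differ as raw values but are the same checksum, which is all a __wsum value
promises.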

A part that could be split off: parisc and uml/i386 claimed to have
csum_and_copy_to_user() instances of their own, but those were identical
to the generic one, so we simply drop them.  Not sure if it's worth
a separate commit...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/alpha/include/asm/checksum.h    |  2 +-
 arch/alpha/lib/csum_partial_copy.c   | 25 ++++++-------
 arch/arm/include/asm/checksum.h      | 13 ++++---
 arch/m68k/include/asm/checksum.h     |  3 +-
 arch/m68k/lib/checksum.c             |  8 ++---
 arch/mips/include/asm/checksum.h     | 46 ++++++++++++------------
 arch/parisc/include/asm/checksum.h   | 20 -----------
 arch/powerpc/include/asm/checksum.h  |  4 +--
 arch/powerpc/lib/checksum_wrappers.c | 68 +++++++++++-------------------------
 arch/sh/include/asm/checksum_32.h    | 36 +++++++++----------
 arch/sparc/include/asm/checksum_32.h | 65 ++++++++++++++++------------------
 arch/sparc/include/asm/checksum_64.h | 14 ++++----
 arch/x86/include/asm/checksum_32.h   | 35 ++++++++-----------
 arch/x86/include/asm/checksum_64.h   |  6 ++--
 arch/x86/lib/csum-wrappers_64.c      | 38 +++++++++-----------
 arch/x86/um/asm/checksum_32.h        | 23 ------------
 arch/xtensa/include/asm/checksum.h   | 30 ++++++++--------
 include/net/checksum.h               | 15 ++++----
 lib/iov_iter.c                       | 19 +++++-----
 19 files changed, 183 insertions(+), 287 deletions(-)

diff --git a/arch/alpha/include/asm/checksum.h b/arch/alpha/include/asm/checksum.h
index 84f9faea864a..99d631e146b2 100644
--- a/arch/alpha/include/asm/checksum.h
+++ b/arch/alpha/include/asm/checksum.h
@@ -43,7 +43,7 @@ extern __wsum csum_partial(const void *buff, int len, __wsum sum);
  */
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 #define _HAVE_ARCH_CSUM_AND_COPY
-__wsum csum_and_copy_from_user(const void __user *src, void *dst, int len, __wsum sum, int *errp);
+__wsum csum_and_copy_from_user(const void __user *src, void *dst, int len);
 
 __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len);
 
diff --git a/arch/alpha/lib/csum_partial_copy.c b/arch/alpha/lib/csum_partial_copy.c
index f363dc89fcbe..3c0e89c39ddb 100644
--- a/arch/alpha/lib/csum_partial_copy.c
+++ b/arch/alpha/lib/csum_partial_copy.c
@@ -325,30 +325,27 @@ csum_partial_cfu_unaligned(const unsigned long __user * src,
 }
 
 __wsum
-csum_and_copy_from_user(const void __user *src, void *dst, int len,
-			       __wsum sum, int *errp)
+csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
-	unsigned long checksum = (__force u32) sum;
+	unsigned long checksum = ~0U;
 	unsigned long soff = 7 & (unsigned long) src;
 	unsigned long doff = 7 & (unsigned long) dst;
+	int err = 0;
 
 	if (len) {
-		if (!access_ok(src, len)) {
-			if (errp) *errp = -EFAULT;
-			memset(dst, 0, len);
-			return sum;
-		}
+		if (!access_ok(src, len))
+			return 0;
 		if (!doff) {
 			if (!soff)
 				checksum = csum_partial_cfu_aligned(
 					(const unsigned long __user *) src,
 					(unsigned long *) dst,
-					len-8, checksum, errp);
+					len-8, checksum, &err);
 			else
 				checksum = csum_partial_cfu_dest_aligned(
 					(const unsigned long __user *) src,
 					(unsigned long *) dst,
-					soff, len-8, checksum, errp);
+					soff, len-8, checksum, &err);
 		} else {
 			unsigned long partial_dest;
 			ldq_u(partial_dest, dst);
@@ -357,15 +354,15 @@ csum_and_copy_from_user(const void __user *src, void *dst, int len,
 					(const unsigned long __user *) src,
 					(unsigned long *) dst,
 					doff, len-8, checksum,
-					partial_dest, errp);
+					partial_dest, &err);
 			else
 				checksum = csum_partial_cfu_unaligned(
 					(const unsigned long __user *) src,
 					(unsigned long *) dst,
 					soff, doff, len-8, checksum,
-					partial_dest, errp);
+					partial_dest, &err);
 		}
-		checksum = from64to16 (checksum);
+		checksum = err ? 0 : from64to16 (checksum);
 	}
 	return (__force __wsum)checksum;
 }
@@ -378,7 +375,7 @@ csum_partial_copy_nocheck(const void *src, void *dst, int len)
 	mm_segment_t oldfs = get_fs();
 	set_fs(KERNEL_DS);
 	checksum = csum_and_copy_from_user((__force const void __user *)src,
-						dst, len, 0, NULL);
+						dst, len);
 	set_fs(oldfs);
 	return checksum;
 }
diff --git a/arch/arm/include/asm/checksum.h b/arch/arm/include/asm/checksum.h
index 7612b2bd4e9b..1601c132b064 100644
--- a/arch/arm/include/asm/checksum.h
+++ b/arch/arm/include/asm/checksum.h
@@ -43,16 +43,16 @@ csum_partial_copy_from_user(const void __user *src, void *dst, int len, __wsum s
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 #define _HAVE_ARCH_CSUM_AND_COPY
 static inline
-__wsum csum_and_copy_from_user (const void __user *src, void *dst,
-				      int len, __wsum sum, int *err_ptr)
+__wsum csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
-	if (access_ok(src, len))
-		return csum_partial_copy_from_user(src, dst, len, sum, err_ptr);
+	int err = 0;
+	__wsum sum;
 
-	if (len)
-		*err_ptr = -EFAULT;
+	if (!access_ok(src, len))
+		return 0;
 
-	return sum;
+	sum = csum_partial_copy_from_user(src, dst, len, ~0U, &err);
+	return err ? 0 : sum;
 }
 
 /*
diff --git a/arch/m68k/include/asm/checksum.h b/arch/m68k/include/asm/checksum.h
index d5e74c64b6cd..692e7b6cc042 100644
--- a/arch/m68k/include/asm/checksum.h
+++ b/arch/m68k/include/asm/checksum.h
@@ -34,8 +34,7 @@ __wsum csum_partial(const void *buff, int len, __wsum sum);
 #define _HAVE_ARCH_CSUM_AND_COPY
 extern __wsum csum_and_copy_from_user(const void __user *src,
 						void *dst,
-						int len, __wsum sum,
-						int *csum_err);
+						int len);
 
 extern __wsum csum_partial_copy_nocheck(const void *src,
 					      void *dst, int len);
diff --git a/arch/m68k/lib/checksum.c b/arch/m68k/lib/checksum.c
index 86ddd2ee187d..3aeca261f622 100644
--- a/arch/m68k/lib/checksum.c
+++ b/arch/m68k/lib/checksum.c
@@ -129,8 +129,7 @@ EXPORT_SYMBOL(csum_partial);
  */
 
 __wsum
-csum_and_copy_from_user(const void __user *src, void *dst,
-			    int len, __wsum sum, int *csum_err)
+csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
 	/*
 	 * GCC doesn't like more than 10 operands for the asm
@@ -138,6 +137,7 @@ csum_and_copy_from_user(const void __user *src, void *dst,
 	 * code.
 	 */
 	unsigned long tmp1, tmp2;
+	__wsum sum = ~0U;
 
 	__asm__("movel %2,%4\n\t"
 		"btst #1,%4\n\t"	/* Check alignment */
@@ -311,9 +311,7 @@ csum_and_copy_from_user(const void __user *src, void *dst,
 		: "0" (sum), "1" (len), "2" (src), "3" (dst)
 	    );
 
-	*csum_err = tmp2;
-
-	return(sum);
+	return tmp2 ? 0 : sum;
 }
 
 EXPORT_SYMBOL(csum_and_copy_from_user);
diff --git a/arch/mips/include/asm/checksum.h b/arch/mips/include/asm/checksum.h
index 63dfe08262b1..b882cacea3ee 100644
--- a/arch/mips/include/asm/checksum.h
+++ b/arch/mips/include/asm/checksum.h
@@ -60,16 +60,15 @@ __wsum csum_partial_copy_from_user(const void __user *src, void *dst, int len,
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 static inline
-__wsum csum_and_copy_from_user(const void __user *src, void *dst,
-			       int len, __wsum sum, int *err_ptr)
+__wsum csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
-	if (access_ok(src, len))
-		return csum_partial_copy_from_user(src, dst, len, sum,
-						   err_ptr);
-	if (len)
-		*err_ptr = -EFAULT;
+	__wsum sum = ~0U;
+	int err = 0;
 
-	return sum;
+	if (!access_ok(src, len))
+		return 0;
+	sum = csum_partial_copy_from_user(src, dst, len, sum, &err);
+	return err ? 0 : sum;
 }
 
 /*
@@ -77,24 +76,23 @@ __wsum csum_and_copy_from_user(const void __user *src, void *dst,
  */
 #define HAVE_CSUM_COPY_USER
 static inline
-__wsum csum_and_copy_to_user(const void *src, void __user *dst, int len,
-			     __wsum sum, int *err_ptr)
+__wsum csum_and_copy_to_user(const void *src, void __user *dst, int len)
 {
+	int err = 0;
+	__wsum sum = ~0U;
+
 	might_fault();
-	if (access_ok(dst, len)) {
-		if (uaccess_kernel())
-			return __csum_partial_copy_kernel(src,
-							  (__force void *)dst,
-							  len, sum, err_ptr);
-		else
-			return __csum_partial_copy_to_user(src,
-							   (__force void *)dst,
-							   len, sum, err_ptr);
-	}
-	if (len)
-		*err_ptr = -EFAULT;
-
-	return (__force __wsum)-1; /* invalid checksum */
+	if (!access_ok(dst, len))
+		return 0;
+	if (uaccess_kernel())
+		sum = __csum_partial_copy_kernel(src,
+						  (__force void *)dst,
+						  len, sum, &err);
+	else
+		sum = __csum_partial_copy_to_user(src,
+						   (__force void *)dst,
+						   len, sum, &err);
+	return err ? 0 : sum;
 }
 
 /*
diff --git a/arch/parisc/include/asm/checksum.h b/arch/parisc/include/asm/checksum.h
index 522cd574c068..3c43baca7b39 100644
--- a/arch/parisc/include/asm/checksum.h
+++ b/arch/parisc/include/asm/checksum.h
@@ -173,25 +173,5 @@ static __inline__ __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
 	return csum_fold(sum);
 }
 
-/* 
- *	Copy and checksum to user
- */
-#define HAVE_CSUM_COPY_USER
-static __inline__ __wsum csum_and_copy_to_user(const void *src,
-						      void __user *dst,
-						      int len, __wsum sum,
-						      int *err_ptr)
-{
-	/* code stolen from include/asm-mips64 */
-	sum = csum_partial(src, len, sum);
-	 
-	if (copy_to_user(dst, src, len)) {
-		*err_ptr = -EFAULT;
-		return (__force __wsum)-1;
-	}
-
-	return sum;
-}
-
 #endif
 
diff --git a/arch/powerpc/include/asm/checksum.h b/arch/powerpc/include/asm/checksum.h
index 64299785f639..dba685d984c0 100644
--- a/arch/powerpc/include/asm/checksum.h
+++ b/arch/powerpc/include/asm/checksum.h
@@ -24,10 +24,10 @@ extern __wsum csum_partial_copy_generic(const void *src, void *dst,
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 extern __wsum csum_and_copy_from_user(const void __user *src, void *dst,
-				      int len, __wsum sum, int *err_ptr);
+				      int len);
 #define HAVE_CSUM_COPY_USER
 extern __wsum csum_and_copy_to_user(const void *src, void __user *dst,
-				    int len, __wsum sum, int *err_ptr);
+				    int len);
 
 #define _HAVE_ARCH_CSUM_AND_COPY
 #define csum_partial_copy_nocheck(src, dst, len)   \
diff --git a/arch/powerpc/lib/checksum_wrappers.c b/arch/powerpc/lib/checksum_wrappers.c
index fabe4db28726..b1faa82dd8af 100644
--- a/arch/powerpc/lib/checksum_wrappers.c
+++ b/arch/powerpc/lib/checksum_wrappers.c
@@ -12,82 +12,56 @@
 #include <linux/uaccess.h>
 
 __wsum csum_and_copy_from_user(const void __user *src, void *dst,
-			       int len, __wsum sum, int *err_ptr)
+			       int len)
 {
 	unsigned int csum;
+	int err = 0;
 
 	might_sleep();
-	allow_read_from_user(src, len);
-
-	*err_ptr = 0;
 
-	if (!len) {
-		csum = 0;
-		goto out;
-	}
+	if (unlikely(!access_ok(src, len)))
+		return 0;
 
-	if (unlikely((len < 0) || !access_ok(src, len))) {
-		*err_ptr = -EFAULT;
-		csum = (__force unsigned int)sum;
-		goto out;
-	}
+	allow_read_from_user(src, len);
 
 	csum = csum_partial_copy_generic((void __force *)src, dst,
-					 len, sum, err_ptr, NULL);
+					 len, ~0U, &err, NULL);
 
-	if (unlikely(*err_ptr)) {
+	if (unlikely(err)) {
 		int missing = __copy_from_user(dst, src, len);
 
-		if (missing) {
-			memset(dst + len - missing, 0, missing);
-			*err_ptr = -EFAULT;
-		} else {
-			*err_ptr = 0;
-		}
-
-		csum = csum_partial(dst, len, sum);
+		if (missing)
+			csum = 0;
+		else
+			csum = csum_partial(dst, len, ~0U);
 	}
 
-out:
 	prevent_read_from_user(src, len);
 	return (__force __wsum)csum;
 }
 EXPORT_SYMBOL(csum_and_copy_from_user);
 
-__wsum csum_and_copy_to_user(const void *src, void __user *dst, int len,
-			     __wsum sum, int *err_ptr)
+__wsum csum_and_copy_to_user(const void *src, void __user *dst, int len)
 {
 	unsigned int csum;
+	int err = 0;
 
 	might_sleep();
-	allow_write_to_user(dst, len);
-
-	*err_ptr = 0;
-
-	if (!len) {
-		csum = 0;
-		goto out;
-	}
+	if (unlikely(!access_ok(dst, len)))
+		return 0;
 
-	if (unlikely((len < 0) || !access_ok(dst, len))) {
-		*err_ptr = -EFAULT;
-		csum = -1; /* invalid checksum */
-		goto out;
-	}
+	allow_write_to_user(dst, len);
 
 	csum = csum_partial_copy_generic(src, (void __force *)dst,
-					 len, sum, NULL, err_ptr);
+					 len, ~0U, NULL, &err);
 
-	if (unlikely(*err_ptr)) {
-		csum = csum_partial(src, len, sum);
+	if (unlikely(err)) {
+		csum = csum_partial(src, len, ~0U);
 
-		if (copy_to_user(dst, src, len)) {
-			*err_ptr = -EFAULT;
-			csum = -1; /* invalid checksum */
-		}
+		if (copy_to_user(dst, src, len))
+			csum = 0;
 	}
 
-out:
 	prevent_write_to_user(dst, len);
 	return (__force __wsum)csum;
 }
diff --git a/arch/sh/include/asm/checksum_32.h b/arch/sh/include/asm/checksum_32.h
index e8bf84d3b843..a08e8eb924ed 100644
--- a/arch/sh/include/asm/checksum_32.h
+++ b/arch/sh/include/asm/checksum_32.h
@@ -50,15 +50,16 @@ __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 static inline
-__wsum csum_and_copy_from_user(const void __user *src, void *dst,
-				   int len, __wsum sum, int *err_ptr)
+__wsum csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
-	if (access_ok(src, len))
-		return csum_partial_copy_generic((__force const void *)src, dst,
-					len, sum, err_ptr, NULL);
-	if (len)
-		*err_ptr = -EFAULT;
-	return sum;
+	int err = 0;
+	__wsum sum = ~0U;
+
+	if (!access_ok(src, len))
+		return 0;
+	sum = csum_partial_copy_generic((__force const void *)src, dst,
+					len, sum, &err, NULL);
+	return err ? 0 : sum;
 }
 
 /*
@@ -199,16 +200,15 @@ static inline __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
 #define HAVE_CSUM_COPY_USER
 static inline __wsum csum_and_copy_to_user(const void *src,
 					   void __user *dst,
-					   int len, __wsum sum,
-					   int *err_ptr)
+					   int len)
 {
-	if (access_ok(dst, len))
-		return csum_partial_copy_generic((__force const void *)src,
-						dst, len, sum, NULL, err_ptr);
-
-	if (len)
-		*err_ptr = -EFAULT;
-
-	return (__force __wsum)-1; /* invalid checksum */
+	int err = 0;
+	__wsum sum = ~0U;
+
+	if (!access_ok(dst, len))
+		return 0;
+	sum = csum_partial_copy_generic((__force const void *)src,
+						dst, len, sum, NULL, &err);
+	return err ? 0 : sum;
 }
 #endif /* __ASM_SH_CHECKSUM_H */
diff --git a/arch/sparc/include/asm/checksum_32.h b/arch/sparc/include/asm/checksum_32.h
index d21d114436ba..b5873b7b7bf0 100644
--- a/arch/sparc/include/asm/checksum_32.h
+++ b/arch/sparc/include/asm/checksum_32.h
@@ -60,19 +60,16 @@ csum_partial_copy_nocheck(const void *src, void *dst, int len)
 }
 
 static inline __wsum
-csum_and_copy_from_user(const void __user *src, void *dst, int len,
-			    __wsum sum, int *err)
+csum_and_copy_from_user(const void __user *src, void *dst, int len)
   {
 	register unsigned long ret asm("o0") = (unsigned long)src;
 	register char *d asm("o1") = dst;
 	register int l asm("g1") = len;
-	register __wsum s asm("g7") = sum;
+	register __wsum s asm("g7") = ~0U;
+	int err = 0;
 
-	if (unlikely(!access_ok(src, len))) {
-		if (len)
-			*err = -EFAULT;
-		return sum;
-	}
+	if (unlikely(!access_ok(src, len)))
+		return 0;
 
 	__asm__ __volatile__ (
 	".section __ex_table,#alloc\n\t"
@@ -83,42 +80,40 @@ csum_and_copy_from_user(const void __user *src, void *dst, int len,
 	"call __csum_partial_copy_sparc_generic\n\t"
 	" st %8, [%%sp + 64]\n"
 	: "=&r" (ret), "=&r" (d), "=&r" (l), "=&r" (s)
-	: "0" (ret), "1" (d), "2" (l), "3" (s), "r" (err)
+	: "0" (ret), "1" (d), "2" (l), "3" (s), "r" (&err)
 	: "o2", "o3", "o4", "o5", "o7", "g2", "g3", "g4", "g5",
 	  "cc", "memory");
-	return (__force __wsum)ret;
+	return err ? 0 : (__force __wsum)ret;
 }
 
 #define HAVE_CSUM_COPY_USER
 
 static inline __wsum
-csum_and_copy_to_user(const void *src, void __user *dst, int len,
-			  __wsum sum, int *err)
+csum_and_copy_to_user(const void *src, void __user *dst, int len)
 {
-	if (!access_ok(dst, len)) {
-		*err = -EFAULT;
-		return sum;
-	} else {
-		register unsigned long ret asm("o0") = (unsigned long)src;
-		register char __user *d asm("o1") = dst;
-		register int l asm("g1") = len;
-		register __wsum s asm("g7") = sum;
+	register unsigned long ret asm("o0") = (unsigned long)src;
+	register char __user *d asm("o1") = dst;
+	register int l asm("g1") = len;
+	register __wsum s asm("g7") = ~0U;
+	int err = 0;
 
-		__asm__ __volatile__ (
-		".section __ex_table,#alloc\n\t"
-		".align 4\n\t"
-		".word 1f,1\n\t"
-		".previous\n"
-		"1:\n\t"
-		"call __csum_partial_copy_sparc_generic\n\t"
-		" st %8, [%%sp + 64]\n"
-		: "=&r" (ret), "=&r" (d), "=&r" (l), "=&r" (s)
-		: "0" (ret), "1" (d), "2" (l), "3" (s), "r" (err)
-		: "o2", "o3", "o4", "o5", "o7",
-		  "g2", "g3", "g4", "g5",
-		  "cc", "memory");
-		return (__force __wsum)ret;
-	}
+	if (!access_ok(dst, len))
+		return 0;
+
+	__asm__ __volatile__ (
+	".section __ex_table,#alloc\n\t"
+	".align 4\n\t"
+	".word 1f,1\n\t"
+	".previous\n"
+	"1:\n\t"
+	"call __csum_partial_copy_sparc_generic\n\t"
+	" st %8, [%%sp + 64]\n"
+	: "=&r" (ret), "=&r" (d), "=&r" (l), "=&r" (s)
+	: "0" (ret), "1" (d), "2" (l), "3" (s), "r" (&err)
+	: "o2", "o3", "o4", "o5", "o7",
+	  "g2", "g3", "g4", "g5",
+	  "cc", "memory");
+	return err ? 0 : (__force __wsum)ret;
 }
 
 /* ihl is always 5 or greater, almost always is 5, and iph is word aligned
diff --git a/arch/sparc/include/asm/checksum_64.h b/arch/sparc/include/asm/checksum_64.h
index 7aebdbe3ac96..4d0bbff43e62 100644
--- a/arch/sparc/include/asm/checksum_64.h
+++ b/arch/sparc/include/asm/checksum_64.h
@@ -51,12 +51,11 @@ long __csum_partial_copy_from_user(const void __user *src,
 
 static inline __wsum
 csum_and_copy_from_user(const void __user *src,
-			    void *dst, int len,
-			    __wsum sum, int *err)
+			    void *dst, int len)
 {
-	long ret = __csum_partial_copy_from_user(src, dst, len, sum);
+	long ret = __csum_partial_copy_from_user(src, dst, len, ~0U);
 	if (ret < 0)
-		*err = -EFAULT;
+		return 0;
 	return (__force __wsum) ret;
 }
 
@@ -70,12 +69,11 @@ long __csum_partial_copy_to_user(const void *src,
 
 static inline __wsum
 csum_and_copy_to_user(const void *src,
-		      void __user *dst, int len,
-		      __wsum sum, int *err)
+		      void __user *dst, int len)
 {
-	long ret = __csum_partial_copy_to_user(src, dst, len, sum);
+	long ret = __csum_partial_copy_to_user(src, dst, len, ~0U);
 	if (ret < 0)
-		*err = -EFAULT;
+		return 0;
 	return (__force __wsum) ret;
 }
 
diff --git a/arch/x86/include/asm/checksum_32.h b/arch/x86/include/asm/checksum_32.h
index 137a3033edcc..5948cde9e4ad 100644
--- a/arch/x86/include/asm/checksum_32.h
+++ b/arch/x86/include/asm/checksum_32.h
@@ -44,22 +44,19 @@ static inline __wsum csum_partial_copy_nocheck(const void *src, void *dst, int l
 }
 
 static inline __wsum csum_and_copy_from_user(const void __user *src,
-					     void *dst, int len,
-					     __wsum sum, int *err_ptr)
+					     void *dst, int len)
 {
 	__wsum ret;
+	int err = 0;
 
 	might_sleep();
-	if (!user_access_begin(src, len)) {
-		if (len)
-			*err_ptr = -EFAULT;
-		return sum;
-	}
+	if (!user_access_begin(src, len))
+		return 0;
 	ret = csum_partial_copy_generic((__force void *)src, dst,
-					len, sum, err_ptr, NULL);
+					len, ~0U, &err, NULL);
 	user_access_end();
 
-	return ret;
+	return err ? 0 : ret;
 }
 
 /*
@@ -177,23 +174,19 @@ static inline __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
  */
 static inline __wsum csum_and_copy_to_user(const void *src,
 					   void __user *dst,
-					   int len, __wsum sum,
-					   int *err_ptr)
+					   int len)
 {
 	__wsum ret;
+	int err = 0;
 
 	might_sleep();
-	if (user_access_begin(dst, len)) {
-		ret = csum_partial_copy_generic(src, (__force void *)dst,
-						len, sum, NULL, err_ptr);
-		user_access_end();
-		return ret;
-	}
+	if (!user_access_begin(dst, len))
+		return 0;
 
-	if (len)
-		*err_ptr = -EFAULT;
-
-	return (__force __wsum)-1; /* invalid checksum */
+	ret = csum_partial_copy_generic(src, (__force void *)dst,
+					len, ~0U, NULL, &err);
+	user_access_end();
+	return err ? 0 : ret;
 }
 
 #endif /* _ASM_X86_CHECKSUM_32_H */
diff --git a/arch/x86/include/asm/checksum_64.h b/arch/x86/include/asm/checksum_64.h
index 5339f5dfc776..9af3aed54c6b 100644
--- a/arch/x86/include/asm/checksum_64.h
+++ b/arch/x86/include/asm/checksum_64.h
@@ -135,10 +135,8 @@ extern __visible __wsum csum_partial_copy_generic(const void *src, const void *d
 					int *src_err_ptr, int *dst_err_ptr);
 
 
-extern __wsum csum_and_copy_from_user(const void __user *src, void *dst,
-					  int len, __wsum isum, int *errp);
-extern __wsum csum_and_copy_to_user(const void *src, void __user *dst,
-					int len, __wsum isum, int *errp);
+extern __wsum csum_and_copy_from_user(const void __user *src, void *dst, int len);
+extern __wsum csum_and_copy_to_user(const void *src, void __user *dst, int len);
 extern __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len);
 
 /**
diff --git a/arch/x86/lib/csum-wrappers_64.c b/arch/x86/lib/csum-wrappers_64.c
index 245f929a1c2c..ae2fb87e2274 100644
--- a/arch/x86/lib/csum-wrappers_64.c
+++ b/arch/x86/lib/csum-wrappers_64.c
@@ -22,13 +22,15 @@
  */
 __wsum
 csum_and_copy_from_user(const void __user *src, void *dst,
-			    int len, __wsum isum, int *errp)
+			    int len)
 {
+	int err = 0;
+	__wsum isum = ~0U;
+
 	might_sleep();
-	*errp = 0;
 
 	if (!user_access_begin(src, len))
-		goto out_err;
+		return 0;
 
 	/*
 	 * Why 6, not 7? To handle odd addresses aligned we
@@ -53,20 +55,15 @@ csum_and_copy_from_user(const void __user *src, void *dst,
 		}
 	}
 	isum = csum_partial_copy_generic((__force const void *)src,
-				dst, len, isum, errp, NULL);
+				dst, len, isum, &err, NULL);
 	user_access_end();
-	if (unlikely(*errp))
-		goto out_err;
-
+	if (unlikely(err))
+		isum = 0;
 	return isum;
 
 out:
 	user_access_end();
-out_err:
-	*errp = -EFAULT;
-	memset(dst, 0, len);
-
-	return isum;
+	return 0;
 }
 EXPORT_SYMBOL(csum_and_copy_from_user);
 
@@ -83,16 +80,15 @@ EXPORT_SYMBOL(csum_and_copy_from_user);
  */
 __wsum
 csum_and_copy_to_user(const void *src, void __user *dst,
-			  int len, __wsum isum, int *errp)
+			  int len)
 {
-	__wsum ret;
+	__wsum ret, isum = ~0U;
+	int err = 0;
 
 	might_sleep();
 
-	if (!user_access_begin(dst, len)) {
-		*errp = -EFAULT;
+	if (!user_access_begin(dst, len))
 		return 0;
-	}
 
 	if (unlikely((unsigned long)dst & 6)) {
 		while (((unsigned long)dst & 6) && len >= 2) {
@@ -107,15 +103,13 @@ csum_and_copy_to_user(const void *src, void __user *dst,
 		}
 	}
 
-	*errp = 0;
 	ret = csum_partial_copy_generic(src, (void __force *)dst,
-					len, isum, NULL, errp);
+					len, isum, NULL, &err);
 	user_access_end();
-	return ret;
+	return err ? 0 : ret;
 out:
 	user_access_end();
-	*errp = -EFAULT;
-	return isum;
+	return 0;
 }
 EXPORT_SYMBOL(csum_and_copy_to_user);
 
diff --git a/arch/x86/um/asm/checksum_32.h b/arch/x86/um/asm/checksum_32.h
index b9ac7c9eb72c..0b13c2947ad1 100644
--- a/arch/x86/um/asm/checksum_32.h
+++ b/arch/x86/um/asm/checksum_32.h
@@ -35,27 +35,4 @@ static __inline__ __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
 	return csum_fold(sum);
 }
 
-/*
- *	Copy and checksum to user
- */
-#define HAVE_CSUM_COPY_USER
-static __inline__ __wsum csum_and_copy_to_user(const void *src,
-						     void __user *dst,
-						     int len, __wsum sum, int *err_ptr)
-{
-	if (access_ok(dst, len)) {
-		if (copy_to_user(dst, src, len)) {
-			*err_ptr = -EFAULT;
-			return (__force __wsum)-1;
-		}
-
-		return csum_partial(src, len, sum);
-	}
-
-	if (len)
-		*err_ptr = -EFAULT;
-
-	return (__force __wsum)-1; /* invalid checksum */
-}
-
 #endif
diff --git a/arch/xtensa/include/asm/checksum.h b/arch/xtensa/include/asm/checksum.h
index dc09448935bf..fe78fba7bd64 100644
--- a/arch/xtensa/include/asm/checksum.h
+++ b/arch/xtensa/include/asm/checksum.h
@@ -55,14 +55,16 @@ __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 static inline
 __wsum csum_and_copy_from_user(const void __user *src, void *dst,
-				   int len, __wsum sum, int *err_ptr)
+				   int len)
 {
-	if (access_ok(src, len))
-		return csum_partial_copy_generic((__force const void *)src, dst,
-					len, sum, err_ptr, NULL);
-	if (len)
-		*err_ptr = -EFAULT;
-	return sum;
+	int err = 0;
+	__wsum sum;
+
+	if (!access_ok(src, len))
+		return 0;
+	sum = csum_partial_copy_generic((__force const void *)src, dst,
+					len, ~0U, &err, NULL);
+	return err ? 0 : sum;
 }
 
 /*
@@ -243,15 +245,15 @@ static __inline__ __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
  */
 #define HAVE_CSUM_COPY_USER
 static __inline__ __wsum csum_and_copy_to_user(const void *src,
-					       void __user *dst, int len,
-					       __wsum sum, int *err_ptr)
+					       void __user *dst, int len)
 {
-	if (access_ok(dst, len))
-		return csum_partial_copy_generic(src,dst,len,sum,NULL,err_ptr);
+	int err = 0;
+	__wsum sum = ~0U;
 
-	if (len)
-		*err_ptr = -EFAULT;
+	if (!access_ok(dst, len))
+		return 0;
 
-	return (__force __wsum)-1; /* invalid checksum */
+	sum = csum_partial_copy_generic(src,dst,len,sum,NULL,&err);
+	return err ? 0 : sum;
 }
 #endif
diff --git a/include/net/checksum.h b/include/net/checksum.h
index 1029191986e3..0d05b9e8690b 100644
--- a/include/net/checksum.h
+++ b/include/net/checksum.h
@@ -24,26 +24,23 @@
 #ifndef _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 static inline
 __wsum csum_and_copy_from_user (const void __user *src, void *dst,
-				      int len, __wsum sum, int *err_ptr)
+				      int len)
 {
 	if (copy_from_user(dst, src, len))
-		*err_ptr = -EFAULT;
-	return csum_partial(dst, len, sum);
+		return 0;
+	return csum_partial(dst, len, ~0U);
 }
 #endif
 
 #ifndef HAVE_CSUM_COPY_USER
 static __inline__ __wsum csum_and_copy_to_user
-(const void *src, void __user *dst, int len, __wsum sum, int *err_ptr)
+(const void *src, void __user *dst, int len)
 {
-	sum = csum_partial(src, len, sum);
+	__wsum sum = csum_partial(src, len, ~0U);
 
 	if (copy_to_user(dst, src, len) == 0)
 		return sum;
-	if (len)
-		*err_ptr = -EFAULT;
-
-	return (__force __wsum)-1; /* invalid checksum */
+	return 0;
 }
 #endif
 
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index d5b7e204fea6..eccb0fe5a498 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1448,15 +1448,14 @@ size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
 		return 0;
 	}
 	iterate_and_advance(i, bytes, v, ({
-		int err = 0;
 		next = csum_and_copy_from_user(v.iov_base,
 					       (to += v.iov_len) - v.iov_len,
-					       v.iov_len, ~0U, &err);
-		if (!err) {
+					       v.iov_len);
+		if (next) {
 			sum = csum_block_add(sum, next, off);
 			off += v.iov_len;
 		}
-		err ? v.iov_len : 0;
+		next ? 0 : v.iov_len;
 	}), ({
 		char *p = kmap_atomic(v.bv_page);
 		sum = csum_and_memcpy((to += v.bv_len) - v.bv_len,
@@ -1490,11 +1489,10 @@ bool csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum,
 	if (unlikely(i->count < bytes))
 		return false;
 	iterate_all_kinds(i, bytes, v, ({
-		int err = 0;
 		next = csum_and_copy_from_user(v.iov_base,
 					       (to += v.iov_len) - v.iov_len,
-					       v.iov_len, ~0U, &err);
-		if (err)
+					       v.iov_len);
+		if (!next)
 			return false;
 		sum = csum_block_add(sum, next, off);
 		off += v.iov_len;
@@ -1536,15 +1534,14 @@ size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *csump,
 		return 0;
 	}
 	iterate_and_advance(i, bytes, v, ({
-		int err = 0;
 		next = csum_and_copy_to_user((from += v.iov_len) - v.iov_len,
 					     v.iov_base,
-					     v.iov_len, ~0U, &err);
-		if (!err) {
+					     v.iov_len);
+		if (next) {
 			sum = csum_block_add(sum, next, off);
 			off += v.iov_len;
 		}
-		err ? v.iov_len : 0;
+		next ? 0 : v.iov_len;
 	}), ({
 		char *p = kmap_atomic(v.bv_page);
 		sum = csum_and_memcpy(p + v.bv_offset,
-- 
2.11.0

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v2 08/20] alpha: propagate the calling convention changes down to csum_partial_copy.c helpers
  2020-07-24  1:25   ` [PATCH v2 01/20] xtensa: fix access check in csum_and_copy_from_user Al Viro
                       ` (5 preceding siblings ...)
  2020-07-24  1:25     ` [PATCH v2 07/20] saner calling conventions for csum_and_copy_..._user() Al Viro
@ 2020-07-24  1:25     ` Al Viro
  2020-07-24  1:25       ` Al Viro
  2020-07-24  1:25     ` [PATCH v2 09/20] arm: propagate the calling convention changes down to csum_partial_copy_from_user() Al Viro
                       ` (11 subsequent siblings)
  18 siblings, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-24  1:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

While we are at it, get rid of set_fs() in csum_partial_copy_nocheck():
move the part of csum_and_copy_from_user() past the access_ok() check
into a helper function and have csum_partial_copy_nocheck() call that.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
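A rough user-space model of the split described in the commit message,
for readers who would rather not wade through the alpha helpers.
Everything here is illustrative: fake_user_mem, fake_access_ok() and the
model_*() functions are hypothetical stand-ins, and the byte-wise sum is
a toy rather than the real Internet checksum.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical "userland" window; stands in for the user address range. */
static uint8_t fake_user_mem[64];

/* Hypothetical stand-in for access_ok(): the range must lie in the window. */
static bool fake_access_ok(const void *p, size_t len)
{
	uintptr_t a = (uintptr_t)p, base = (uintptr_t)fake_user_mem;

	return a >= base && len <= sizeof(fake_user_mem) &&
	       a - base <= sizeof(fake_user_mem) - len;
}

/* Shared core, no range check (plays the role of __csum_and_copy()). */
static uint32_t model_csum_and_copy(const void *src, void *dst, size_t len)
{
	const uint8_t *s = src;
	uint32_t sum = ~0U;	/* same initial value as in the patch */
	size_t i;

	assert(len == 0 || (src != NULL && dst != NULL));
	memcpy(dst, src, len);
	for (i = 0; i < len; i++) {
		uint32_t t = sum + s[i];

		sum = t + (t < sum);	/* end-around carry */
	}
	return sum;
}

/* User-facing entry point: check the range, then call the core. */
static uint32_t model_from_user(const void *src, void *dst, size_t len)
{
	if (!fake_access_ok(src, len))
		return 0;	/* 0 means "fault" */
	return model_csum_and_copy(src, dst, len);
}

/* Kernel-to-kernel entry point: no check, no address-limit override. */
static uint32_t model_nocheck(const void *src, void *dst, size_t len)
{
	return model_csum_and_copy(src, dst, len);
}
```

Both entry points share one core, so the kernel-to-kernel path no longer
has to go through the user-facing function with its range check widened.
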
 arch/alpha/lib/csum_partial_copy.c | 157 ++++++++++++++++---------------------
 1 file changed, 69 insertions(+), 88 deletions(-)

diff --git a/arch/alpha/lib/csum_partial_copy.c b/arch/alpha/lib/csum_partial_copy.c
index 3c0e89c39ddb..dc68efbe9367 100644
--- a/arch/alpha/lib/csum_partial_copy.c
+++ b/arch/alpha/lib/csum_partial_copy.c
@@ -39,12 +39,11 @@ __asm__ __volatile__("insql %1,%2,%0":"=r" (z):"r" (x),"r" (y))
 #define insqh(x,y,z) \
 __asm__ __volatile__("insqh %1,%2,%0":"=r" (z):"r" (x),"r" (y))
 
-
-#define __get_user_u(x,ptr)				\
+#define __get_word(insn,x,ptr)				\
 ({							\
 	long __guu_err;					\
 	__asm__ __volatile__(				\
-	"1:	ldq_u %0,%2\n"				\
+	"1:	"#insn" %0,%2\n"			\
 	"2:\n"						\
 	EXC(1b,2b,%0,%1)				\
 		: "=r"(x), "=r"(__guu_err)		\
@@ -52,19 +51,6 @@ __asm__ __volatile__("insqh %1,%2,%0":"=r" (z):"r" (x),"r" (y))
 	__guu_err;					\
 })
 
-#define __put_user_u(x,ptr)				\
-({							\
-	long __puu_err;					\
-	__asm__ __volatile__(				\
-	"1:	stq_u %2,%1\n"				\
-	"2:\n"						\
-	EXC(1b,2b,$31,%0)				\
-		: "=r"(__puu_err)			\
-		: "m"(__m(addr)), "rJ"(x), "0"(0));	\
-	__puu_err;					\
-})
-
-
 static inline unsigned short from64to16(unsigned long x)
 {
 	/* Using extract instructions is a bit more efficient
@@ -95,15 +81,15 @@ static inline unsigned short from64to16(unsigned long x)
  */
 static inline unsigned long
 csum_partial_cfu_aligned(const unsigned long __user *src, unsigned long *dst,
-			 long len, unsigned long checksum,
-			 int *errp)
+			 long len)
 {
+	unsigned long checksum = ~0U;
 	unsigned long carry = 0;
-	int err = 0;
 
 	while (len >= 0) {
 		unsigned long word;
-		err |= __get_user(word, src);
+		if (__get_word(ldq, word, src))
+			return 0;
 		checksum += carry;
 		src++;
 		checksum += word;
@@ -116,7 +102,8 @@ csum_partial_cfu_aligned(const unsigned long __user *src, unsigned long *dst,
 	checksum += carry;
 	if (len) {
 		unsigned long word, tmp;
-		err |= __get_user(word, src);
+		if (__get_word(ldq, word, src))
+			return 0;
 		tmp = *dst;
 		mskql(word, len, word);
 		checksum += word;
@@ -125,7 +112,6 @@ csum_partial_cfu_aligned(const unsigned long __user *src, unsigned long *dst,
 		*dst = word | tmp;
 		checksum += carry;
 	}
-	if (err && errp) *errp = err;
 	return checksum;
 }
 
@@ -137,20 +123,21 @@ static inline unsigned long
 csum_partial_cfu_dest_aligned(const unsigned long __user *src,
 			      unsigned long *dst,
 			      unsigned long soff,
-			      long len, unsigned long checksum,
-			      int *errp)
+			      long len)
 {
 	unsigned long first;
 	unsigned long word, carry;
 	unsigned long lastsrc = 7+len+(unsigned long)src;
-	int err = 0;
+	unsigned long checksum = ~0U;
 
-	err |= __get_user_u(first,src);
+	if (__get_word(ldq_u, first,src))
+		return 0;
 	carry = 0;
 	while (len >= 0) {
 		unsigned long second;
 
-		err |= __get_user_u(second, src+1);
+		if (__get_word(ldq_u, second, src+1))
+			return 0;
 		extql(first, soff, word);
 		len -= 8;
 		src++;
@@ -168,7 +155,8 @@ csum_partial_cfu_dest_aligned(const unsigned long __user *src,
 	if (len) {
 		unsigned long tmp;
 		unsigned long second;
-		err |= __get_user_u(second, lastsrc);
+		if (__get_word(ldq_u, second, lastsrc))
+			return 0;
 		tmp = *dst;
 		extql(first, soff, word);
 		extqh(second, soff, first);
@@ -180,7 +168,6 @@ csum_partial_cfu_dest_aligned(const unsigned long __user *src,
 		*dst = word | tmp;
 		checksum += carry;
 	}
-	if (err && errp) *errp = err;
 	return checksum;
 }
 
@@ -191,18 +178,18 @@ static inline unsigned long
 csum_partial_cfu_src_aligned(const unsigned long __user *src,
 			     unsigned long *dst,
 			     unsigned long doff,
-			     long len, unsigned long checksum,
-			     unsigned long partial_dest,
-			     int *errp)
+			     long len,
+			     unsigned long partial_dest)
 {
 	unsigned long carry = 0;
 	unsigned long word;
 	unsigned long second_dest;
-	int err = 0;
+	unsigned long checksum = ~0U;
 
 	mskql(partial_dest, doff, partial_dest);
 	while (len >= 0) {
-		err |= __get_user(word, src);
+		if (__get_word(ldq, word, src))
+			return 0;
 		len -= 8;
 		insql(word, doff, second_dest);
 		checksum += carry;
@@ -216,7 +203,8 @@ csum_partial_cfu_src_aligned(const unsigned long __user *src,
 	len += 8;
 	if (len) {
 		checksum += carry;
-		err |= __get_user(word, src);
+		if (__get_word(ldq, word, src))
+			return 0;
 		mskql(word, len, word);
 		len -= 8;
 		checksum += word;
@@ -237,7 +225,6 @@ csum_partial_cfu_src_aligned(const unsigned long __user *src,
 	stq_u(partial_dest | second_dest, dst);
 out:
 	checksum += carry;
-	if (err && errp) *errp = err;
 	return checksum;
 }
 
@@ -249,23 +236,23 @@ static inline unsigned long
 csum_partial_cfu_unaligned(const unsigned long __user * src,
 			   unsigned long * dst,
 			   unsigned long soff, unsigned long doff,
-			   long len, unsigned long checksum,
-			   unsigned long partial_dest,
-			   int *errp)
+			   long len, unsigned long partial_dest)
 {
 	unsigned long carry = 0;
 	unsigned long first;
 	unsigned long lastsrc;
-	int err = 0;
+	unsigned long checksum = ~0U;
 
-	err |= __get_user_u(first, src);
+	if (__get_word(ldq_u, first, src))
+		return 0;
 	lastsrc = 7+len+(unsigned long)src;
 	mskql(partial_dest, doff, partial_dest);
 	while (len >= 0) {
 		unsigned long second, word;
 		unsigned long second_dest;
 
-		err |= __get_user_u(second, src+1);
+		if (__get_word(ldq_u, second, src+1))
+			return 0;
 		extql(first, soff, word);
 		checksum += carry;
 		len -= 8;
@@ -286,7 +273,8 @@ csum_partial_cfu_unaligned(const unsigned long __user * src,
 		unsigned long second, word;
 		unsigned long second_dest;
 
-		err |= __get_user_u(second, lastsrc);
+		if (__get_word(ldq_u, second, lastsrc))
+			return 0;
 		extql(first, soff, word);
 		extqh(second, soff, first);
 		word |= first;
@@ -307,7 +295,8 @@ csum_partial_cfu_unaligned(const unsigned long __user * src,
 		unsigned long second, word;
 		unsigned long second_dest;
 
-		err |= __get_user_u(second, lastsrc);
+		if (__get_word(ldq_u, second, lastsrc))
+			return 0;
 		extql(first, soff, word);
 		extqh(second, soff, first);
 		word |= first;
@@ -320,63 +309,55 @@ csum_partial_cfu_unaligned(const unsigned long __user * src,
 		stq_u(partial_dest | word | second_dest, dst);
 		checksum += carry;
 	}
-	if (err && errp) *errp = err;
 	return checksum;
 }
 
-__wsum
-csum_and_copy_from_user(const void __user *src, void *dst, int len)
+static __wsum __csum_and_copy(const void __user *src, void *dst, int len)
 {
-	unsigned long checksum = ~0U;
 	unsigned long soff = 7 & (unsigned long) src;
 	unsigned long doff = 7 & (unsigned long) dst;
-	int err = 0;
-
-	if (len) {
-		if (!access_ok(src, len))
-			return 0;
-		if (!doff) {
-			if (!soff)
-				checksum = csum_partial_cfu_aligned(
-					(const unsigned long __user *) src,
-					(unsigned long *) dst,
-					len-8, checksum, &err);
-			else
-				checksum = csum_partial_cfu_dest_aligned(
-					(const unsigned long __user *) src,
-					(unsigned long *) dst,
-					soff, len-8, checksum, &err);
-		} else {
-			unsigned long partial_dest;
-			ldq_u(partial_dest, dst);
-			if (!soff)
-				checksum = csum_partial_cfu_src_aligned(
-					(const unsigned long __user *) src,
-					(unsigned long *) dst,
-					doff, len-8, checksum,
-					partial_dest, &err);
-			else
-				checksum = csum_partial_cfu_unaligned(
-					(const unsigned long __user *) src,
-					(unsigned long *) dst,
-					soff, doff, len-8, checksum,
-					partial_dest, &err);
-		}
-		checksum = err ? 0 : from64to16 (checksum);
+	unsigned long checksum;
+
+	if (!doff) {
+		if (!soff)
+			checksum = csum_partial_cfu_aligned(
+				(const unsigned long __user *) src,
+				(unsigned long *) dst, len-8);
+		else
+			checksum = csum_partial_cfu_dest_aligned(
+				(const unsigned long __user *) src,
+				(unsigned long *) dst,
+				soff, len-8);
+	} else {
+		unsigned long partial_dest;
+		ldq_u(partial_dest, dst);
+		if (!soff)
+			checksum = csum_partial_cfu_src_aligned(
+				(const unsigned long __user *) src,
+				(unsigned long *) dst,
+				doff, len-8, partial_dest);
+		else
+			checksum = csum_partial_cfu_unaligned(
+				(const unsigned long __user *) src,
+				(unsigned long *) dst,
+				soff, doff, len-8, partial_dest);
 	}
-	return (__force __wsum)checksum;
+	return (__force __wsum)from64to16 (checksum);
+}
+
+__wsum
+csum_and_copy_from_user(const void __user *src, void *dst, int len)
+{
+	if (!access_ok(src, len))
+		return 0;
+	return __csum_and_copy(src, dst, len);
 }
 EXPORT_SYMBOL(csum_and_copy_from_user);
 
 __wsum
 csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	__wsum checksum;
-	mm_segment_t oldfs = get_fs();
-	set_fs(KERNEL_DS);
-	checksum = csum_and_copy_from_user((__force const void __user *)src,
+	return __csum_and_copy((__force const void __user *)src,
 						dst, len);
-	set_fs(oldfs);
-	return checksum;
 }
 EXPORT_SYMBOL(csum_partial_copy_nocheck);
-- 
2.11.0

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v2 09/20] arm: propagate the calling convention changes down to csum_partial_copy_from_user()
  2020-07-24  1:25   ` [PATCH v2 01/20] xtensa: fix access check in csum_and_copy_from_user Al Viro
                       ` (6 preceding siblings ...)
  2020-07-24  1:25     ` [PATCH v2 08/20] alpha: propagate the calling convention changes down to csum_partial_copy.c helpers Al Viro
@ 2020-07-24  1:25     ` Al Viro
  2020-07-24  1:25     ` [PATCH v2 10/20] m68k: get rid of zeroing destination on error in csum_and_copy_from_user() Al Viro
                       ` (10 subsequent siblings)
  18 siblings, 0 replies; 102+ messages in thread
From: Al Viro @ 2020-07-24  1:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

... and get rid of the "clean the destination on error" crap.
Simplifies the fault handlers and the function itself...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
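A note on why 0 can serve as the fault indicator.  The sketch below is
not the ARM assembly; it is a user-space C model, and csum_add32() and
toy_csum() are made-up names.  It assumes the sum is accumulated with
end-around carry, starting from 0xffffffff as the patch does.  Under
that assumption a nonzero accumulator can never become 0, so a 0 return
can only come from the fixup path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit addition with end-around carry (ones' complement addition). */
static uint32_t csum_add32(uint32_t a, uint32_t b)
{
	uint32_t s = a + b;

	/*
	 * If a + b wrapped, s < a and s <= 0xfffffffe, so adding the
	 * carry back cannot wrap again and the result is at least 1.
	 * If it did not wrap, the result is a + b >= a.
	 * Either way, a != 0 implies result != 0.
	 */
	return s + (s < a);
}

/* Toy checksum over little-endian 16-bit words, seeded with ~0U. */
static uint32_t toy_csum(const uint8_t *p, size_t len)
{
	uint32_t sum = ~0U;
	size_t i;

	assert(p != NULL || len == 0);
	for (i = 0; i + 1 < len; i += 2)
		sum = csum_add32(sum, (uint32_t)p[i] | ((uint32_t)p[i + 1] << 8));
	if (i < len)		/* odd trailing byte */
		sum = csum_add32(sum, p[i]);
	return sum;
}
```

The real implementations differ in word size and byte order, but the
argument only depends on the carry being folded back in.
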
 arch/arm/include/asm/checksum.h       |  7 ++-----
 arch/arm/lib/csumpartialcopy.S        |  1 -
 arch/arm/lib/csumpartialcopygeneric.S |  1 +
 arch/arm/lib/csumpartialcopyuser.S    | 26 ++++++--------------------
 4 files changed, 9 insertions(+), 26 deletions(-)

diff --git a/arch/arm/include/asm/checksum.h b/arch/arm/include/asm/checksum.h
index 1601c132b064..f0f54aef3724 100644
--- a/arch/arm/include/asm/checksum.h
+++ b/arch/arm/include/asm/checksum.h
@@ -38,20 +38,17 @@ __wsum
 csum_partial_copy_nocheck(const void *src, void *dst, int len);
 
 __wsum
-csum_partial_copy_from_user(const void __user *src, void *dst, int len, __wsum sum, int *err_ptr);
+csum_partial_copy_from_user(const void __user *src, void *dst, int len);
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 #define _HAVE_ARCH_CSUM_AND_COPY
 static inline
 __wsum csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
-	int err = 0;
-
 	if (!access_ok(src, len))
 		return 0;
 
-	sum = csum_partial_copy_from_user(src, dst, len, ~0U, &err);
-	return err ? 0 : sum;
+	return csum_partial_copy_from_user(src, dst, len);
 }
 
 /*
diff --git a/arch/arm/lib/csumpartialcopy.S b/arch/arm/lib/csumpartialcopy.S
index aab914fbc86b..1ca6aadd649c 100644
--- a/arch/arm/lib/csumpartialcopy.S
+++ b/arch/arm/lib/csumpartialcopy.S
@@ -16,7 +16,6 @@
 
 		.macro	save_regs
 		stmfd	sp!, {r1, r4 - r8, lr}
-		mov	r3, #0
 		.endm
 
 		.macro	load_regs
diff --git a/arch/arm/lib/csumpartialcopygeneric.S b/arch/arm/lib/csumpartialcopygeneric.S
index 0b706a39a677..0fd5c10e90a7 100644
--- a/arch/arm/lib/csumpartialcopygeneric.S
+++ b/arch/arm/lib/csumpartialcopygeneric.S
@@ -86,6 +86,7 @@ sum	.req	r3
 
 FN_ENTRY
 		save_regs
+		mov	sum, #-1
 
 		cmp	len, #8			@ Ensure that we have at least
 		blo	.Lless8			@ 8 bytes to copy.
diff --git a/arch/arm/lib/csumpartialcopyuser.S b/arch/arm/lib/csumpartialcopyuser.S
index 6bd3a93eaa3c..6928781e6bee 100644
--- a/arch/arm/lib/csumpartialcopyuser.S
+++ b/arch/arm/lib/csumpartialcopyuser.S
@@ -62,9 +62,9 @@
 
 /*
  * unsigned int
- * csum_partial_copy_from_user(const char *src, char *dst, int len, int sum, int *err_ptr)
- *  r0 = src, r1 = dst, r2 = len, r3 = sum, [sp] = *err_ptr
- *  Returns : r0 = checksum, [[sp, #0], #0] = 0 or -EFAULT
+ * csum_partial_copy_from_user(const char *src, char *dst, int len)
+ *  r0 = src, r1 = dst, r2 = len
+ *  Returns : r0 = checksum or 0
  */
 
 #define FN_ENTRY	ENTRY(csum_partial_copy_from_user)
@@ -73,25 +73,11 @@
 #include "csumpartialcopygeneric.S"
 
 /*
- * FIXME: minor buglet here
- * We don't return the checksum for the data present in the buffer.  To do
- * so properly, we would have to add in whatever registers were loaded before
- * the fault, which, with the current asm above is not predictable.
+ * We report fault by returning 0 csum - impossible in normal case, since
+ * we start with 0xffffffff for initial sum.
  */
 		.pushsection .text.fixup,"ax"
 		.align	4
-9001:		mov	r4, #-EFAULT
-#ifdef CONFIG_CPU_SW_DOMAIN_PAN
-		ldr	r5, [sp, #9*4]		@ *err_ptr
-#else
-		ldr	r5, [sp, #8*4]		@ *err_ptr
-#endif
-		str	r4, [r5]
-		ldmia	sp, {r1, r2}		@ retrieve dst, len
-		add	r2, r2, r1
-		mov	r0, #0			@ zero the buffer
-9002:		teq	r2, r1
-		strbne	r0, [r1], #1
-		bne	9002b
+9001:		mov	r0, #0
 		load_regs
 		.popsection
-- 
2.11.0

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v2 10/20] m68k: get rid of zeroing destination on error in csum_and_copy_from_user()
  2020-07-24  1:25   ` [PATCH v2 01/20] xtensa: fix access check in csum_and_copy_from_user Al Viro
                       ` (7 preceding siblings ...)
  2020-07-24  1:25     ` [PATCH v2 09/20] arm: propagate the calling convention changes down to csum_partial_copy_from_user() Al Viro
@ 2020-07-24  1:25     ` Al Viro
  2020-07-24  1:25     ` [PATCH v2 11/20] sh: propage the calling conventions change down to csum_partial_copy_generic() Al Viro
                       ` (9 subsequent siblings)
  18 siblings, 0 replies; 102+ messages in thread
From: Al Viro @ 2020-07-24  1:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/m68k/lib/checksum.c | 79 +++++++++---------------------------------------
 1 file changed, 15 insertions(+), 64 deletions(-)

diff --git a/arch/m68k/lib/checksum.c b/arch/m68k/lib/checksum.c
index 3aeca261f622..7e6afeae6217 100644
--- a/arch/m68k/lib/checksum.c
+++ b/arch/m68k/lib/checksum.c
@@ -236,82 +236,33 @@ csum_and_copy_from_user(const void __user *src, void *dst, int len)
 		"clrl %5\n\t"
 		"addxl %5,%0\n\t"	/* add X bit */
 	     "7:\t"
-		"clrl %5\n"		/* no error - clear return value */
-	     "8:\n"
 		".section .fixup,\"ax\"\n"
 		".even\n"
-		/* If any exception occurs zero out the rest.
-		   Similarities with the code above are intentional :-) */
+		/* If any exception occurs, return 0 */
 	     "90:\t"
-		"clrw %3@+\n\t"
-		"movel %1,%4\n\t"
-		"lsrl #5,%1\n\t"
-		"jeq 1f\n\t"
-		"subql #1,%1\n"
-	     "91:\t"
-		"clrl %3@+\n"
-	     "92:\t"
-		"clrl %3@+\n"
-	     "93:\t"
-		"clrl %3@+\n"
-	     "94:\t"
-		"clrl %3@+\n"
-	     "95:\t"
-		"clrl %3@+\n"
-	     "96:\t"
-		"clrl %3@+\n"
-	     "97:\t"
-		"clrl %3@+\n"
-	     "98:\t"
-		"clrl %3@+\n\t"
-		"dbra %1,91b\n\t"
-		"clrw %1\n\t"
-		"subql #1,%1\n\t"
-		"jcc 91b\n"
-	     "1:\t"
-		"movel %4,%1\n\t"
-		"andw #0x1c,%4\n\t"
-		"jeq 1f\n\t"
-		"lsrw #2,%4\n\t"
-		"subqw #1,%4\n"
-	     "99:\t"
-		"clrl %3@+\n\t"
-		"dbra %4,99b\n\t"
-	     "1:\t"
-		"andw #3,%1\n\t"
-		"jeq 9f\n"
-	     "100:\t"
-		"clrw %3@+\n\t"
-		"tstw %1\n\t"
-		"jeq 9f\n"
-	     "101:\t"
-		"clrb %3@+\n"
-	     "9:\t"
-#define STR(X) STR1(X)
-#define STR1(X) #X
-		"moveq #-" STR(EFAULT) ",%5\n\t"
-		"jra 8b\n"
+		"clrl %0\n"
+		"jra 7b\n"
 		".previous\n"
 		".section __ex_table,\"a\"\n"
 		".long 10b,90b\n"
-		".long 11b,91b\n"
-		".long 12b,92b\n"
-		".long 13b,93b\n"
-		".long 14b,94b\n"
-		".long 15b,95b\n"
-		".long 16b,96b\n"
-		".long 17b,97b\n"
-		".long 18b,98b\n"
-		".long 19b,99b\n"
-		".long 20b,100b\n"
-		".long 21b,101b\n"
+		".long 11b,90b\n"
+		".long 12b,90b\n"
+		".long 13b,90b\n"
+		".long 14b,90b\n"
+		".long 15b,90b\n"
+		".long 16b,90b\n"
+		".long 17b,90b\n"
+		".long 18b,90b\n"
+		".long 19b,90b\n"
+		".long 20b,90b\n"
+		".long 21b,90b\n"
 		".previous"
 		: "=d" (sum), "=d" (len), "=a" (src), "=a" (dst),
 		  "=&d" (tmp1), "=d" (tmp2)
 		: "0" (sum), "1" (len), "2" (src), "3" (dst)
 	    );
 
-	return tmp2 ? 0 : sum;
+	return sum;
 }
 
 EXPORT_SYMBOL(csum_and_copy_from_user);
-- 
2.11.0

^ permalink raw reply	[flat|nested] 102+ messages in thread

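Editorial sketch, not part of the patch: the m68k change above points every
__ex_table entry at the single fixup label 90, which clears the sum and jumps
to the common exit. The small C model below shows only that many-to-one
mapping. The struct, the table contents and ex_lookup() are hypothetical
stand-ins, and the asm label numbers are used in place of real instruction
addresses; the kernel's actual table format and lookup are not reproduced
here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one exception table entry. */
struct ex_entry {
	unsigned long insn;	/* "address" of the access that may fault */
	unsigned long fixup;	/* where execution resumes after a fault */
};

/* Asm labels 10..21 from the patch, all sharing fixup label 90. */
static const struct ex_entry ex_table[] = {
	{ 10, 90 }, { 11, 90 }, { 12, 90 }, { 13, 90 },
	{ 14, 90 }, { 15, 90 }, { 16, 90 }, { 17, 90 },
	{ 18, 90 }, { 19, 90 }, { 20, 90 }, { 21, 90 },
};

/* Linear search, for illustration only. */
static const struct ex_entry *ex_lookup(unsigned long addr)
{
	size_t i;

	for (i = 0; i < sizeof(ex_table) / sizeof(ex_table[0]); i++)
		if (ex_table[i].insn == addr)
			return &ex_table[i];
	return NULL;
}

/* Every covered access resolves to the same fixup; others have none. */
void ex_table_selfcheck(void)
{
	unsigned long label;

	for (label = 10; label <= 21; label++) {
		const struct ex_entry *e = ex_lookup(label);

		assert(e != NULL && e->fixup == 90);
	}
	assert(ex_lookup(7) == NULL);
}
```

With one shared fixup there is nothing left that depends on which access
faulted, which is why the per-instruction zeroing code (labels 91..101 in the
old version) could be dropped.
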
* [PATCH v2 11/20] sh: propagate the calling conventions change down to csum_partial_copy_generic()
  2020-07-24  1:25   ` [PATCH v2 01/20] xtensa: fix access check in csum_and_copy_from_user Al Viro
                       ` (8 preceding siblings ...)
  2020-07-24  1:25     ` [PATCH v2 10/20] m68k: get rid of zeroing destination on error in csum_and_copy_from_user() Al Viro
@ 2020-07-24  1:25     ` Al Viro
  2020-07-24  1:25       ` Al Viro
  2020-07-24  1:25     ` [PATCH v2 12/20] i386: propagate " Al Viro
                       ` (8 subsequent siblings)
  18 siblings, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-24  1:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

... and get rid of zeroing destination on error there.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/sh/include/asm/checksum_32.h |  20 ++-----
 arch/sh/lib/checksum.S            | 119 +++++++++++---------------------------
 2 files changed, 39 insertions(+), 100 deletions(-)

diff --git a/arch/sh/include/asm/checksum_32.h b/arch/sh/include/asm/checksum_32.h
index a08e8eb924ed..1a391e3a7659 100644
--- a/arch/sh/include/asm/checksum_32.h
+++ b/arch/sh/include/asm/checksum_32.h
@@ -30,9 +30,7 @@ asmlinkage __wsum csum_partial(const void *buff, int len, __wsum sum);
  * better 64-bit) boundary
  */
 
-asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst,
-					    int len, __wsum sum,
-					    int *src_err_ptr, int *dst_err_ptr);
+asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst, int len);
 
 #define _HAVE_ARCH_CSUM_AND_COPY
 /*
@@ -45,21 +43,16 @@ asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst,
 static inline
 __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	return csum_partial_copy_generic(src, dst, len, 0, NULL, NULL);
+	return csum_partial_copy_generic(src, dst, len);
 }
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 static inline
 __wsum csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
-	int err = 0;
-	__wsum sum = ~0U;
-
 	if (!access_ok(src, len))
 		return 0;
-	sum = csum_partial_copy_generic((__force const void *)src, dst,
-					len, sum, &err, NULL);
-	return err ? 0 : sum;
+	return csum_partial_copy_generic((__force const void *)src, dst, len);
 }
 
 /*
@@ -202,13 +195,8 @@ static inline __wsum csum_and_copy_to_user(const void *src,
 					   void __user *dst,
 					   int len)
 {
-	int err = 0;
-	__wsum sum = ~0U;
-
 	if (!access_ok(dst, len))
 		return 0;
-	sum = csum_partial_copy_generic((__force const void *)src,
-						dst, len, sum, NULL, &err);
-	return err ? 0 : sum;
+	return csum_partial_copy_generic((__force const void *)src, dst, len);
 }
 #endif /* __ASM_SH_CHECKSUM_H */
diff --git a/arch/sh/lib/checksum.S b/arch/sh/lib/checksum.S
index 97b5c2d9fec4..3e07074e0098 100644
--- a/arch/sh/lib/checksum.S
+++ b/arch/sh/lib/checksum.S
@@ -173,47 +173,27 @@ ENTRY(csum_partial)
 	 mov	r6, r0
 
 /*
-unsigned int csum_partial_copy_generic (const char *src, char *dst, int len, 
-					int sum, int *src_err_ptr, int *dst_err_ptr)
+unsigned int csum_partial_copy_generic (const char *src, char *dst, int len)
  */ 
 
 /*
- * Copy from ds while checksumming, otherwise like csum_partial
- *
- * The macros SRC and DST specify the type of access for the instruction.
- * thus we can call a custom exception handler for all access types.
- *
- * FIXME: could someone double-check whether I haven't mixed up some SRC and
- *	  DST definitions? It's damn hard to trigger all cases.  I hope I got
- *	  them all but there's no guarantee.
+ * Copy from ds while checksumming, otherwise like csum_partial with initial
+ * sum being ~0U
  */
 
-#define SRC(...)			\
+#define EXC(...)			\
 	9999: __VA_ARGS__ ;		\
 	.section __ex_table, "a";	\
 	.long 9999b, 6001f	;	\
 	.previous
 
-#define DST(...)			\
-	9999: __VA_ARGS__ ;		\
-	.section __ex_table, "a";	\
-	.long 9999b, 6002f	;	\
-	.previous
-
 !
 ! r4:	const char *SRC
 ! r5:	char *DST
 ! r6:	int LEN
-! r7:	int SUM
-!
-! on stack:
-! int *SRC_ERR_PTR
-! int *DST_ERR_PTR
 !
 ENTRY(csum_partial_copy_generic)
-	mov.l	r5,@-r15
-	mov.l	r6,@-r15
-
+	mov	#-1,r7
 	mov	#3,r0		! Check src and dest are equally aligned
 	mov	r4,r1
 	and	r0,r1
@@ -243,11 +223,11 @@ ENTRY(csum_partial_copy_generic)
 	clrt
 	.align	2
 5:
-SRC(	mov.b	@r4+,r1 	)
-SRC(	mov.b	@r4+,r0		)
+EXC(	mov.b	@r4+,r1 	)
+EXC(	mov.b	@r4+,r0		)
 	extu.b	r1,r1
-DST(	mov.b	r1,@r5		)
-DST(	mov.b	r0,@(1,r5)	)
+EXC(	mov.b	r1,@r5		)
+EXC(	mov.b	r0,@(1,r5)	)
 	extu.b	r0,r0
 	add	#2,r5
 
@@ -276,8 +256,8 @@ DST(	mov.b	r0,@(1,r5)	)
 	! Handle first two bytes as a special case
 	.align	2
 1:	
-SRC(	mov.w	@r4+,r0		)
-DST(	mov.w	r0,@r5		)
+EXC(	mov.w	@r4+,r0		)
+EXC(	mov.w	r0,@r5		)
 	add	#2,r5
 	extu.w	r0,r0
 	addc	r0,r7
@@ -292,32 +272,32 @@ DST(	mov.w	r0,@r5		)
 	 clrt
 	.align	2
 1:	
-SRC(	mov.l	@r4+,r0		)
-SRC(	mov.l	@r4+,r1		)
+EXC(	mov.l	@r4+,r0		)
+EXC(	mov.l	@r4+,r1		)
 	addc	r0,r7
-DST(	mov.l	r0,@r5		)
-DST(	mov.l	r1,@(4,r5)	)
+EXC(	mov.l	r0,@r5		)
+EXC(	mov.l	r1,@(4,r5)	)
 	addc	r1,r7
 
-SRC(	mov.l	@r4+,r0		)
-SRC(	mov.l	@r4+,r1		)
+EXC(	mov.l	@r4+,r0		)
+EXC(	mov.l	@r4+,r1		)
 	addc	r0,r7
-DST(	mov.l	r0,@(8,r5)	)
-DST(	mov.l	r1,@(12,r5)	)
+EXC(	mov.l	r0,@(8,r5)	)
+EXC(	mov.l	r1,@(12,r5)	)
 	addc	r1,r7
 
-SRC(	mov.l	@r4+,r0 	)
-SRC(	mov.l	@r4+,r1		)
+EXC(	mov.l	@r4+,r0 	)
+EXC(	mov.l	@r4+,r1		)
 	addc	r0,r7
-DST(	mov.l	r0,@(16,r5)	)
-DST(	mov.l	r1,@(20,r5)	)
+EXC(	mov.l	r0,@(16,r5)	)
+EXC(	mov.l	r1,@(20,r5)	)
 	addc	r1,r7
 
-SRC(	mov.l	@r4+,r0		)
-SRC(	mov.l	@r4+,r1		)
+EXC(	mov.l	@r4+,r0		)
+EXC(	mov.l	@r4+,r1		)
 	addc	r0,r7
-DST(	mov.l	r0,@(24,r5)	)
-DST(	mov.l	r1,@(28,r5)	)
+EXC(	mov.l	r0,@(24,r5)	)
+EXC(	mov.l	r1,@(28,r5)	)
 	addc	r1,r7
 	add	#32,r5
 	movt	r0
@@ -335,9 +315,9 @@ DST(	mov.l	r1,@(28,r5)	)
 	 clrt
 	shlr2	r6
 3:	
-SRC(	mov.l	@r4+,r0	)
+EXC(	mov.l	@r4+,r0	)
 	addc	r0,r7
-DST(	mov.l	r0,@r5	)
+EXC(	mov.l	r0,@r5	)
 	add	#4,r5
 	movt	r0
 	dt	r6
@@ -353,8 +333,8 @@ DST(	mov.l	r0,@r5	)
 	mov	#2,r1
 	cmp/hs	r1,r6
 	bf	5f
-SRC(	mov.w	@r4+,r0	)
-DST(	mov.w	r0,@r5	)
+EXC(	mov.w	@r4+,r0	)
+EXC(	mov.w	r0,@r5	)
 	extu.w	r0,r0
 	add	#2,r5
 	cmp/eq	r1,r6
@@ -363,8 +343,8 @@ DST(	mov.w	r0,@r5	)
 	shll16	r0
 	addc	r0,r7
 5:	
-SRC(	mov.b	@r4+,r0	)
-DST(	mov.b	r0,@r5	)
+EXC(	mov.b	@r4+,r0	)
+EXC(	mov.b	r0,@r5	)
 	extu.b	r0,r0
 #ifndef	__LITTLE_ENDIAN__
 	shll8	r0
@@ -373,42 +353,13 @@ DST(	mov.b	r0,@r5	)
 	mov	#0,r0
 	addc	r0,r7
 7:
-5000:
 
 # Exception handler:
 .section .fixup, "ax"							
 
 6001:
-	mov.l	@(8,r15),r0			! src_err_ptr
-	mov	#-EFAULT,r1
-	mov.l	r1,@r0
-
-	! zero the complete destination - computing the rest
-	! is too much work 
-	mov.l	@(4,r15),r5		! dst
-	mov.l	@r15,r6			! len
-	mov	#0,r7
-1:	mov.b	r7,@r5
-	dt	r6
-	bf/s	1b
-	 add	#1,r5
-	mov.l	8000f,r0
-	jmp	@r0
-	 nop
-	.align	2
-8000:	.long	5000b
-
-6002:
-	mov.l	@(12,r15),r0			! dst_err_ptr
-	mov	#-EFAULT,r1
-	mov.l	r1,@r0
-	mov.l	8001f,r0
-	jmp	@r0
-	 nop
-	.align	2
-8001:	.long	5000b
-
+	rts
+	 mov	#0,r0
 .previous
-	add	#8,r15
 	rts
 	 mov	r7,r0
-- 
2.11.0

^ permalink raw reply	[flat|nested] 102+ messages in thread

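Editorial sketch, not part of the patch: a userspace model of the convention
the sh change above implements, as far as it can be read from the diff. The
sum starts at ~0U, and a fault is reported by returning 0, with no error
pointer and no zeroing of the destination. All names here (add32_eac,
model_csum_and_copy, NO_FAULT, the fault_at parameter) are hypothetical, the
fault is simulated rather than real, and the byte order used for summing is an
arbitrary choice for the model; the per-arch assembler differs in
representation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_FAULT ((size_t)-1)	/* hypothetical marker: no simulated fault */

/* 32-bit addition with end-around carry. */
static uint32_t add32_eac(uint32_t a, uint32_t b)
{
	uint64_t s = (uint64_t)a + b;

	return (uint32_t)s + (uint32_t)(s >> 32);
}

/*
 * Copy len bytes and return the running sum, starting from ~0U.
 * A simulated access failure at byte offset fault_at returns 0.
 * Bytes are summed as big-endian 16-bit words (model choice only).
 */
static uint32_t model_csum_and_copy(const uint8_t *src, uint8_t *dst,
				    size_t len, size_t fault_at)
{
	uint32_t sum = ~0U;
	size_t i;

	for (i = 0; i < len; i++) {
		uint32_t v;

		if (i == fault_at)
			return 0;	/* fault: dst is left as is */
		dst[i] = src[i];
		v = (i & 1) ? src[i] : (uint32_t)src[i] << 8;
		sum = add32_eac(sum, v);
	}
	return sum;
}

void model_selfcheck(void)
{
	static const uint8_t zeros[8] = { 0 };
	uint8_t buf[8];

	/* All-zero payload: the sum stays at ~0U, so success is nonzero. */
	assert(model_csum_and_copy(zeros, buf, sizeof(buf), NO_FAULT) ==
	       0xffffffffU);
	/* Simulated fault halfway through: the only signal is a 0 return. */
	assert(model_csum_and_copy(zeros, buf, sizeof(buf), 4) == 0);
}
```

In this model a successful call cannot return 0: add32_eac() yields 0 only
when both operands are 0, and the running sum starts out nonzero. That is what
lets the wrappers in checksum_32.h drop "return err ? 0 : sum" and pass the
return value straight through.
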
* [PATCH v2 12/20] i386: propagate the calling conventions change down to csum_partial_copy_generic()
  2020-07-24  1:25   ` [PATCH v2 01/20] xtensa: fix access check in csum_and_copy_from_user Al Viro
                       ` (9 preceding siblings ...)
  2020-07-24  1:25     ` [PATCH v2 11/20] sh: propagate the calling conventions change down to csum_partial_copy_generic() Al Viro
@ 2020-07-24  1:25     ` Al Viro
  2020-07-24  1:25     ` [PATCH v2 13/20] sparc32: propagate the calling conventions change down to __csum_partial_copy_sparc_generic() Al Viro
                       ` (7 subsequent siblings)
  18 siblings, 0 replies; 102+ messages in thread
From: Al Viro @ 2020-07-24  1:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

... and don't bother zeroing destination on error

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/x86/include/asm/checksum_32.h |  18 ++----
 arch/x86/lib/checksum_32.S         | 117 +++++++++++++------------------------
 2 files changed, 47 insertions(+), 88 deletions(-)

diff --git a/arch/x86/include/asm/checksum_32.h b/arch/x86/include/asm/checksum_32.h
index 5948cde9e4ad..17da95387997 100644
--- a/arch/x86/include/asm/checksum_32.h
+++ b/arch/x86/include/asm/checksum_32.h
@@ -27,9 +27,7 @@ asmlinkage __wsum csum_partial(const void *buff, int len, __wsum sum);
  * better 64-bit) boundary
  */
 
-asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst,
-					    int len, __wsum sum,
-					    int *src_err_ptr, int *dst_err_ptr);
+asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst, int len);
 
 /*
  *	Note: when you get a NULL pointer exception here this means someone
@@ -40,23 +38,21 @@ asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst,
  */
 static inline __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	return csum_partial_copy_generic(src, dst, len, 0, NULL, NULL);
+	return csum_partial_copy_generic(src, dst, len);
 }
 
 static inline __wsum csum_and_copy_from_user(const void __user *src,
 					     void *dst, int len)
 {
 	__wsum ret;
-	int err = 0;
 
 	might_sleep();
 	if (!user_access_begin(src, len))
 		return 0;
-	ret = csum_partial_copy_generic((__force void *)src, dst,
-					len, ~0U, &err, NULL);
+	ret = csum_partial_copy_generic((__force void *)src, dst, len);
 	user_access_end();
 
-	return err ? 0 : ret;
+	return ret;
 }
 
 /*
@@ -177,16 +173,14 @@ static inline __wsum csum_and_copy_to_user(const void *src,
 					   int len)
 {
 	__wsum ret;
-	int err = 0;
 
 	might_sleep();
 	if (!user_access_begin(dst, len))
 		return 0;
 
-	ret = csum_partial_copy_generic(src, (__force void *)dst,
-					len, ~0U, NULL, &err);
+	ret = csum_partial_copy_generic(src, (__force void *)dst, len);
 	user_access_end();
-	return err ? 0 : ret;
+	return ret;
 }
 
 #endif /* _ASM_X86_CHECKSUM_32_H */
diff --git a/arch/x86/lib/checksum_32.S b/arch/x86/lib/checksum_32.S
index d1d768912368..4304320e51f4 100644
--- a/arch/x86/lib/checksum_32.S
+++ b/arch/x86/lib/checksum_32.S
@@ -253,28 +253,17 @@ EXPORT_SYMBOL(csum_partial)
 
 /*
 unsigned int csum_partial_copy_generic (const char *src, char *dst,
-				  int len, int sum, int *src_err_ptr, int *dst_err_ptr)
+				  int len)
  */ 
 
 /*
  * Copy from ds while checksumming, otherwise like csum_partial
- *
- * The macros SRC and DST specify the type of access for the instruction.
- * thus we can call a custom exception handler for all access types.
- *
- * FIXME: could someone double-check whether I haven't mixed up some SRC and
- *	  DST definitions? It's damn hard to trigger all cases.  I hope I got
- *	  them all but there's no guarantee.
  */
 
-#define SRC(y...)			\
+#define EXC(y...)			\
 	9999: y;			\
 	_ASM_EXTABLE_UA(9999b, 6001f)
 
-#define DST(y...)			\
-	9999: y;			\
-	_ASM_EXTABLE_UA(9999b, 6002f)
-
 #ifndef CONFIG_X86_USE_PPRO_CHECKSUM
 
 #define ARGBASE 16		
@@ -285,20 +274,20 @@ SYM_FUNC_START(csum_partial_copy_generic)
 	pushl %edi
 	pushl %esi
 	pushl %ebx
-	movl ARGBASE+16(%esp),%eax	# sum
 	movl ARGBASE+12(%esp),%ecx	# len
 	movl ARGBASE+4(%esp),%esi	# src
 	movl ARGBASE+8(%esp),%edi	# dst
 
+	movl $-1, %eax			# sum
 	testl $2, %edi			# Check alignment. 
 	jz 2f				# Jump if alignment is ok.
 	subl $2, %ecx			# Alignment uses up two bytes.
 	jae 1f				# Jump if we had at least two bytes.
 	addl $2, %ecx			# ecx was < 2.  Deal with it.
 	jmp 4f
-SRC(1:	movw (%esi), %bx	)
+EXC(1:	movw (%esi), %bx	)
 	addl $2, %esi
-DST(	movw %bx, (%edi)	)
+EXC(	movw %bx, (%edi)	)
 	addl $2, %edi
 	addw %bx, %ax	
 	adcl $0, %eax
@@ -306,34 +295,34 @@ DST(	movw %bx, (%edi)	)
 	movl %ecx, FP(%esp)
 	shrl $5, %ecx
 	jz 2f
-	testl %esi, %esi
-SRC(1:	movl (%esi), %ebx	)
-SRC(	movl 4(%esi), %edx	)
+	testl %esi, %esi		# what's wrong with clc?
+EXC(1:	movl (%esi), %ebx	)
+EXC(	movl 4(%esi), %edx	)
 	adcl %ebx, %eax
-DST(	movl %ebx, (%edi)	)
+EXC(	movl %ebx, (%edi)	)
 	adcl %edx, %eax
-DST(	movl %edx, 4(%edi)	)
+EXC(	movl %edx, 4(%edi)	)
 
-SRC(	movl 8(%esi), %ebx	)
-SRC(	movl 12(%esi), %edx	)
+EXC(	movl 8(%esi), %ebx	)
+EXC(	movl 12(%esi), %edx	)
 	adcl %ebx, %eax
-DST(	movl %ebx, 8(%edi)	)
+EXC(	movl %ebx, 8(%edi)	)
 	adcl %edx, %eax
-DST(	movl %edx, 12(%edi)	)
+EXC(	movl %edx, 12(%edi)	)
 
-SRC(	movl 16(%esi), %ebx 	)
-SRC(	movl 20(%esi), %edx	)
+EXC(	movl 16(%esi), %ebx 	)
+EXC(	movl 20(%esi), %edx	)
 	adcl %ebx, %eax
-DST(	movl %ebx, 16(%edi)	)
+EXC(	movl %ebx, 16(%edi)	)
 	adcl %edx, %eax
-DST(	movl %edx, 20(%edi)	)
+EXC(	movl %edx, 20(%edi)	)
 
-SRC(	movl 24(%esi), %ebx	)
-SRC(	movl 28(%esi), %edx	)
+EXC(	movl 24(%esi), %ebx	)
+EXC(	movl 28(%esi), %edx	)
 	adcl %ebx, %eax
-DST(	movl %ebx, 24(%edi)	)
+EXC(	movl %ebx, 24(%edi)	)
 	adcl %edx, %eax
-DST(	movl %edx, 28(%edi)	)
+EXC(	movl %edx, 28(%edi)	)
 
 	lea 32(%esi), %esi
 	lea 32(%edi), %edi
@@ -345,9 +334,9 @@ DST(	movl %edx, 28(%edi)	)
 	andl $0x1c, %edx
 	je 4f
 	shrl $2, %edx			# This clears CF
-SRC(3:	movl (%esi), %ebx	)
+EXC(3:	movl (%esi), %ebx	)
 	adcl %ebx, %eax
-DST(	movl %ebx, (%edi)	)
+EXC(	movl %ebx, (%edi)	)
 	lea 4(%esi), %esi
 	lea 4(%edi), %edi
 	dec %edx
@@ -357,39 +346,24 @@ DST(	movl %ebx, (%edi)	)
 	jz 7f
 	cmpl $2, %ecx
 	jb 5f
-SRC(	movw (%esi), %cx	)
+EXC(	movw (%esi), %cx	)
 	leal 2(%esi), %esi
-DST(	movw %cx, (%edi)	)
+EXC(	movw %cx, (%edi)	)
 	leal 2(%edi), %edi
 	je 6f
 	shll $16,%ecx
-SRC(5:	movb (%esi), %cl	)
-DST(	movb %cl, (%edi)	)
+EXC(5:	movb (%esi), %cl	)
+EXC(	movb %cl, (%edi)	)
 6:	addl %ecx, %eax
 	adcl $0, %eax
 7:
-5000:
 
 # Exception handler:
 .section .fixup, "ax"							
 
 6001:
-	movl ARGBASE+20(%esp), %ebx	# src_err_ptr
-	movl $-EFAULT, (%ebx)
-
-	# zero the complete destination - computing the rest
-	# is too much work 
-	movl ARGBASE+8(%esp), %edi	# dst
-	movl ARGBASE+12(%esp), %ecx	# len
-	xorl %eax,%eax
-	rep ; stosb
-
-	jmp 5000b
-
-6002:
-	movl ARGBASE+24(%esp), %ebx	# dst_err_ptr
-	movl $-EFAULT,(%ebx)
-	jmp 5000b
+	xorl %eax, %eax
+	jmp 7b
 
 .previous
 
@@ -405,14 +379,14 @@ SYM_FUNC_END(csum_partial_copy_generic)
 /* Version for PentiumII/PPro */
 
 #define ROUND1(x) \
-	SRC(movl x(%esi), %ebx	)	;	\
+	EXC(movl x(%esi), %ebx	)	;	\
 	addl %ebx, %eax			;	\
-	DST(movl %ebx, x(%edi)	)	; 
+	EXC(movl %ebx, x(%edi)	)	;
 
 #define ROUND(x) \
-	SRC(movl x(%esi), %ebx	)	;	\
+	EXC(movl x(%esi), %ebx	)	;	\
 	adcl %ebx, %eax			;	\
-	DST(movl %ebx, x(%edi)	)	;
+	EXC(movl %ebx, x(%edi)	)	;
 
 #define ARGBASE 12
 		
@@ -423,7 +397,7 @@ SYM_FUNC_START(csum_partial_copy_generic)
 	movl ARGBASE+4(%esp),%esi	#src
 	movl ARGBASE+8(%esp),%edi	#dst	
 	movl ARGBASE+12(%esp),%ecx	#len
-	movl ARGBASE+16(%esp),%eax	#sum
+	movl $-1, %eax			#sum
 #	movl %ecx, %edx  
 	movl %ecx, %ebx  
 	movl %esi, %edx
@@ -439,7 +413,7 @@ SYM_FUNC_START(csum_partial_copy_generic)
 	JMP_NOSPEC ebx
 1:	addl $64,%esi
 	addl $64,%edi 
-	SRC(movb -32(%edx),%bl)	; SRC(movb (%edx),%bl)
+	EXC(movb -32(%edx),%bl)	; EXC(movb (%edx),%bl)
 	ROUND1(-64) ROUND(-60) ROUND(-56) ROUND(-52)	
 	ROUND (-48) ROUND(-44) ROUND(-40) ROUND(-36)	
 	ROUND (-32) ROUND(-28) ROUND(-24) ROUND(-20)	
@@ -453,29 +427,20 @@ SYM_FUNC_START(csum_partial_copy_generic)
 	jz 7f
 	cmpl $2, %edx
 	jb 5f
-SRC(	movw (%esi), %dx         )
+EXC(	movw (%esi), %dx         )
 	leal 2(%esi), %esi
-DST(	movw %dx, (%edi)         )
+EXC(	movw %dx, (%edi)         )
 	leal 2(%edi), %edi
 	je 6f
 	shll $16,%edx
 5:
-SRC(	movb (%esi), %dl         )
-DST(	movb %dl, (%edi)         )
+EXC(	movb (%esi), %dl         )
+EXC(	movb %dl, (%edi)         )
 6:	addl %edx, %eax
 	adcl $0, %eax
 7:
 .section .fixup, "ax"
-6001:	movl	ARGBASE+20(%esp), %ebx	# src_err_ptr	
-	movl $-EFAULT, (%ebx)
-	# zero the complete destination (computing the rest is too much work)
-	movl ARGBASE+8(%esp),%edi	# dst
-	movl ARGBASE+12(%esp),%ecx	# len
-	xorl %eax,%eax
-	rep; stosb
-	jmp 7b
-6002:	movl ARGBASE+24(%esp), %ebx	# dst_err_ptr
-	movl $-EFAULT, (%ebx)
+6001:	xorl %eax, %eax
 	jmp  7b			
 .previous				
 
-- 
2.11.0

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v2 13/20] sparc32: propagate the calling conventions change down to __csum_partial_copy_sparc_generic()
  2020-07-24  1:25   ` [PATCH v2 01/20] xtensa: fix access check in csum_and_copy_from_user Al Viro
                       ` (10 preceding siblings ...)
  2020-07-24  1:25     ` [PATCH v2 12/20] i386: propagate " Al Viro
@ 2020-07-24  1:25     ` Al Viro
  2020-07-24  1:25       ` Al Viro
  2020-07-24  1:25     ` [PATCH v2 14/20] mips: csum_and_copy_{to,from}_user() are never called under KERNEL_DS Al Viro
                       ` (6 subsequent siblings)
  18 siblings, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-24  1:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

... and get rid of zeroing the target, etc. on fault.
All exception handlers merge into one; moreover, since we are not
calling lookup_fault() anymore, we don't need the magic with passing
arguments for it from the page fault handler.

Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/sparc/include/asm/checksum_32.h |  49 +--------
 arch/sparc/lib/checksum_32.S         | 202 +++++++----------------------------
 arch/sparc/mm/fault_32.c             |   6 +-
 3 files changed, 44 insertions(+), 213 deletions(-)

diff --git a/arch/sparc/include/asm/checksum_32.h b/arch/sparc/include/asm/checksum_32.h
index b5873b7b7bf0..d55e480172a6 100644
--- a/arch/sparc/include/asm/checksum_32.h
+++ b/arch/sparc/include/asm/checksum_32.h
@@ -50,9 +50,9 @@ csum_partial_copy_nocheck(const void *src, void *dst, int len)
 
 	__asm__ __volatile__ (
 		"call __csum_partial_copy_sparc_generic\n\t"
-		" mov %6, %%g7\n"
+		" mov -1, %%g7\n"
 	: "=&r" (ret), "=&r" (d), "=&r" (l)
-	: "0" (ret), "1" (d), "2" (l), "r" (0)
+	: "0" (ret), "1" (d), "2" (l)
 	: "o2", "o3", "o4", "o5", "o7",
 	  "g2", "g3", "g4", "g5", "g7",
 	  "memory", "cc");
@@ -61,29 +61,10 @@ csum_partial_copy_nocheck(const void *src, void *dst, int len)
 
 static inline __wsum
 csum_and_copy_from_user(const void __user *src, void *dst, int len)
-  {
-	register unsigned long ret asm("o0") = (unsigned long)src;
-	register char *d asm("o1") = dst;
-	register int l asm("g1") = len;
-	register __wsum s asm("g7") = ~0U;
-	int err = 0;
-
+{
 	if (unlikely(!access_ok(src, len)))
 		return 0;
-
-	__asm__ __volatile__ (
-	".section __ex_table,#alloc\n\t"
-	".align 4\n\t"
-	".word 1f,2\n\t"
-	".previous\n"
-	"1:\n\t"
-	"call __csum_partial_copy_sparc_generic\n\t"
-	" st %8, [%%sp + 64]\n"
-	: "=&r" (ret), "=&r" (d), "=&r" (l), "=&r" (s)
-	: "0" (ret), "1" (d), "2" (l), "3" (s), "r" (&err)
-	: "o2", "o3", "o4", "o5", "o7", "g2", "g3", "g4", "g5",
-	  "cc", "memory");
-	return err ? 0 : (__force __wsum)ret;
+	return csum_partial_copy_nocheck((__force void *)src, dst, len);
 }
 
 #define HAVE_CSUM_COPY_USER
@@ -91,29 +72,9 @@ csum_and_copy_from_user(const void __user *src, void *dst, int len)
 static inline __wsum
 csum_and_copy_to_user(const void *src, void __user *dst, int len)
 {
-	register unsigned long ret asm("o0") = (unsigned long)src;
-	register char __user *d asm("o1") = dst;
-	register int l asm("g1") = len;
-	register __wsum s asm("g7") = ~0U;
-	int err = 0;
-
 	if (!access_ok(dst, len))
 		return 0;
-
-	__asm__ __volatile__ (
-	".section __ex_table,#alloc\n\t"
-	".align 4\n\t"
-	".word 1f,1\n\t"
-	".previous\n"
-	"1:\n\t"
-	"call __csum_partial_copy_sparc_generic\n\t"
-	" st %8, [%%sp + 64]\n"
-	: "=&r" (ret), "=&r" (d), "=&r" (l), "=&r" (s)
-	: "0" (ret), "1" (d), "2" (l), "3" (s), "r" (&err)
-	: "o2", "o3", "o4", "o5", "o7",
-	  "g2", "g3", "g4", "g5",
-	  "cc", "memory");
-	return err ? 0 : (__force __wsum)ret;
+	return csum_partial_copy_nocheck(src, (__force void *)dst, len);
 }
 
 /* ihl is always 5 or greater, almost always is 5, and iph is word aligned
diff --git a/arch/sparc/lib/checksum_32.S b/arch/sparc/lib/checksum_32.S
index 6a5469c97246..7488d130faf7 100644
--- a/arch/sparc/lib/checksum_32.S
+++ b/arch/sparc/lib/checksum_32.S
@@ -144,44 +144,21 @@ cpte:	bne	csum_partial_end_cruft			! yep, handle it
 cpout:	retl						! get outta here
 	 mov	%o2, %o0				! return computed csum
 
-	.globl __csum_partial_copy_start, __csum_partial_copy_end
-__csum_partial_copy_start:
-
 /* Work around cpp -rob */
 #define ALLOC #alloc
 #define EXECINSTR #execinstr
-#define EX(x,y,a,b)				\
-98:     x,y;                                    \
-        .section .fixup,ALLOC,EXECINSTR;	\
-        .align  4;                              \
-99:     ba 30f;                                 \
-         a, b, %o3;                             \
-        .section __ex_table,ALLOC;		\
-        .align  4;                              \
-        .word   98b, 99b;                       \
-        .text;                                  \
-        .align  4
-
-#define EX2(x,y)				\
-98:     x,y;                                    \
-        .section __ex_table,ALLOC;		\
-        .align  4;                              \
-        .word   98b, 30f;                       \
-        .text;                                  \
-        .align  4
-
-#define EX3(x,y)				\
+#define EX(x,y)					\
 98:     x,y;                                    \
         .section __ex_table,ALLOC;		\
         .align  4;                              \
-        .word   98b, 96f;                       \
+        .word   98b, cc_fault;                   \
         .text;                                  \
         .align  4
 
-#define EXT(start,end,handler)			\
+#define EXT(start,end)				\
         .section __ex_table,ALLOC;		\
         .align  4;                              \
-        .word   start, 0, end, handler;         \
+        .word   start, 0, end, cc_fault;         \
         .text;                                  \
         .align  4
 
@@ -252,21 +229,21 @@ __csum_partial_copy_start:
 cc_end_cruft:
 	be	1f
 	 andcc	%o3, 4, %g0
-	EX(ldd	[%o0 + 0x00], %g2, and %o3, 0xf)
+	EX(ldd	[%o0 + 0x00], %g2)
 	add	%o1, 8, %o1
 	addcc	%g2, %g7, %g7
 	add	%o0, 8, %o0
 	addxcc	%g3, %g7, %g7
-	EX2(st	%g2, [%o1 - 0x08])
+	EX(st	%g2, [%o1 - 0x08])
 	addx	%g0, %g7, %g7
 	andcc	%o3, 4, %g0
-	EX2(st	%g3, [%o1 - 0x04])
+	EX(st	%g3, [%o1 - 0x04])
 1:	be	1f
 	 andcc	%o3, 3, %o3
-	EX(ld	[%o0 + 0x00], %g2, add %o3, 4)
+	EX(ld	[%o0 + 0x00], %g2)
 	add	%o1, 4, %o1
 	addcc	%g2, %g7, %g7
-	EX2(st	%g2, [%o1 - 0x04])
+	EX(st	%g2, [%o1 - 0x04])
 	addx	%g0, %g7, %g7
 	andcc	%o3, 3, %g0
 	add	%o0, 4, %o0
@@ -276,14 +253,14 @@ cc_end_cruft:
 	 subcc	%o3, 2, %o3
 	b	4f
 	 or	%g0, %g0, %o4
-2:	EX(lduh	[%o0 + 0x00], %o4, add %o3, 2)
+2:	EX(lduh	[%o0 + 0x00], %o4)
 	add	%o0, 2, %o0
-	EX2(sth	%o4, [%o1 + 0x00])
+	EX(sth	%o4, [%o1 + 0x00])
 	be	6f
 	 add	%o1, 2, %o1
 	sll	%o4, 16, %o4
-4:	EX(ldub	[%o0 + 0x00], %o5, add %g0, 1)
-	EX2(stb	%o5, [%o1 + 0x00])
+4:	EX(ldub	[%o0 + 0x00], %o5)
+	EX(stb	%o5, [%o1 + 0x00])
 	sll	%o5, 8, %o5
 	or	%o5, %o4, %o4
 6:	addcc	%o4, %g7, %g7
@@ -306,9 +283,9 @@ cc_dword_align:
 	 andcc	%o0, 0x2, %g0
 	be	1f
 	 andcc	%o0, 0x4, %g0
-	EX(lduh	[%o0 + 0x00], %g4, add %g1, 0)
+	EX(lduh	[%o0 + 0x00], %g4)
 	sub	%g1, 2, %g1
-	EX2(sth	%g4, [%o1 + 0x00])
+	EX(sth	%g4, [%o1 + 0x00])
 	add	%o0, 2, %o0
 	sll	%g4, 16, %g4
 	addcc	%g4, %g7, %g7
@@ -322,9 +299,9 @@ cc_dword_align:
 	or	%g3, %g7, %g7
 1:	be	3f
 	 andcc	%g1, 0xffffff80, %g0
-	EX(ld	[%o0 + 0x00], %g4, add %g1, 0)
+	EX(ld	[%o0 + 0x00], %g4)
 	sub	%g1, 4, %g1
-	EX2(st	%g4, [%o1 + 0x00])
+	EX(st	%g4, [%o1 + 0x00])
 	add	%o0, 4, %o0
 	addcc	%g4, %g7, %g7
 	add	%o1, 4, %o1
@@ -354,7 +331,7 @@ __csum_partial_copy_sparc_generic:
 	CSUMCOPY_BIGCHUNK(%o0,%o1,%g7,0x20,%o4,%o5,%g2,%g3,%g4,%g5,%o2,%o3)
 	CSUMCOPY_BIGCHUNK(%o0,%o1,%g7,0x40,%o4,%o5,%g2,%g3,%g4,%g5,%o2,%o3)
 	CSUMCOPY_BIGCHUNK(%o0,%o1,%g7,0x60,%o4,%o5,%g2,%g3,%g4,%g5,%o2,%o3)
-10:	EXT(5b, 10b, 20f)		! note for exception handling
+10:	EXT(5b, 10b)			! note for exception handling
 	sub	%g1, 128, %g1		! detract from length
 	addx	%g0, %g7, %g7		! add in last carry bit
 	andcc	%g1, 0xffffff80, %g0	! more to csum?
@@ -379,7 +356,7 @@ cctbl:	CSUMCOPY_LASTCHUNK(%o0,%o1,%g7,0x68,%g2,%g3,%g4,%g5)
 	CSUMCOPY_LASTCHUNK(%o0,%o1,%g7,0x28,%g2,%g3,%g4,%g5)
 	CSUMCOPY_LASTCHUNK(%o0,%o1,%g7,0x18,%g2,%g3,%g4,%g5)
 	CSUMCOPY_LASTCHUNK(%o0,%o1,%g7,0x08,%g2,%g3,%g4,%g5)
-12:	EXT(cctbl, 12b, 22f)		! note for exception table handling
+12:	EXT(cctbl, 12b)			! note for exception table handling
 	addx	%g0, %g7, %g7
 	andcc	%o3, 0xf, %g0		! check for low bits set
 ccte:	bne	cc_end_cruft		! something left, handle it out of band
@@ -390,7 +367,7 @@ ccdbl:	CSUMCOPY_BIGCHUNK_ALIGNED(%o0,%o1,%g7,0x00,%o4,%o5,%g2,%g3,%g4,%g5,%o2,%o
 	CSUMCOPY_BIGCHUNK_ALIGNED(%o0,%o1,%g7,0x20,%o4,%o5,%g2,%g3,%g4,%g5,%o2,%o3)
 	CSUMCOPY_BIGCHUNK_ALIGNED(%o0,%o1,%g7,0x40,%o4,%o5,%g2,%g3,%g4,%g5,%o2,%o3)
 	CSUMCOPY_BIGCHUNK_ALIGNED(%o0,%o1,%g7,0x60,%o4,%o5,%g2,%g3,%g4,%g5,%o2,%o3)
-11:	EXT(ccdbl, 11b, 21f)		! note for exception table handling
+11:	EXT(ccdbl, 11b)			! note for exception table handling
 	sub	%g1, 128, %g1		! detract from length
 	addx	%g0, %g7, %g7		! add in last carry bit
 	andcc	%g1, 0xffffff80, %g0	! more to csum?
@@ -407,9 +384,9 @@ ccslow:	cmp	%g1, 0
 	be,a	1f
 	 srl	%g1, 1, %g4		
 	sub	%g1, 1, %g1	
-	EX(ldub	[%o0], %g5, add %g1, 1)
+	EX(ldub	[%o0], %g5)
 	add	%o0, 1, %o0	
-	EX2(stb	%g5, [%o1])
+	EX(stb	%g5, [%o1])
 	srl	%g1, 1, %g4
 	add	%o1, 1, %o1
 1:	cmp	%g4, 0		
@@ -418,34 +395,34 @@ ccslow:	cmp	%g1, 0
 	andcc	%o0, 2, %g0	
 	be,a	1f
 	 srl	%g4, 1, %g4
-	EX(lduh	[%o0], %o4, add %g1, 0)
+	EX(lduh	[%o0], %o4)
 	sub	%g1, 2, %g1	
 	srl	%o4, 8, %g2
 	sub	%g4, 1, %g4	
-	EX2(stb	%g2, [%o1])
+	EX(stb	%g2, [%o1])
 	add	%o4, %g5, %g5
-	EX2(stb	%o4, [%o1 + 1])
+	EX(stb	%o4, [%o1 + 1])
 	add	%o0, 2, %o0	
 	srl	%g4, 1, %g4
 	add	%o1, 2, %o1
 1:	cmp	%g4, 0		
 	be,a	2f
 	 andcc	%g1, 2, %g0
-	EX3(ld	[%o0], %o4)
+	EX(ld	[%o0], %o4)
 5:	srl	%o4, 24, %g2
 	srl	%o4, 16, %g3
-	EX2(stb	%g2, [%o1])
+	EX(stb	%g2, [%o1])
 	srl	%o4, 8, %g2
-	EX2(stb	%g3, [%o1 + 1])
+	EX(stb	%g3, [%o1 + 1])
 	add	%o0, 4, %o0
-	EX2(stb	%g2, [%o1 + 2])
+	EX(stb	%g2, [%o1 + 2])
 	addcc	%o4, %g5, %g5
-	EX2(stb	%o4, [%o1 + 3])
+	EX(stb	%o4, [%o1 + 3])
 	addx	%g5, %g0, %g5	! I am now to lazy to optimize this (question it
 	add	%o1, 4, %o1	! is worthy). Maybe some day - with the sll/srl
 	subcc	%g4, 1, %g4	! tricks
 	bne,a	5b
-	 EX3(ld	[%o0], %o4)
+	 EX(ld	[%o0], %o4)
 	sll	%g5, 16, %g2
 	srl	%g5, 16, %g5
 	srl	%g2, 16, %g2
@@ -453,19 +430,19 @@ ccslow:	cmp	%g1, 0
 	add	%g2, %g5, %g5 
 2:	be,a	3f		
 	 andcc	%g1, 1, %g0
-	EX(lduh	[%o0], %o4, and %g1, 3)
+	EX(lduh	[%o0], %o4)
 	andcc	%g1, 1, %g0
 	srl	%o4, 8, %g2
 	add	%o0, 2, %o0	
-	EX2(stb	%g2, [%o1])
+	EX(stb	%g2, [%o1])
 	add	%g5, %o4, %g5
-	EX2(stb	%o4, [%o1 + 1])
+	EX(stb	%o4, [%o1 + 1])
 	add	%o1, 2, %o1
 3:	be,a	1f		
 	 sll	%g5, 16, %o4
-	EX(ldub	[%o0], %g2, add %g0, 1)
+	EX(ldub	[%o0], %g2)
 	sll	%g2, 8, %o4	
-	EX2(stb	%g2, [%o1])
+	EX(stb	%g2, [%o1])
 	add	%g5, %o4, %g5
 	sll	%g5, 16, %o4
 1:	addcc	%o4, %g5, %g5
@@ -481,113 +458,10 @@ ccslow:	cmp	%g1, 0
 4:	addcc	%g7, %g5, %g7
 	retl	
 	 addx	%g0, %g7, %o0
-__csum_partial_copy_end:
 
 /* We do these strange calculations for the csum_*_from_user case only, ie.
  * we only bother with faults on loads... */
 
-/* o2 = ((g2%20)&3)*8
- * o3 = g1 - (g2/20)*32 - o2 */
-20:
-	cmp	%g2, 20
-	blu,a	1f
-	 and	%g2, 3, %o2
-	sub	%g1, 32, %g1
-	b	20b
-	 sub	%g2, 20, %g2
-1:
-	sll	%o2, 3, %o2
-	b	31f
-	 sub	%g1, %o2, %o3
-
-/* o2 = (!(g2 & 15) ? 0 : (((g2 & 15) + 1) & ~1)*8)
- * o3 = g1 - (g2/16)*32 - o2 */
-21:
-	andcc	%g2, 15, %o3
-	srl	%g2, 4, %g2
-	be,a	1f
-	 clr	%o2
-	add	%o3, 1, %o3
-	and	%o3, 14, %o3
-	sll	%o3, 3, %o2
-1:
-	sll	%g2, 5, %g2
-	sub	%g1, %g2, %o3
-	b	31f
-	 sub	%o3, %o2, %o3
-
-/* o0 += (g2/10)*16 - 0x70
- * 01 += (g2/10)*16 - 0x70
- * o2 = (g2 % 10) ? 8 : 0
- * o3 += 0x70 - (g2/10)*16 - o2 */
-22:
-	cmp	%g2, 10
-	blu,a	1f
-	 sub	%o0, 0x70, %o0
-	add	%o0, 16, %o0
-	add	%o1, 16, %o1
-	sub	%o3, 16, %o3
-	b	22b
-	 sub	%g2, 10, %g2
-1:
-	sub	%o1, 0x70, %o1
-	add	%o3, 0x70, %o3
-	clr	%o2
-	tst	%g2
-	bne,a	1f
-	 mov	8, %o2
-1:
-	b	31f
-	 sub	%o3, %o2, %o3
-96:
-	and	%g1, 3, %g1
-	sll	%g4, 2, %g4
-	add	%g1, %g4, %o3
-30:
-/* %o1 is dst
- * %o3 is # bytes to zero out
- * %o4 is faulting address
- * %o5 is %pc where fault occurred */
-	clr	%o2
-31:
-/* %o0 is src
- * %o1 is dst
- * %o2 is # of bytes to copy from src to dst
- * %o3 is # bytes to zero out
- * %o4 is faulting address
- * %o5 is %pc where fault occurred */
-	save	%sp, -104, %sp
-        mov     %i5, %o0
-        mov     %i7, %o1
-        mov	%i4, %o2
-        call    lookup_fault
-	 mov	%g7, %i4
-	cmp	%o0, 2
-	bne	1f	
-	 add	%g0, -EFAULT, %i5
-	tst	%i2
-	be	2f
-	 mov	%i0, %o1
-	mov	%i1, %o0
-5:
-	call	memcpy
-	 mov	%i2, %o2
-	tst	%o0
-	bne,a	2f
-	 add	%i3, %i2, %i3
-	add	%i1, %i2, %i1
-2:
-	mov	%i1, %o0
-6:
-	call	__bzero
-	 mov	%i3, %o1
-1:
-	ld	[%sp + 168], %o2		! struct_ptr of parent
-	st	%i5, [%o2]
+cc_fault:
 	ret
-	 restore
-
-        .section __ex_table,#alloc
-        .align 4
-        .word 5b,2
-	.word 6b,2
+	 clr	%o0
diff --git a/arch/sparc/mm/fault_32.c b/arch/sparc/mm/fault_32.c
index cfef656eda0f..1185b6169144 100644
--- a/arch/sparc/mm/fault_32.c
+++ b/arch/sparc/mm/fault_32.c
@@ -297,8 +297,6 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write,
 		if (fixup > 10) {
 			extern const unsigned int __memset_start[];
 			extern const unsigned int __memset_end[];
-			extern const unsigned int __csum_partial_copy_start[];
-			extern const unsigned int __csum_partial_copy_end[];
 
 #ifdef DEBUG_EXCEPTIONS
 			printk("Exception: PC<%08lx> faddr<%08lx>\n",
@@ -307,9 +305,7 @@ asmlinkage void do_sparc_fault(struct pt_regs *regs, int text_fault, int write,
 				regs->pc, fixup, g2);
 #endif
 			if ((regs->pc >= (unsigned long)__memset_start &&
-			     regs->pc < (unsigned long)__memset_end) ||
-			    (regs->pc >= (unsigned long)__csum_partial_copy_start &&
-			     regs->pc < (unsigned long)__csum_partial_copy_end)) {
+			     regs->pc < (unsigned long)__memset_end)) {
 				regs->u_regs[UREG_I4] = address;
 				regs->u_regs[UREG_I5] = regs->pc;
 			}
-- 
2.11.0

* [PATCH v2 14/20] mips: csum_and_copy_{to,from}_user() are never called under KERNEL_DS
  2020-07-24  1:25   ` [PATCH v2 01/20] xtensa: fix access check in csum_and_copy_from_user Al Viro
                       ` (11 preceding siblings ...)
  2020-07-24  1:25     ` [PATCH v2 13/20] sparc32: propagate the calling conventions change down to __csum_partial_copy_sparc_generic() Al Viro
@ 2020-07-24  1:25     ` Al Viro
  2020-07-24  1:25     ` [PATCH v2 15/20] mips: __csum_partial_copy_kernel() has no users left Al Viro
                       ` (5 subsequent siblings)
  18 siblings, 0 replies; 102+ messages in thread
From: Al Viro @ 2020-07-24  1:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

They are only called for iovec-backed iov_iter, and under KERNEL_DS an
attempt to create such a beast will yield a kvec-backed one.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
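For readers who want to poke at the wrapper shape outside the kernel, here
is a minimal userspace sketch of what csum_and_copy_from_user() looks like
after this patch: the error pointer stays private to the wrapper and the
caller only ever sees "0 means fault".  Everything named legacy_* or
model_* is hypothetical, a NULL source stands in for a faulting user
pointer, and the byte-wise big-endian summation is an assumption of the
model, not the MIPS assembler.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an old-style helper: failure goes to *err_ptr. */
static uint32_t legacy_csum_copy(const void *src, void *dst, size_t len,
				 uint32_t sum, int *err_ptr)
{
	const unsigned char *s = src;
	size_t i;

	assert(dst != NULL);
	if (!src) {			/* models a faulting user pointer */
		*err_ptr = -EFAULT;
		return sum;		/* value is meaningless on error */
	}
	memcpy(dst, src, len);
	for (i = 0; i < len; i++) {
		/* even offsets are the high byte of a 16-bit word (model only) */
		uint32_t w = (i & 1) ? s[i] : (uint32_t)s[i] << 8;

		sum += w;
		if (sum < w)		/* end-around carry */
			sum++;
	}
	return sum;
}

/* Wrapper in the style of the patched code: one return value, 0 on fault. */
static uint32_t model_csum_and_copy_from_user(const void *src, void *dst,
					      size_t len)
{
	uint32_t sum = ~0U;
	int err = 0;

	sum = legacy_csum_copy(src, dst, len, sum, &err);
	return err ? 0 : sum;
}
```

A caller of the model only needs to compare the result against 0; it never
touches an error pointer.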
 arch/mips/include/asm/checksum.h | 32 +++++++-------------------------
 1 file changed, 7 insertions(+), 25 deletions(-)

diff --git a/arch/mips/include/asm/checksum.h b/arch/mips/include/asm/checksum.h
index b882cacea3ee..5cf4ce11c821 100644
--- a/arch/mips/include/asm/checksum.h
+++ b/arch/mips/include/asm/checksum.h
@@ -41,22 +41,6 @@ __wsum __csum_partial_copy_from_user(const void *src, void *dst,
 				     int len, __wsum sum, int *err_ptr);
 __wsum __csum_partial_copy_to_user(const void *src, void *dst,
 				   int len, __wsum sum, int *err_ptr);
-/*
- * this is a new version of the above that records errors it finds in *errp,
- * but continues and zeros the rest of the buffer.
- */
-static inline
-__wsum csum_partial_copy_from_user(const void __user *src, void *dst, int len,
-				   __wsum sum, int *err_ptr)
-{
-	might_fault();
-	if (uaccess_kernel())
-		return __csum_partial_copy_kernel((__force void *)src, dst,
-						  len, sum, err_ptr);
-	else
-		return __csum_partial_copy_from_user((__force void *)src, dst,
-						     len, sum, err_ptr);
-}
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 static inline
@@ -65,9 +49,12 @@ __wsum csum_and_copy_from_user(const void __user *src, void *dst, int len)
 	__wsum sum = ~0U;
 	int err = 0;
 
+	might_fault();
+
 	if (!access_ok(src, len))
 		return 0;
-	sum = csum_partial_copy_from_user(src, dst, len, sum, &err);
+	sum = __csum_partial_copy_from_user((__force void *)src, dst,
+						     len, sum, &err);
 	return err ? 0 : sum;
 }
 
@@ -84,14 +71,9 @@ __wsum csum_and_copy_to_user(const void *src, void __user *dst, int len)
 	might_fault();
 	if (!access_ok(dst, len))
 		return 0;
-	if (uaccess_kernel())
-		sum = __csum_partial_copy_kernel(src,
-						  (__force void *)dst,
-						  len, sum, &err);
-	else
-		sum = __csum_partial_copy_to_user(src,
-						   (__force void *)dst,
-						   len, sum, &err);
+	sum = __csum_partial_copy_to_user(src,
+					   (__force void *)dst,
+					   len, sum, &err);
 	return err ? 0 : sum;
 }
 
-- 
2.11.0

* [PATCH v2 15/20] mips: __csum_partial_copy_kernel() has no users left
  2020-07-24  1:25   ` [PATCH v2 01/20] xtensa: fix access check in csum_and_copy_from_user Al Viro
                       ` (12 preceding siblings ...)
  2020-07-24  1:25     ` [PATCH v2 14/20] mips: csum_and_copy_{to,from}_user() are never called under KERNEL_DS Al Viro
@ 2020-07-24  1:25     ` Al Viro
  2020-07-24  1:25       ` Al Viro
  2020-07-24  1:25     ` [PATCH v2 16/20] mips: propagate the calling convention change down into __csum_partial_copy_..._user() Al Viro
                       ` (4 subsequent siblings)
  18 siblings, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-24  1:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/mips/include/asm/checksum.h | 3 ---
 arch/mips/lib/csum_partial.S     | 3 ---
 2 files changed, 6 deletions(-)

diff --git a/arch/mips/include/asm/checksum.h b/arch/mips/include/asm/checksum.h
index 5cf4ce11c821..a8ff9c306363 100644
--- a/arch/mips/include/asm/checksum.h
+++ b/arch/mips/include/asm/checksum.h
@@ -34,9 +34,6 @@
  */
 __wsum csum_partial(const void *buff, int len, __wsum sum);
 
-__wsum __csum_partial_copy_kernel(const void *src, void *dst,
-				  int len, __wsum sum, int *err_ptr);
-
 __wsum __csum_partial_copy_from_user(const void *src, void *dst,
 				     int len, __wsum sum, int *err_ptr);
 __wsum __csum_partial_copy_to_user(const void *src, void *dst,
diff --git a/arch/mips/lib/csum_partial.S b/arch/mips/lib/csum_partial.S
index 8d70855b0914..983e909c2052 100644
--- a/arch/mips/lib/csum_partial.S
+++ b/arch/mips/lib/csum_partial.S
@@ -827,8 +827,6 @@ EXPORT_SYMBOL(csum_partial)
 	.set	pop
 	.endm
 
-LEAF(__csum_partial_copy_kernel)
-EXPORT_SYMBOL(__csum_partial_copy_kernel)
 #ifndef CONFIG_EVA
 FEXPORT(__csum_partial_copy_to_user)
 EXPORT_SYMBOL(__csum_partial_copy_to_user)
@@ -836,7 +834,6 @@ FEXPORT(__csum_partial_copy_from_user)
 EXPORT_SYMBOL(__csum_partial_copy_from_user)
 #endif
 __BUILD_CSUM_PARTIAL_COPY_USER LEGACY_MODE USEROP USEROP 1
-END(__csum_partial_copy_kernel)
 
 #ifdef CONFIG_EVA
 LEAF(__csum_partial_copy_to_user)
-- 
2.11.0

* [PATCH v2 16/20] mips: propagate the calling convention change down into __csum_partial_copy_..._user()
  2020-07-24  1:25   ` [PATCH v2 01/20] xtensa: fix access check in csum_and_copy_from_user Al Viro
                       ` (13 preceding siblings ...)
  2020-07-24  1:25     ` [PATCH v2 15/20] mips: __csum_partial_copy_kernel() has no users left Al Viro
@ 2020-07-24  1:25     ` Al Viro
  2020-07-24  1:25       ` Al Viro
  2020-07-24  1:25     ` [PATCH v2 17/20] xtensa: propagate the calling conventions change down into csum_partial_copy_generic() Al Viro
                       ` (3 subsequent siblings)
  18 siblings, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-24  1:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

and turn the exception handlers into simply returning 0, which
simplifies the hell out of things in csum_partial.S

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
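A note on why "return 0 on any fault" is unambiguous: the sum is seeded
with ~0U, and a 32-bit add with end-around carry can only produce 0 when
both operands are 0, so a successful copy never returns 0.  The sketch
below is a userspace model of that property, not the assembler; the names
csum_add32, fold16 and model_csum_partial_copy are hypothetical, fault_at
is an invented parameter that simulates a faulting offset, and the
big-endian byte weighting is an assumption of the model.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit add with end-around carry (ones' complement addition). */
static uint32_t csum_add32(uint32_t sum, uint32_t w)
{
	sum += w;
	return sum + (sum < w);
}

/* Fold a 32-bit partial sum down to 16 bits. */
static uint16_t fold16(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * Model of a copy-and-checksum routine with a single exception exit.
 * fault_at is the offset at which the simulated access faults; pass
 * fault_at >= len for a copy that succeeds.
 */
static uint32_t model_csum_partial_copy(const unsigned char *src,
					unsigned char *dst,
					size_t len, size_t fault_at)
{
	uint32_t sum = ~0U;		/* nonzero seed */
	size_t i;

	for (i = 0; i < len; i++) {
		if (i == fault_at)
			return 0;	/* the one and only fault exit */
		dst[i] = src[i];
		/* even offsets are the high byte of a 16-bit word */
		sum = csum_add32(sum, (i & 1) ? src[i]
					      : (uint32_t)src[i] << 8);
	}
	return sum;
}
```

In this model even an all-zero buffer returns 0xffffffff, which folds to
0xffff and is congruent to 0 mod 65535, so the seed does not change the
checksum the caller ends up with.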
 arch/mips/include/asm/checksum.h |  26 +---
 arch/mips/lib/csum_partial.S     | 258 +++++++++++++--------------------------
 2 files changed, 89 insertions(+), 195 deletions(-)

diff --git a/arch/mips/include/asm/checksum.h b/arch/mips/include/asm/checksum.h
index a8ff9c306363..c6c682519d94 100644
--- a/arch/mips/include/asm/checksum.h
+++ b/arch/mips/include/asm/checksum.h
@@ -34,25 +34,17 @@
  */
 __wsum csum_partial(const void *buff, int len, __wsum sum);
 
-__wsum __csum_partial_copy_from_user(const void *src, void *dst,
-				     int len, __wsum sum, int *err_ptr);
-__wsum __csum_partial_copy_to_user(const void *src, void *dst,
-				   int len, __wsum sum, int *err_ptr);
+__wsum __csum_partial_copy_from_user(const void __user *src, void *dst, int len);
+__wsum __csum_partial_copy_to_user(const void *src, void __user *dst, int len);
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 static inline
 __wsum csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
-	__wsum sum = ~0U;
-	int err = 0;
-
 	might_fault();
-
 	if (!access_ok(src, len))
 		return 0;
-	sum = __csum_partial_copy_from_user((__force void *)src, dst,
-						     len, sum, &err);
-	return err ? 0 : sum;
+	return __csum_partial_copy_from_user(src, dst, len);
 }
 
 /*
@@ -62,16 +54,10 @@ __wsum csum_and_copy_from_user(const void __user *src, void *dst, int len)
 static inline
 __wsum csum_and_copy_to_user(const void *src, void __user *dst, int len)
 {
-	int err = 0;
-	__wsum sum = ~0U;
-
 	might_fault();
 	if (!access_ok(dst, len))
 		return 0;
-	sum = __csum_partial_copy_to_user(src,
-					   (__force void *)dst,
-					   len, sum, &err);
-	return err ? 0 : sum;
+	return __csum_partial_copy_to_user(src, dst, len);
 }
 
 /*
@@ -79,10 +65,10 @@ __wsum csum_and_copy_to_user(const void *src, void __user *dst, int len)
  * we have just one address space, so this is identical to the above)
  */
 #define _HAVE_ARCH_CSUM_AND_COPY
-__wsum __csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum);
+__wsum __csum_partial_copy_nocheck(const void *src, void *dst, int len);
 static inline __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	return __csum_partial_copy_nocheck(src, dst, len, 0);
+	return __csum_partial_copy_nocheck(src, dst, len);
 }
 
 /*
diff --git a/arch/mips/lib/csum_partial.S b/arch/mips/lib/csum_partial.S
index 983e909c2052..a46db0807195 100644
--- a/arch/mips/lib/csum_partial.S
+++ b/arch/mips/lib/csum_partial.S
@@ -308,8 +308,8 @@ EXPORT_SYMBOL(csum_partial)
 /*
  * checksum and copy routines based on memcpy.S
  *
- *	csum_partial_copy_nocheck(src, dst, len, sum)
- *	__csum_partial_copy_kernel(src, dst, len, sum, errp)
+ *	csum_partial_copy_nocheck(src, dst, len)
+ *	__csum_partial_copy_kernel(src, dst, len)
  *
  * See "Spec" in memcpy.S for details.	Unlike __copy_user, all
  * function in this file use the standard calling convention.
@@ -318,26 +318,11 @@ EXPORT_SYMBOL(csum_partial)
 #define src a0
 #define dst a1
 #define len a2
-#define psum a3
 #define sum v0
 #define odd t8
-#define errptr t9
 
 /*
- * The exception handler for loads requires that:
- *  1- AT contain the address of the byte just past the end of the source
- *     of the copy,
- *  2- src_entry <= src < AT, and
- *  3- (dst - src) == (dst_entry - src_entry),
- * The _entry suffix denotes values when __copy_user was called.
- *
- * (1) is set up up by __csum_partial_copy_from_user and maintained by
- *	not writing AT in __csum_partial_copy
- * (2) is met by incrementing src by the number of bytes copied
- * (3) is met by not doing loads between a pair of increments of dst and src
- *
- * The exception handlers for stores stores -EFAULT to errptr and return.
- * These handlers do not need to overwrite any data.
+ * All exception handlers simply return 0.
  */
 
 /* Instruction type */
@@ -358,11 +343,11 @@ EXPORT_SYMBOL(csum_partial)
  * addr    : Address
  * handler : Exception handler
  */
-#define EXC(insn, type, reg, addr, handler)	\
+#define EXC(insn, type, reg, addr)		\
 	.if \mode == LEGACY_MODE;		\
 9:		insn reg, addr;			\
 		.section __ex_table,"a";	\
-		PTR	9b, handler;		\
+		PTR	9b, .L_exc;		\
 		.previous;			\
 	/* This is enabled in EVA mode */	\
 	.else;					\
@@ -371,7 +356,7 @@ EXPORT_SYMBOL(csum_partial)
 		    ((\to == USEROP) && (type == ST_INSN));	\
 9:			__BUILD_EVA_INSN(insn##e, reg, addr);	\
 			.section __ex_table,"a";		\
-			PTR	9b, handler;			\
+			PTR	9b, .L_exc;			\
 			.previous;				\
 		.else;						\
 			/* EVA without exception */		\
@@ -384,14 +369,14 @@ EXPORT_SYMBOL(csum_partial)
 #ifdef USE_DOUBLE
 
 #define LOADK	ld /* No exception */
-#define LOAD(reg, addr, handler)	EXC(ld, LD_INSN, reg, addr, handler)
-#define LOADBU(reg, addr, handler)	EXC(lbu, LD_INSN, reg, addr, handler)
-#define LOADL(reg, addr, handler)	EXC(ldl, LD_INSN, reg, addr, handler)
-#define LOADR(reg, addr, handler)	EXC(ldr, LD_INSN, reg, addr, handler)
-#define STOREB(reg, addr, handler)	EXC(sb, ST_INSN, reg, addr, handler)
-#define STOREL(reg, addr, handler)	EXC(sdl, ST_INSN, reg, addr, handler)
-#define STORER(reg, addr, handler)	EXC(sdr, ST_INSN, reg, addr, handler)
-#define STORE(reg, addr, handler)	EXC(sd, ST_INSN, reg, addr, handler)
+#define LOAD(reg, addr)		EXC(ld, LD_INSN, reg, addr)
+#define LOADBU(reg, addr)	EXC(lbu, LD_INSN, reg, addr)
+#define LOADL(reg, addr)	EXC(ldl, LD_INSN, reg, addr)
+#define LOADR(reg, addr)	EXC(ldr, LD_INSN, reg, addr)
+#define STOREB(reg, addr)	EXC(sb, ST_INSN, reg, addr)
+#define STOREL(reg, addr)	EXC(sdl, ST_INSN, reg, addr)
+#define STORER(reg, addr)	EXC(sdr, ST_INSN, reg, addr)
+#define STORE(reg, addr)	EXC(sd, ST_INSN, reg, addr)
 #define ADD    daddu
 #define SUB    dsubu
 #define SRL    dsrl
@@ -404,14 +389,14 @@ EXPORT_SYMBOL(csum_partial)
 #else
 
 #define LOADK	lw /* No exception */
-#define LOAD(reg, addr, handler)	EXC(lw, LD_INSN, reg, addr, handler)
-#define LOADBU(reg, addr, handler)	EXC(lbu, LD_INSN, reg, addr, handler)
-#define LOADL(reg, addr, handler)	EXC(lwl, LD_INSN, reg, addr, handler)
-#define LOADR(reg, addr, handler)	EXC(lwr, LD_INSN, reg, addr, handler)
-#define STOREB(reg, addr, handler)	EXC(sb, ST_INSN, reg, addr, handler)
-#define STOREL(reg, addr, handler)	EXC(swl, ST_INSN, reg, addr, handler)
-#define STORER(reg, addr, handler)	EXC(swr, ST_INSN, reg, addr, handler)
-#define STORE(reg, addr, handler)	EXC(sw, ST_INSN, reg, addr, handler)
+#define LOAD(reg, addr)		EXC(lw, LD_INSN, reg, addr)
+#define LOADBU(reg, addr)	EXC(lbu, LD_INSN, reg, addr)
+#define LOADL(reg, addr)	EXC(lwl, LD_INSN, reg, addr)
+#define LOADR(reg, addr)	EXC(lwr, LD_INSN, reg, addr)
+#define STOREB(reg, addr)	EXC(sb, ST_INSN, reg, addr)
+#define STOREL(reg, addr)	EXC(swl, ST_INSN, reg, addr)
+#define STORER(reg, addr)	EXC(swr, ST_INSN, reg, addr)
+#define STORE(reg, addr)	EXC(sw, ST_INSN, reg, addr)
 #define ADD    addu
 #define SUB    subu
 #define SRL    srl
@@ -450,22 +435,9 @@ EXPORT_SYMBOL(csum_partial)
 	.set	at=v1
 #endif
 
-	.macro __BUILD_CSUM_PARTIAL_COPY_USER mode, from, to, __nocheck
+	.macro __BUILD_CSUM_PARTIAL_COPY_USER mode, from, to
 
-	PTR_ADDU	AT, src, len	/* See (1) above. */
-	/* initialize __nocheck if this the first time we execute this
-	 * macro
-	 */
-#ifdef CONFIG_64BIT
-	move	errptr, a4
-#else
-	lw	errptr, 16(sp)
-#endif
-	.if \__nocheck == 1
-	FEXPORT(__csum_partial_copy_nocheck)
-	EXPORT_SYMBOL(__csum_partial_copy_nocheck)
-	.endif
-	move	sum, zero
+	li	sum, -1
 	move	odd, zero
 	/*
 	 * Note: dst & src may be unaligned, len may be 0
@@ -497,31 +469,31 @@ EXPORT_SYMBOL(csum_partial)
 	SUB	len, 8*NBYTES		# subtract here for bgez loop
 	.align	4
 1:
-	LOAD(t0, UNIT(0)(src), .Ll_exc\@)
-	LOAD(t1, UNIT(1)(src), .Ll_exc_copy\@)
-	LOAD(t2, UNIT(2)(src), .Ll_exc_copy\@)
-	LOAD(t3, UNIT(3)(src), .Ll_exc_copy\@)
-	LOAD(t4, UNIT(4)(src), .Ll_exc_copy\@)
-	LOAD(t5, UNIT(5)(src), .Ll_exc_copy\@)
-	LOAD(t6, UNIT(6)(src), .Ll_exc_copy\@)
-	LOAD(t7, UNIT(7)(src), .Ll_exc_copy\@)
+	LOAD(t0, UNIT(0)(src))
+	LOAD(t1, UNIT(1)(src))
+	LOAD(t2, UNIT(2)(src))
+	LOAD(t3, UNIT(3)(src))
+	LOAD(t4, UNIT(4)(src))
+	LOAD(t5, UNIT(5)(src))
+	LOAD(t6, UNIT(6)(src))
+	LOAD(t7, UNIT(7)(src))
 	SUB	len, len, 8*NBYTES
 	ADD	src, src, 8*NBYTES
-	STORE(t0, UNIT(0)(dst),	.Ls_exc\@)
+	STORE(t0, UNIT(0)(dst))
 	ADDC(t0, t1)
-	STORE(t1, UNIT(1)(dst),	.Ls_exc\@)
+	STORE(t1, UNIT(1)(dst))
 	ADDC(sum, t0)
-	STORE(t2, UNIT(2)(dst),	.Ls_exc\@)
+	STORE(t2, UNIT(2)(dst))
 	ADDC(t2, t3)
-	STORE(t3, UNIT(3)(dst),	.Ls_exc\@)
+	STORE(t3, UNIT(3)(dst))
 	ADDC(sum, t2)
-	STORE(t4, UNIT(4)(dst),	.Ls_exc\@)
+	STORE(t4, UNIT(4)(dst))
 	ADDC(t4, t5)
-	STORE(t5, UNIT(5)(dst),	.Ls_exc\@)
+	STORE(t5, UNIT(5)(dst))
 	ADDC(sum, t4)
-	STORE(t6, UNIT(6)(dst),	.Ls_exc\@)
+	STORE(t6, UNIT(6)(dst))
 	ADDC(t6, t7)
-	STORE(t7, UNIT(7)(dst),	.Ls_exc\@)
+	STORE(t7, UNIT(7)(dst))
 	ADDC(sum, t6)
 	.set	reorder				/* DADDI_WAR */
 	ADD	dst, dst, 8*NBYTES
@@ -541,19 +513,19 @@ EXPORT_SYMBOL(csum_partial)
 	/*
 	 * len >= 4*NBYTES
 	 */
-	LOAD(t0, UNIT(0)(src), .Ll_exc\@)
-	LOAD(t1, UNIT(1)(src), .Ll_exc_copy\@)
-	LOAD(t2, UNIT(2)(src), .Ll_exc_copy\@)
-	LOAD(t3, UNIT(3)(src), .Ll_exc_copy\@)
+	LOAD(t0, UNIT(0)(src))
+	LOAD(t1, UNIT(1)(src))
+	LOAD(t2, UNIT(2)(src))
+	LOAD(t3, UNIT(3)(src))
 	SUB	len, len, 4*NBYTES
 	ADD	src, src, 4*NBYTES
-	STORE(t0, UNIT(0)(dst),	.Ls_exc\@)
+	STORE(t0, UNIT(0)(dst))
 	ADDC(t0, t1)
-	STORE(t1, UNIT(1)(dst),	.Ls_exc\@)
+	STORE(t1, UNIT(1)(dst))
 	ADDC(sum, t0)
-	STORE(t2, UNIT(2)(dst),	.Ls_exc\@)
+	STORE(t2, UNIT(2)(dst))
 	ADDC(t2, t3)
-	STORE(t3, UNIT(3)(dst),	.Ls_exc\@)
+	STORE(t3, UNIT(3)(dst))
 	ADDC(sum, t2)
 	.set	reorder				/* DADDI_WAR */
 	ADD	dst, dst, 4*NBYTES
@@ -566,10 +538,10 @@ EXPORT_SYMBOL(csum_partial)
 	beq	rem, len, .Lcopy_bytes\@
 	 nop
 1:
-	LOAD(t0, 0(src), .Ll_exc\@)
+	LOAD(t0, 0(src))
 	ADD	src, src, NBYTES
 	SUB	len, len, NBYTES
-	STORE(t0, 0(dst), .Ls_exc\@)
+	STORE(t0, 0(dst))
 	ADDC(sum, t0)
 	.set	reorder				/* DADDI_WAR */
 	ADD	dst, dst, NBYTES
@@ -592,10 +564,10 @@ EXPORT_SYMBOL(csum_partial)
 	 ADD	t1, dst, len	# t1 is just past last byte of dst
 	li	bits, 8*NBYTES
 	SLL	rem, len, 3	# rem = number of bits to keep
-	LOAD(t0, 0(src), .Ll_exc\@)
+	LOAD(t0, 0(src))
 	SUB	bits, bits, rem # bits = number of bits to discard
 	SHIFT_DISCARD t0, t0, bits
-	STREST(t0, -1(t1), .Ls_exc\@)
+	STREST(t0, -1(t1))
 	SHIFT_DISCARD_REVERT t0, t0, bits
 	.set reorder
 	ADDC(sum, t0)
@@ -612,12 +584,12 @@ EXPORT_SYMBOL(csum_partial)
 	 * Set match = (src and dst have same alignment)
 	 */
 #define match rem
-	LDFIRST(t3, FIRST(0)(src), .Ll_exc\@)
+	LDFIRST(t3, FIRST(0)(src))
 	ADD	t2, zero, NBYTES
-	LDREST(t3, REST(0)(src), .Ll_exc_copy\@)
+	LDREST(t3, REST(0)(src))
 	SUB	t2, t2, t1	# t2 = number of bytes copied
 	xor	match, t0, t1
-	STFIRST(t3, FIRST(0)(dst), .Ls_exc\@)
+	STFIRST(t3, FIRST(0)(dst))
 	SLL	t4, t1, 3		# t4 = number of bits to discard
 	SHIFT_DISCARD t3, t3, t4
 	/* no SHIFT_DISCARD_REVERT to handle odd buffer properly */
@@ -639,26 +611,26 @@ EXPORT_SYMBOL(csum_partial)
  * It's OK to load FIRST(N+1) before REST(N) because the two addresses
  * are to the same unit (unless src is aligned, but it's not).
  */
-	LDFIRST(t0, FIRST(0)(src), .Ll_exc\@)
-	LDFIRST(t1, FIRST(1)(src), .Ll_exc_copy\@)
+	LDFIRST(t0, FIRST(0)(src))
+	LDFIRST(t1, FIRST(1)(src))
 	SUB	len, len, 4*NBYTES
-	LDREST(t0, REST(0)(src), .Ll_exc_copy\@)
-	LDREST(t1, REST(1)(src), .Ll_exc_copy\@)
-	LDFIRST(t2, FIRST(2)(src), .Ll_exc_copy\@)
-	LDFIRST(t3, FIRST(3)(src), .Ll_exc_copy\@)
-	LDREST(t2, REST(2)(src), .Ll_exc_copy\@)
-	LDREST(t3, REST(3)(src), .Ll_exc_copy\@)
+	LDREST(t0, REST(0)(src))
+	LDREST(t1, REST(1)(src))
+	LDFIRST(t2, FIRST(2)(src))
+	LDFIRST(t3, FIRST(3)(src))
+	LDREST(t2, REST(2)(src))
+	LDREST(t3, REST(3)(src))
 	ADD	src, src, 4*NBYTES
 #ifdef CONFIG_CPU_SB1
 	nop				# improves slotting
 #endif
-	STORE(t0, UNIT(0)(dst),	.Ls_exc\@)
+	STORE(t0, UNIT(0)(dst))
 	ADDC(t0, t1)
-	STORE(t1, UNIT(1)(dst),	.Ls_exc\@)
+	STORE(t1, UNIT(1)(dst))
 	ADDC(sum, t0)
-	STORE(t2, UNIT(2)(dst),	.Ls_exc\@)
+	STORE(t2, UNIT(2)(dst))
 	ADDC(t2, t3)
-	STORE(t3, UNIT(3)(dst),	.Ls_exc\@)
+	STORE(t3, UNIT(3)(dst))
 	ADDC(sum, t2)
 	.set	reorder				/* DADDI_WAR */
 	ADD	dst, dst, 4*NBYTES
@@ -671,11 +643,11 @@ EXPORT_SYMBOL(csum_partial)
 	beq	rem, len, .Lcopy_bytes\@
 	 nop
 1:
-	LDFIRST(t0, FIRST(0)(src), .Ll_exc\@)
-	LDREST(t0, REST(0)(src), .Ll_exc_copy\@)
+	LDFIRST(t0, FIRST(0)(src))
+	LDREST(t0, REST(0)(src))
 	ADD	src, src, NBYTES
 	SUB	len, len, NBYTES
-	STORE(t0, 0(dst), .Ls_exc\@)
+	STORE(t0, 0(dst))
 	ADDC(sum, t0)
 	.set	reorder				/* DADDI_WAR */
 	ADD	dst, dst, NBYTES
@@ -696,11 +668,10 @@ EXPORT_SYMBOL(csum_partial)
 #endif
 	move	t2, zero	# partial word
 	li	t3, SHIFT_START # shift
-/* use .Ll_exc_copy here to return correct sum on fault */
 #define COPY_BYTE(N)			\
-	LOADBU(t0, N(src), .Ll_exc_copy\@);	\
+	LOADBU(t0, N(src));		\
 	SUB	len, len, 1;		\
-	STOREB(t0, N(dst), .Ls_exc\@);	\
+	STOREB(t0, N(dst));		\
 	SLLV	t0, t0, t3;		\
 	addu	t3, SHIFT_INC;		\
 	beqz	len, .Lcopy_bytes_done\@; \
@@ -714,9 +685,9 @@ EXPORT_SYMBOL(csum_partial)
 	COPY_BYTE(4)
 	COPY_BYTE(5)
 #endif
-	LOADBU(t0, NBYTES-2(src), .Ll_exc_copy\@)
+	LOADBU(t0, NBYTES-2(src))
 	SUB	len, len, 1
-	STOREB(t0, NBYTES-2(dst), .Ls_exc\@)
+	STOREB(t0, NBYTES-2(dst))
 	SLLV	t0, t0, t3
 	or	t2, t0
 .Lcopy_bytes_done\@:
@@ -753,94 +724,31 @@ EXPORT_SYMBOL(csum_partial)
 #endif
 	.set	pop
 	.set reorder
-	ADDC32(sum, psum)
 	jr	ra
 	.set noreorder
+	.endm
 
-.Ll_exc_copy\@:
-	/*
-	 * Copy bytes from src until faulting load address (or until a
-	 * lb faults)
-	 *
-	 * When reached by a faulting LDFIRST/LDREST, THREAD_BUADDR($28)
-	 * may be more than a byte beyond the last address.
-	 * Hence, the lb below may get an exception.
-	 *
-	 * Assumes src < THREAD_BUADDR($28)
-	 */
-	LOADK	t0, TI_TASK($28)
-	 li	t2, SHIFT_START
-	LOADK	t0, THREAD_BUADDR(t0)
-1:
-	LOADBU(t1, 0(src), .Ll_exc\@)
-	ADD	src, src, 1
-	sb	t1, 0(dst)	# can't fault -- we're copy_from_user
-	SLLV	t1, t1, t2
-	addu	t2, SHIFT_INC
-	ADDC(sum, t1)
-	.set	reorder				/* DADDI_WAR */
-	ADD	dst, dst, 1
-	bne	src, t0, 1b
-	.set	noreorder
-.Ll_exc\@:
-	LOADK	t0, TI_TASK($28)
-	 nop
-	LOADK	t0, THREAD_BUADDR(t0)	# t0 is just past last good address
-	 nop
-	SUB	len, AT, t0		# len number of uncopied bytes
-	/*
-	 * Here's where we rely on src and dst being incremented in tandem,
-	 *   See (3) above.
-	 * dst += (fault addr - src) to put dst at first byte to clear
-	 */
-	ADD	dst, t0			# compute start address in a1
-	SUB	dst, src
-	/*
-	 * Clear len bytes starting at dst.  Can't call __bzero because it
-	 * might modify len.  An inefficient loop for these rare times...
-	 */
-	.set	reorder				/* DADDI_WAR */
-	SUB	src, len, 1
-	beqz	len, .Ldone\@
-	.set	noreorder
-1:	sb	zero, 0(dst)
-	ADD	dst, dst, 1
-	.set	push
-	.set	noat
-#ifndef CONFIG_CPU_DADDI_WORKAROUNDS
-	bnez	src, 1b
-	 SUB	src, src, 1
-#else
-	li	v1, 1
-	bnez	src, 1b
-	 SUB	src, src, v1
-#endif
-	li	v1, -EFAULT
-	b	.Ldone\@
-	 sw	v1, (errptr)
-
-.Ls_exc\@:
-	li	v0, -1 /* invalid checksum */
-	li	v1, -EFAULT
+	.set noreorder
+.L_exc:
 	jr	ra
-	 sw	v1, (errptr)
-	.set	pop
-	.endm
+	 li	v0, 0
 
+FEXPORT(__csum_partial_copy_nocheck)
+EXPORT_SYMBOL(__csum_partial_copy_nocheck)
 #ifndef CONFIG_EVA
 FEXPORT(__csum_partial_copy_to_user)
 EXPORT_SYMBOL(__csum_partial_copy_to_user)
 FEXPORT(__csum_partial_copy_from_user)
 EXPORT_SYMBOL(__csum_partial_copy_from_user)
 #endif
-__BUILD_CSUM_PARTIAL_COPY_USER LEGACY_MODE USEROP USEROP 1
+__BUILD_CSUM_PARTIAL_COPY_USER LEGACY_MODE USEROP USEROP
 
 #ifdef CONFIG_EVA
 LEAF(__csum_partial_copy_to_user)
-__BUILD_CSUM_PARTIAL_COPY_USER EVA_MODE KERNELOP USEROP 0
+__BUILD_CSUM_PARTIAL_COPY_USER EVA_MODE KERNELOP USEROP
 END(__csum_partial_copy_to_user)
 
 LEAF(__csum_partial_copy_from_user)
-__BUILD_CSUM_PARTIAL_COPY_USER EVA_MODE USEROP KERNELOP 0
+__BUILD_CSUM_PARTIAL_COPY_USER EVA_MODE USEROP KERNELOP
 END(__csum_partial_copy_from_user)
 #endif
-- 
2.11.0
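
A note on the convention this patch relies on, as a rough userspace
sketch rather than kernel code.  The diff seeds the sum with -1
("li sum, -1") and makes every exception handler return 0.  With
end-around-carry addition a nonzero accumulator can never become 0,
so 0 is free to mean "fault", while ~0U is congruent to 0 mod 65535
and leaves the checksum unchanged.  The names model_csum_and_copy(),
csum_add32(), NO_FAULT and the fault_at parameter are invented for
this illustration, and the byte placement assumes little-endian
16-bit words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NO_FAULT ((size_t)-1)	/* hypothetical: "no byte faults" */

/* 32-bit ones' complement add with end-around carry (models ADDC). */
uint32_t csum_add32(uint32_t a, uint32_t b)
{
	uint32_t s = a + b;

	return s + (s < a);	/* fold the carry back in */
}

/*
 * Hypothetical model of the new calling convention: copy len bytes
 * and return a nonzero 32-bit sum on success, 0 on a simulated fault.
 * fault_at is the index of the first byte that "faults".
 */
uint32_t model_csum_and_copy(const unsigned char *src, unsigned char *dst,
			     size_t len, size_t fault_at)
{
	uint32_t sum = ~0U;	/* mirrors "li sum, -1" */
	size_t i;

	for (i = 0; i < len; i++) {
		if (i == fault_at)
			return 0;	/* mirrors .L_exc: "li v0, 0" */
		dst[i] = src[i];
		/* even offsets are the low byte of a 16-bit word (LE) */
		sum = csum_add32(sum, (uint32_t)src[i] << ((i & 1) ? 8 : 0));
	}
	return sum;
}

/* Small self-check; aborts via assert() if the model misbehaves. */
void model_selfcheck(void)
{
	const unsigned char src[4] = { 1, 2, 3, 4 };
	unsigned char dst[4] = { 0 };

	assert(model_csum_and_copy(src, dst, 4, NO_FAULT) != 0);
	assert(memcmp(src, dst, 4) == 0);
	assert(model_csum_and_copy(src, dst, 4, 2) == 0);
}
```

A caller in this model only has to test the return value against 0,
which is what lets the wrappers in checksum.h drop their err
variables and the errp argument.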


* [PATCH v2 17/20] xtensa: propagate the calling conventions change down into csum_partial_copy_generic()
  2020-07-24  1:25   ` [PATCH v2 01/20] xtensa: fix access check in csum_and_copy_from_user Al Viro
                       ` (14 preceding siblings ...)
  2020-07-24  1:25     ` [PATCH v2 16/20] mips: propagate the calling convention change down into __csum_partial_copy_..._user() Al Viro
@ 2020-07-24  1:25     ` Al Viro
  2020-07-24  1:25       ` Al Viro
  2020-07-24  1:25     ` [PATCH v2 18/20] sparc64: propagate the calling convention changes down to __csum_partial_copy_...() Al Viro
                       ` (2 subsequent siblings)
  18 siblings, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-24  1:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

Make the exception handlers simply return 0.

Reviewed-by: Max Filippov <jcmvbkbc@gmail.com>
Tested-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/xtensa/include/asm/checksum.h | 20 +++---------
 arch/xtensa/lib/checksum.S         | 67 +++++++++-----------------------------
 2 files changed, 19 insertions(+), 68 deletions(-)

diff --git a/arch/xtensa/include/asm/checksum.h b/arch/xtensa/include/asm/checksum.h
index fe78fba7bd64..44ec1d0b2a35 100644
--- a/arch/xtensa/include/asm/checksum.h
+++ b/arch/xtensa/include/asm/checksum.h
@@ -37,9 +37,7 @@ asmlinkage __wsum csum_partial(const void *buff, int len, __wsum sum);
  * better 64-bit) boundary
  */
 
-asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst,
-					    int len, __wsum sum,
-					    int *src_err_ptr, int *dst_err_ptr);
+asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst, int len);
 
 #define _HAVE_ARCH_CSUM_AND_COPY
 /*
@@ -49,7 +47,7 @@ asmlinkage __wsum csum_partial_copy_generic(const void *src, void *dst,
 static inline
 __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	return csum_partial_copy_generic(src, dst, len, 0, NULL, NULL);
+	return csum_partial_copy_generic(src, dst, len);
 }
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
@@ -57,14 +55,9 @@ static inline
 __wsum csum_and_copy_from_user(const void __user *src, void *dst,
 				   int len)
 {
-	int err = 0;
-
 	if (!access_ok(src, len))
 		return 0;
-
-	sum = csum_partial_copy_generic((__force const void *)src, dst,
-					len, ~0U, &err, NULL);
-	return err ? 0 : sum;
+	return csum_partial_copy_generic((__force const void *)src, dst, len);
 }
 
 /*
@@ -247,13 +240,8 @@ static __inline__ __sum16 csum_ipv6_magic(const struct in6_addr *saddr,
 static __inline__ __wsum csum_and_copy_to_user(const void *src,
 					       void __user *dst, int len)
 {
-	int err = 0;
-	__wsum sum = ~0U;
-
 	if (!access_ok(dst, len))
 		return 0;
-
-	sum = csum_partial_copy_generic(src,dst,len,sum,NULL,&err);
-	return err ? 0 : sum;
+	return csum_partial_copy_generic(src, (__force void *)dst, len);
 }
 #endif
diff --git a/arch/xtensa/lib/checksum.S b/arch/xtensa/lib/checksum.S
index 4cb9ca58d9ad..cf1bed1a5bd6 100644
--- a/arch/xtensa/lib/checksum.S
+++ b/arch/xtensa/lib/checksum.S
@@ -175,19 +175,14 @@ ENDPROC(csum_partial)
  */
 
 /*
-unsigned int csum_partial_copy_generic (const char *src, char *dst, int len,
-					int sum, int *src_err_ptr, int *dst_err_ptr)
+unsigned int csum_partial_copy_generic (const char *src, char *dst, int len)
 	a2  = src
 	a3  = dst
 	a4  = len
 	a5  = sum
-	a6  = src_err_ptr
-	a7  = dst_err_ptr
 	a8  = temp
 	a9  = temp
 	a10 = temp
-	a11 = original len for exception handling
-	a12 = original dst for exception handling
 
     This function is optimized for 4-byte aligned addresses.  Other
     alignments work, but not nearly as efficiently.
@@ -196,8 +191,7 @@ unsigned int csum_partial_copy_generic (const char *src, char *dst, int len,
 ENTRY(csum_partial_copy_generic)
 
 	abi_entry_default
-	mov	a12, a3
-	mov	a11, a4
+	movi	a5, -1
 	or	a10, a2, a3
 
 	/* We optimize the following alignment tests for the 4-byte
@@ -228,26 +222,26 @@ ENTRY(csum_partial_copy_generic)
 #endif
 EX(10f)	l32i	a9, a2, 0
 EX(10f)	l32i	a8, a2, 4
-EX(11f)	s32i	a9, a3, 0
-EX(11f)	s32i	a8, a3, 4
+EX(10f)	s32i	a9, a3, 0
+EX(10f)	s32i	a8, a3, 4
 	ONES_ADD(a5, a9)
 	ONES_ADD(a5, a8)
 EX(10f)	l32i	a9, a2, 8
 EX(10f)	l32i	a8, a2, 12
-EX(11f)	s32i	a9, a3, 8
-EX(11f)	s32i	a8, a3, 12
+EX(10f)	s32i	a9, a3, 8
+EX(10f)	s32i	a8, a3, 12
 	ONES_ADD(a5, a9)
 	ONES_ADD(a5, a8)
 EX(10f)	l32i	a9, a2, 16
 EX(10f)	l32i	a8, a2, 20
-EX(11f)	s32i	a9, a3, 16
-EX(11f)	s32i	a8, a3, 20
+EX(10f)	s32i	a9, a3, 16
+EX(10f)	s32i	a8, a3, 20
 	ONES_ADD(a5, a9)
 	ONES_ADD(a5, a8)
 EX(10f)	l32i	a9, a2, 24
 EX(10f)	l32i	a8, a2, 28
-EX(11f)	s32i	a9, a3, 24
-EX(11f)	s32i	a8, a3, 28
+EX(10f)	s32i	a9, a3, 24
+EX(10f)	s32i	a8, a3, 28
 	ONES_ADD(a5, a9)
 	ONES_ADD(a5, a8)
 	addi	a2, a2, 32
@@ -267,7 +261,7 @@ EX(11f)	s32i	a8, a3, 28
 .Loop6:
 #endif
 EX(10f)	l32i	a9, a2, 0
-EX(11f)	s32i	a9, a3, 0
+EX(10f)	s32i	a9, a3, 0
 	ONES_ADD(a5, a9)
 	addi	a2, a2, 4
 	addi	a3, a3, 4
@@ -298,7 +292,7 @@ EX(11f)	s32i	a9, a3, 0
 .Loop7:
 #endif
 EX(10f)	l16ui	a9, a2, 0
-EX(11f)	s16i	a9, a3, 0
+EX(10f)	s16i	a9, a3, 0
 	ONES_ADD(a5, a9)
 	addi	a2, a2, 2
 	addi	a3, a3, 2
@@ -309,7 +303,7 @@ EX(11f)	s16i	a9, a3, 0
 	/* This section processes a possible trailing odd byte. */
 	_bbci.l	a4, 0, 8f	/* 1-byte chunk */
 EX(10f)	l8ui	a9, a2, 0
-EX(11f)	s8i	a9, a3, 0
+EX(10f)	s8i	a9, a3, 0
 #ifdef __XTENSA_EB__
 	slli	a9, a9, 8	/* shift byte to bits 8..15 */
 #endif
@@ -334,8 +328,8 @@ EX(11f)	s8i	a9, a3, 0
 #endif
 EX(10f)	l8ui	a9, a2, 0
 EX(10f)	l8ui	a8, a2, 1
-EX(11f)	s8i	a9, a3, 0
-EX(11f)	s8i	a8, a3, 1
+EX(10f)	s8i	a9, a3, 0
+EX(10f)	s8i	a8, a3, 1
 #ifdef __XTENSA_EB__
 	slli	a9, a9, 8	/* combine into a single 16-bit value */
 #else				/* for checksum computation */
@@ -356,38 +350,7 @@ ENDPROC(csum_partial_copy_generic)
 
 # Exception handler:
 .section .fixup, "ax"
-/*
-	a6  = src_err_ptr
-	a7  = dst_err_ptr
-	a11 = original len for exception handling
-	a12 = original dst for exception handling
-*/
-
 10:
-	_movi	a2, -EFAULT
-	s32i	a2, a6, 0	/* src_err_ptr */
-
-	# clear the complete destination - computing the rest
-	# is too much work
-	movi	a2, 0
-#if XCHAL_HAVE_LOOPS
-	loopgtz	a11, 2f
-#else
-	beqz	a11, 2f
-	add	a11, a11, a12	/* a11 = ending address */
-.Leloop:
-#endif
-	s8i	a2, a12, 0
-	addi	a12, a12, 1
-#if !XCHAL_HAVE_LOOPS
-	blt	a12, a11, .Leloop
-#endif
-2:
-	abi_ret_default
-
-11:
-	movi	a2, -EFAULT
-	s32i	a2, a7, 0	/* dst_err_ptr */
 	movi	a2, 0
 	abi_ret_default
 
-- 
2.11.0

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v2 18/20] sparc64: propagate the calling convention changes down to __csum_partial_copy_...()
  2020-07-24  1:25   ` [PATCH v2 01/20] xtensa: fix access check in csum_and_copy_from_user Al Viro
                       ` (15 preceding siblings ...)
  2020-07-24  1:25     ` [PATCH v2 17/20] xtensa: propagate the calling conventions change down into csum_partial_copy_generic() Al Viro
@ 2020-07-24  1:25     ` Al Viro
  2020-07-24  1:25       ` Al Viro
  2020-07-24  1:25     ` [PATCH v2 19/20] amd64: switch csum_partial_copy_generic() to new calling conventions Al Viro
  2020-07-24  1:25     ` [PATCH v2 20/20] ppc: propagate the calling conventions change down to csum_partial_copy_generic() Al Viro
  18 siblings, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-24  1:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

... and rename them into csum_and_copy_...() - the wrappers become pointless.
[braino fixed]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
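Not part of the patch: a quick check of the arithmetic behind the initial
sum used throughout this series.  __wsum values only mean something modulo
65535, and an all-ones seed is congruent to 0 there, so it leaves the
checksum alone while keeping the result nonzero.  fold32() mirrors the
folding csum_fold() does before its final complement; sum_words() and
same_mod_65535() are made-up helpers for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit ones' complement sum down to 16 bits. */
static uint16_t fold32(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Hypothetical helper: add n 16-bit words to seed, end-around carry. */
static uint32_t sum_words(const uint16_t *w, size_t n, uint32_t seed)
{
	uint32_t sum = seed;
	size_t i;

	assert(w || !n);
	for (i = 0; i < n; i++) {
		sum += w[i];
		if (sum < w[i])
			sum++;
	}
	return sum;
}

/* 0 and 0xffff both represent zero modulo 65535. */
static int same_mod_65535(uint16_t a, uint16_t b)
{
	return a % 0xffff == b % 0xffff;
}
```

Seeding with 0 and with ~0U gives the same folded value; any other seed
shifts the result by that seed.
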
 arch/sparc/include/asm/checksum.h    |  1 +
 arch/sparc/include/asm/checksum_32.h |  2 --
 arch/sparc/include/asm/checksum_64.h | 41 +++---------------------------------
 arch/sparc/lib/csum_copy.S           |  5 +++--
 arch/sparc/lib/csum_copy_from_user.S |  4 ++--
 arch/sparc/lib/csum_copy_to_user.S   |  4 ++--
 6 files changed, 11 insertions(+), 46 deletions(-)

diff --git a/arch/sparc/include/asm/checksum.h b/arch/sparc/include/asm/checksum.h
index deb4fe5aeafd..f2ac13323b6d 100644
--- a/arch/sparc/include/asm/checksum.h
+++ b/arch/sparc/include/asm/checksum.h
@@ -3,6 +3,7 @@
 #define ___ASM_SPARC_CHECKSUM_H
 #define _HAVE_ARCH_CSUM_AND_COPY
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
+#define HAVE_CSUM_COPY_USER
 #if defined(__sparc__) && defined(__arch64__)
 #include <asm/checksum_64.h>
 #else
diff --git a/arch/sparc/include/asm/checksum_32.h b/arch/sparc/include/asm/checksum_32.h
index d55e480172a6..ce11e0ad80c7 100644
--- a/arch/sparc/include/asm/checksum_32.h
+++ b/arch/sparc/include/asm/checksum_32.h
@@ -67,8 +67,6 @@ csum_and_copy_from_user(const void __user *src, void *dst, int len)
 	return csum_partial_copy_nocheck((__force void *)src, dst, len);
 }
 
-#define HAVE_CSUM_COPY_USER
-
 static inline __wsum
 csum_and_copy_to_user(const void *src, void __user *dst, int len)
 {
diff --git a/arch/sparc/include/asm/checksum_64.h b/arch/sparc/include/asm/checksum_64.h
index 4d0bbff43e62..d6b59461e064 100644
--- a/arch/sparc/include/asm/checksum_64.h
+++ b/arch/sparc/include/asm/checksum_64.h
@@ -38,44 +38,9 @@ __wsum csum_partial(const void * buff, int len, __wsum sum);
  * here even more important to align src and dst on a 32-bit (or even
  * better 64-bit) boundary
  */
-__wsum __csum_partial_copy_nocheck(const void *src, void *dst, int len, __wsum sum);
-
-static inline __wsum csum_partial_copy_nocheck(const void *src, void *dst, int len)
-{
-	return __csum_partial_copy_nocheck(src, dst, len, 0);
-}
-
-long __csum_partial_copy_from_user(const void __user *src,
-				   void *dst, int len,
-				   __wsum sum);
-
-static inline __wsum
-csum_and_copy_from_user(const void __user *src,
-			    void *dst, int len)
-{
-	long ret = __csum_partial_copy_from_user(src, dst, len, ~0U);
-	if (ret < 0)
-		return 0;
-	return (__force __wsum) ret;
-}
-
-/*
- *	Copy and checksum to user
- */
-#define HAVE_CSUM_COPY_USER
-long __csum_partial_copy_to_user(const void *src,
-				 void __user *dst, int len,
-				 __wsum sum);
-
-static inline __wsum
-csum_and_copy_to_user(const void *src,
-		      void __user *dst, int len)
-{
-	long ret = __csum_partial_copy_to_user(src, dst, len, ~0U);
-	if (ret < 0)
-		return 0;
-	return (__force __wsum) ret;
-}
+__wsum csum_partial_copy_nocheck(const void *src, void *dst, int len);
+__wsum csum_and_copy_from_user(const void __user *src, void *dst, int len);
+__wsum csum_and_copy_to_user(const void *src, void __user *dst, int len);
 
 /* ihl is always 5 or greater, almost always is 5, and iph is word aligned
  * the majority of the time.
diff --git a/arch/sparc/lib/csum_copy.S b/arch/sparc/lib/csum_copy.S
index 72c900d21b12..0c0268e77155 100644
--- a/arch/sparc/lib/csum_copy.S
+++ b/arch/sparc/lib/csum_copy.S
@@ -33,7 +33,7 @@
 #endif
 
 #ifndef FUNC_NAME
-#define FUNC_NAME	__csum_partial_copy_nocheck
+#define FUNC_NAME	csum_partial_copy_nocheck
 #endif
 
 	.register	%g2, #scratch
@@ -68,9 +68,10 @@
 	.globl		FUNC_NAME
 	.type		FUNC_NAME,#function
 	EXPORT_SYMBOL(FUNC_NAME)
-FUNC_NAME:		/* %o0=src, %o1=dst, %o2=len, %o3=sum */
+FUNC_NAME:		/* %o0=src, %o1=dst, %o2=len */
 	LOAD(prefetch, %o0 + 0x000, #n_reads)
 	xor		%o0, %o1, %g1
+	mov		-1, %o3
 	clr		%o4
 	andcc		%g1, 0x3, %g0
 	bne,pn		%icc, 95f
diff --git a/arch/sparc/lib/csum_copy_from_user.S b/arch/sparc/lib/csum_copy_from_user.S
index d20b9594f0c7..b0ba8d4dd439 100644
--- a/arch/sparc/lib/csum_copy_from_user.S
+++ b/arch/sparc/lib/csum_copy_from_user.S
@@ -9,14 +9,14 @@
 	.section .fixup, "ax";	\
 	.align 4;		\
 99:	retl;			\
-	 mov	-1, %o0;	\
+	 mov	0, %o0;		\
 	.section __ex_table,"a";\
 	.align 4;		\
 	.word 98b, 99b;		\
 	.text;			\
 	.align 4;
 
-#define FUNC_NAME		__csum_partial_copy_from_user
+#define FUNC_NAME		csum_and_copy_from_user
 #define LOAD(type,addr,dest)	type##a [addr] %asi, dest
 
 #include "csum_copy.S"
diff --git a/arch/sparc/lib/csum_copy_to_user.S b/arch/sparc/lib/csum_copy_to_user.S
index d71c0c81e8ab..91ba36dbf7d2 100644
--- a/arch/sparc/lib/csum_copy_to_user.S
+++ b/arch/sparc/lib/csum_copy_to_user.S
@@ -9,14 +9,14 @@
 	.section .fixup,"ax";	\
 	.align 4;		\
 99:	retl;			\
-	 mov	-1, %o0;	\
+	 mov	0, %o0;		\
 	.section __ex_table,"a";\
 	.align 4;		\
 	.word 98b, 99b;		\
 	.text;			\
 	.align 4;
 
-#define FUNC_NAME		__csum_partial_copy_to_user
+#define FUNC_NAME		csum_and_copy_to_user
 #define STORE(type,src,addr)	type##a src, [addr] %asi
 
 #include "csum_copy.S"
-- 
2.11.0

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v2 19/20] amd64: switch csum_partial_copy_generic() to new calling conventions
  2020-07-24  1:25   ` [PATCH v2 01/20] xtensa: fix access check in csum_and_copy_from_user Al Viro
                       ` (16 preceding siblings ...)
  2020-07-24  1:25     ` [PATCH v2 18/20] sparc64: propagate the calling convention changes down to __csum_partial_copy_...() Al Viro
@ 2020-07-24  1:25     ` Al Viro
  2020-07-24  1:25       ` Al Viro
  2020-07-24  1:25     ` [PATCH v2 20/20] ppc: propagate the calling conventions change down to csum_partial_copy_generic() Al Viro
  18 siblings, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-24  1:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

... and fold handling of misaligned case into it.

Implementation note: we stash the "will we need to rol8 the sum in the end"
flag into the MSB of %rcx (the lower 32 bits are used for length); the rest
is pretty straightforward.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
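Not part of the patch: a userspace model of the two tricks from the
implementation note, for anyone who wants to convince themselves.
dec_and_flag() models "leaq -1(%rcx, %rcx), %rcx; rorq $1, %rcx", and
sum_odd_start() models the odd-destination path (leading byte added shifted
by 8, final rol8).  All names are made up for illustration.  The real code
keeps a 64-bit accumulator and folds it later; this sketch uses 32-bit
end-around-carry adds, so it agrees with it only modulo 65535.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint64_t ror64(uint64_t v, unsigned int n)	/* 0 < n < 64 */
{
	return (v >> n) | (v << (64 - n));
}

static uint32_t rol32(uint32_t v, unsigned int n)	/* 0 < n < 32 */
{
	return (v << n) | (v >> (32 - n));
}

/* 2*len - 1 is odd; rotating right by 1 moves that bit into the MSB. */
static uint64_t dec_and_flag(uint64_t len)
{
	assert(len >= 1 && len < (1ULL << 63));
	return ror64(2 * len - 1, 1);	/* == (len - 1) | 1 << 63 */
}

static uint32_t add32c(uint32_t a, uint32_t b)
{
	a += b;
	return a + (a < b);		/* end-around carry */
}

/* Sum buf as little-endian 16-bit words; a trailing byte is a low byte. */
static uint32_t sum_le16(const uint8_t *buf, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum = add32c(sum, buf[i] | (uint32_t)buf[i + 1] << 8);
	if (len & 1)
		sum = add32c(sum, buf[len - 1]);
	return sum;
}

/* Odd start: take one byte shifted by 8, sum the rest, rotate at the end. */
static uint32_t sum_odd_start(const uint8_t *buf, size_t len)
{
	uint32_t sum = ~0U;

	assert(len >= 1);
	sum = add32c(sum, (uint32_t)buf[0] << 8);	/* shll $8, %ebx */
	sum = sum_le16(buf + 1, len - 1, sum);
	return rol32(sum, 8);				/* .Lwas_odd */
}

static uint16_t fold32(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}
```

The rotate works because rol8 of a 32-bit value multiplies it by 256 modulo
2^32 - 1, and 65535 divides 2^32 - 1; multiplying by 256 swaps the weights of
odd and even bytes back into place.
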
 arch/x86/include/asm/checksum_64.h |   5 +-
 arch/x86/lib/csum-copy_64.S        | 140 ++++++++++++++++++++++---------------
 arch/x86/lib/csum-wrappers_64.c    |  72 +++----------------
 3 files changed, 94 insertions(+), 123 deletions(-)

diff --git a/arch/x86/include/asm/checksum_64.h b/arch/x86/include/asm/checksum_64.h
index 9af3aed54c6b..407beebadaf4 100644
--- a/arch/x86/include/asm/checksum_64.h
+++ b/arch/x86/include/asm/checksum_64.h
@@ -130,10 +130,7 @@ static inline __sum16 csum_tcpudp_magic(__be32 saddr, __be32 daddr,
 extern __wsum csum_partial(const void *buff, int len, __wsum sum);
 
 /* Do not call this directly. Use the wrappers below */
-extern __visible __wsum csum_partial_copy_generic(const void *src, const void *dst,
-					int len, __wsum sum,
-					int *src_err_ptr, int *dst_err_ptr);
-
+extern __visible __wsum csum_partial_copy_generic(const void *src, void *dst, int len);
 
 extern __wsum csum_and_copy_from_user(const void __user *src, void *dst, int len);
 extern __wsum csum_and_copy_to_user(const void *src, void __user *dst, int len);
diff --git a/arch/x86/lib/csum-copy_64.S b/arch/x86/lib/csum-copy_64.S
index 3394a8ff7fd0..1fbd8ee9642d 100644
--- a/arch/x86/lib/csum-copy_64.S
+++ b/arch/x86/lib/csum-copy_64.S
@@ -18,9 +18,6 @@
  * rdi  source
  * rsi  destination
  * edx  len (32bit)
- * ecx  sum (32bit)
- * r8   src_err_ptr (int)
- * r9   dst_err_ptr (int)
  *
  * Output
  * eax  64bit sum. undefined in case of exception.
@@ -31,44 +28,32 @@
 
 	.macro source
 10:
-	_ASM_EXTABLE_UA(10b, .Lbad_source)
+	_ASM_EXTABLE_UA(10b, .Lfault)
 	.endm
 
 	.macro dest
 20:
-	_ASM_EXTABLE_UA(20b, .Lbad_dest)
+	_ASM_EXTABLE_UA(20b, .Lfault)
 	.endm
 
-	/*
-	 * No _ASM_EXTABLE_UA; this is used for intentional prefetch on a
-	 * potentially unmapped kernel address.
-	 */
-	.macro ignore L=.Lignore
-30:
-	_ASM_EXTABLE(30b, \L)
-	.endm
-
-
 SYM_FUNC_START(csum_partial_copy_generic)
-	cmpl	$3*64, %edx
-	jle	.Lignore
-
-.Lignore:
-	subq  $7*8, %rsp
-	movq  %rbx, 2*8(%rsp)
-	movq  %r12, 3*8(%rsp)
-	movq  %r14, 4*8(%rsp)
-	movq  %r13, 5*8(%rsp)
-	movq  %r15, 6*8(%rsp)
+	subq  $5*8, %rsp
+	movq  %rbx, 0*8(%rsp)
+	movq  %r12, 1*8(%rsp)
+	movq  %r14, 2*8(%rsp)
+	movq  %r13, 3*8(%rsp)
+	movq  %r15, 4*8(%rsp)
 
-	movq  %r8, (%rsp)
-	movq  %r9, 1*8(%rsp)
-
-	movl  %ecx, %eax
+	movl  $-1, %eax
+	xorl  %r9d, %r9d
 	movl  %edx, %ecx
+	cmpl  $8, %ecx
+	jb    .Lshort
 
-	xorl  %r9d, %r9d
-	movq  %rcx, %r12
+	testb  $7, %sil
+	jne   .Lunaligned
+.Laligned:
+	movl  %ecx, %r12d
 
 	shrq  $6, %r12
 	jz	.Lhandle_tail       /* < 64 */
@@ -99,7 +84,12 @@ SYM_FUNC_START(csum_partial_copy_generic)
 	source
 	movq  56(%rdi), %r13
 
-	ignore 2f
+30:
+	/*
+	 * No _ASM_EXTABLE_UA; this is used for intentional prefetch on a
+	 * potentially unmapped kernel address.
+	 */
+	_ASM_EXTABLE(30b, 2f)
 	prefetcht0 5*64(%rdi)
 2:
 	adcq  %rbx, %rax
@@ -131,8 +121,6 @@ SYM_FUNC_START(csum_partial_copy_generic)
 	dest
 	movq %r13, 56(%rsi)
 
-3:
-
 	leaq 64(%rdi), %rdi
 	leaq 64(%rsi), %rsi
 
@@ -142,8 +130,8 @@ SYM_FUNC_START(csum_partial_copy_generic)
 
 	/* do last up to 56 bytes */
 .Lhandle_tail:
-	/* ecx:	count */
-	movl %ecx, %r10d
+	/* ecx:	count, rcx.63: the end result needs to be rol8 */
+	movq %rcx, %r10
 	andl $63, %ecx
 	shrl $3, %ecx
 	jz	.Lfold
@@ -172,6 +160,7 @@ SYM_FUNC_START(csum_partial_copy_generic)
 .Lhandle_7:
 	movl %r10d, %ecx
 	andl $7, %ecx
+.L1:				/* .Lshort rejoins the common path here */
 	shrl $1, %ecx
 	jz   .Lhandle_1
 	movl $2, %edx
@@ -203,26 +192,65 @@ SYM_FUNC_START(csum_partial_copy_generic)
 	adcl %r9d, %eax		/* carry */
 
 .Lende:
-	movq 2*8(%rsp), %rbx
-	movq 3*8(%rsp), %r12
-	movq 4*8(%rsp), %r14
-	movq 5*8(%rsp), %r13
-	movq 6*8(%rsp), %r15
-	addq $7*8, %rsp
+	testq %r10, %r10
+	js  .Lwas_odd
+.Lout:
+	movq 0*8(%rsp), %rbx
+	movq 1*8(%rsp), %r12
+	movq 2*8(%rsp), %r14
+	movq 3*8(%rsp), %r13
+	movq 4*8(%rsp), %r15
+	addq $5*8, %rsp
 	ret
+.Lshort:
+	movl %ecx, %r10d
+	jmp  .L1
+.Lunaligned:
+	xorl %ebx, %ebx
+	testb $1, %sil
+	jne  .Lodd
+1:	testb $2, %sil
+	je   2f
+	source
+	movw (%rdi), %bx
+	dest
+	movw %bx, (%rsi)
+	leaq 2(%rdi), %rdi
+	subq $2, %rcx
+	leaq 2(%rsi), %rsi
+	addq %rbx, %rax
+2:	testb $4, %sil
+	je .Laligned
+	source
+	movl (%rdi), %ebx
+	dest
+	movl %ebx, (%rsi)
+	leaq 4(%rdi), %rdi
+	subq $4, %rcx
+	leaq 4(%rsi), %rsi
+	addq %rbx, %rax
+	jmp .Laligned
+
+.Lodd:
+	source
+	movb (%rdi), %bl
+	dest
+	movb %bl, (%rsi)
+	leaq 1(%rdi), %rdi
+	leaq 1(%rsi), %rsi
+	/* decrement, set MSB */
+	leaq -1(%rcx, %rcx), %rcx
+	rorq $1, %rcx
+	shll $8, %ebx
+	addq %rbx, %rax
+	jmp 1b
+
+.Lwas_odd:
+	roll $8, %eax
+	jmp .Lout
 
-	/* Exception handlers. Very simple, zeroing is done in the wrappers */
-.Lbad_source:
-	movq (%rsp), %rax
-	testq %rax, %rax
-	jz   .Lende
-	movl $-EFAULT, (%rax)
-	jmp  .Lende
-
-.Lbad_dest:
-	movq 8(%rsp), %rax
-	testq %rax, %rax
-	jz   .Lende
-	movl $-EFAULT, (%rax)
-	jmp .Lende
+	/* Exception: just return 0 */
+.Lfault:
+	xorl %eax, %eax
+	jmp  .Lout
 SYM_FUNC_END(csum_partial_copy_generic)
diff --git a/arch/x86/lib/csum-wrappers_64.c b/arch/x86/lib/csum-wrappers_64.c
index ae2fb87e2274..189344924a2b 100644
--- a/arch/x86/lib/csum-wrappers_64.c
+++ b/arch/x86/lib/csum-wrappers_64.c
@@ -21,49 +21,16 @@
  * src and dst are best aligned to 64bits.
  */
 __wsum
-csum_and_copy_from_user(const void __user *src, void *dst,
-			    int len)
+csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
-	int err = 0;
-	__wsum isum = ~0U;
+	__wsum sum;
 
 	might_sleep();
-
 	if (!user_access_begin(src, len))
 		return 0;
-
-	/*
-	 * Why 6, not 7? To handle odd addresses aligned we
-	 * would need to do considerable complications to fix the
-	 * checksum which is defined as an 16bit accumulator. The
-	 * fix alignment code is primarily for performance
-	 * compatibility with 32bit and that will handle odd
-	 * addresses slowly too.
-	 */
-	if (unlikely((unsigned long)src & 6)) {
-		while (((unsigned long)src & 6) && len >= 2) {
-			__u16 val16;
-
-			unsafe_get_user(val16, (const __u16 __user *)src, out);
-
-			*(__u16 *)dst = val16;
-			isum = (__force __wsum)add32_with_carry(
-					(__force unsigned)isum, val16);
-			src += 2;
-			dst += 2;
-			len -= 2;
-		}
-	}
-	isum = csum_partial_copy_generic((__force const void *)src,
-				dst, len, isum, &err, NULL);
-	user_access_end();
-	if (unlikely(err))
-		isum = 0;
-	return isum;
-
-out:
+	sum = csum_partial_copy_generic((__force const void *)src, dst, len);
 	user_access_end();
-	return 0;
+	return sum;
 }
 EXPORT_SYMBOL(csum_and_copy_from_user);
 
@@ -79,37 +46,16 @@ EXPORT_SYMBOL(csum_and_copy_from_user);
  * src and dst are best aligned to 64bits.
  */
 __wsum
-csum_and_copy_to_user(const void *src, void __user *dst,
-			  int len)
+csum_and_copy_to_user(const void *src, void __user *dst, int len)
 {
-	__wsum ret, isum = ~0U;
-	int err = 0;
+	__wsum sum;
 
 	might_sleep();
-
 	if (!user_access_begin(dst, len))
 		return 0;
-
-	if (unlikely((unsigned long)dst & 6)) {
-		while (((unsigned long)dst & 6) && len >= 2) {
-			__u16 val16 = *(__u16 *)src;
-
-			isum = (__force __wsum)add32_with_carry(
-					(__force unsigned)isum, val16);
-			unsafe_put_user(val16, (__u16 __user *)dst, out);
-			src += 2;
-			dst += 2;
-			len -= 2;
-		}
-	}
-
-	ret = csum_partial_copy_generic(src, (void __force *)dst,
-					len, isum, NULL, &err);
-	user_access_end();
-	return err ? 0 : ret;
-out:
+	sum = csum_partial_copy_generic(src, (void __force *)dst, len);
 	user_access_end();
-	return 0;
+	return sum;
 }
 EXPORT_SYMBOL(csum_and_copy_to_user);
 
@@ -125,7 +71,7 @@ EXPORT_SYMBOL(csum_and_copy_to_user);
 __wsum
 csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	return csum_partial_copy_generic(src, dst, len, 0, NULL, NULL);
+	return csum_partial_copy_generic(src, dst, len);
 }
 EXPORT_SYMBOL(csum_partial_copy_nocheck);
 
-- 
2.11.0

^ permalink raw reply	[flat|nested] 102+ messages in thread

* [PATCH v2 19/20] amd64: switch csum_partial_copy_generic() to new calling conventions
  2020-07-24  1:25     ` [PATCH v2 19/20] amd64: switch csum_partial_copy_generic() to new calling conventions Al Viro
@ 2020-07-24  1:25       ` Al Viro
  0 siblings, 0 replies; 102+ messages in thread
From: Al Viro @ 2020-07-24  1:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

... and fold handling of misaligned case into it.

Implementation note: we stash the "will we need to rol8 the sum in the end"
flag into the MSB of %rcx (the lower 32 bits are used for length); the rest
is pretty straightforward.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/x86/include/asm/checksum_64.h |   5 +-
 arch/x86/lib/csum-copy_64.S        | 140 ++++++++++++++++++++++---------------
 arch/x86/lib/csum-wrappers_64.c    |  72 +++----------------
 3 files changed, 94 insertions(+), 123 deletions(-)

diff --git a/arch/x86/include/asm/checksum_64.h b/arch/x86/include/asm/checksum_64.h
index 9af3aed54c6b..407beebadaf4 100644
--- a/arch/x86/include/asm/checksum_64.h
+++ b/arch/x86/include/asm/checksum_64.h
@@ -130,10 +130,7 @@ static inline __sum16 csum_tcpudp_magic(__be32 saddr, __be32 daddr,
 extern __wsum csum_partial(const void *buff, int len, __wsum sum);
 
 /* Do not call this directly. Use the wrappers below */
-extern __visible __wsum csum_partial_copy_generic(const void *src, const void *dst,
-					int len, __wsum sum,
-					int *src_err_ptr, int *dst_err_ptr);
-
+extern __visible __wsum csum_partial_copy_generic(const void *src, void *dst, int len);
 
 extern __wsum csum_and_copy_from_user(const void __user *src, void *dst, int len);
 extern __wsum csum_and_copy_to_user(const void *src, void __user *dst, int len);
diff --git a/arch/x86/lib/csum-copy_64.S b/arch/x86/lib/csum-copy_64.S
index 3394a8ff7fd0..1fbd8ee9642d 100644
--- a/arch/x86/lib/csum-copy_64.S
+++ b/arch/x86/lib/csum-copy_64.S
@@ -18,9 +18,6 @@
  * rdi  source
  * rsi  destination
  * edx  len (32bit)
- * ecx  sum (32bit)
- * r8   src_err_ptr (int)
- * r9   dst_err_ptr (int)
  *
  * Output
  * eax  64bit sum. undefined in case of exception.
@@ -31,44 +28,32 @@
 
 	.macro source
 10:
-	_ASM_EXTABLE_UA(10b, .Lbad_source)
+	_ASM_EXTABLE_UA(10b, .Lfault)
 	.endm
 
 	.macro dest
 20:
-	_ASM_EXTABLE_UA(20b, .Lbad_dest)
+	_ASM_EXTABLE_UA(20b, .Lfault)
 	.endm
 
-	/*
-	 * No _ASM_EXTABLE_UA; this is used for intentional prefetch on a
-	 * potentially unmapped kernel address.
-	 */
-	.macro ignore L=.Lignore
-30:
-	_ASM_EXTABLE(30b, \L)
-	.endm
-
-
 SYM_FUNC_START(csum_partial_copy_generic)
-	cmpl	$3*64, %edx
-	jle	.Lignore
-
-.Lignore:
-	subq  $7*8, %rsp
-	movq  %rbx, 2*8(%rsp)
-	movq  %r12, 3*8(%rsp)
-	movq  %r14, 4*8(%rsp)
-	movq  %r13, 5*8(%rsp)
-	movq  %r15, 6*8(%rsp)
+	subq  $5*8, %rsp
+	movq  %rbx, 0*8(%rsp)
+	movq  %r12, 1*8(%rsp)
+	movq  %r14, 2*8(%rsp)
+	movq  %r13, 3*8(%rsp)
+	movq  %r15, 4*8(%rsp)
 
-	movq  %r8, (%rsp)
-	movq  %r9, 1*8(%rsp)
-
-	movl  %ecx, %eax
+	movl  $-1, %eax
+	xorl  %r9d, %r9d
 	movl  %edx, %ecx
+	cmpl  $8, %ecx
+	jb    .Lshort
 
-	xorl  %r9d, %r9d
-	movq  %rcx, %r12
+	testb  $7, %sil
+	jne   .Lunaligned
+.Laligned:
+	movl  %ecx, %r12d
 
 	shrq  $6, %r12
 	jz	.Lhandle_tail       /* < 64 */
@@ -99,7 +84,12 @@ SYM_FUNC_START(csum_partial_copy_generic)
 	source
 	movq  56(%rdi), %r13
 
-	ignore 2f
+30:
+	/*
+	 * No _ASM_EXTABLE_UA; this is used for intentional prefetch on a
+	 * potentially unmapped kernel address.
+	 */
+	_ASM_EXTABLE(30b, 2f)
 	prefetcht0 5*64(%rdi)
 2:
 	adcq  %rbx, %rax
@@ -131,8 +121,6 @@ SYM_FUNC_START(csum_partial_copy_generic)
 	dest
 	movq %r13, 56(%rsi)
 
-3:
-
 	leaq 64(%rdi), %rdi
 	leaq 64(%rsi), %rsi
 
@@ -142,8 +130,8 @@ SYM_FUNC_START(csum_partial_copy_generic)
 
 	/* do last up to 56 bytes */
 .Lhandle_tail:
-	/* ecx:	count */
-	movl %ecx, %r10d
+	/* ecx:	count, rcx.63: the end result needs to be rol8 */
+	movq %rcx, %r10
 	andl $63, %ecx
 	shrl $3, %ecx
 	jz	.Lfold
@@ -172,6 +160,7 @@ SYM_FUNC_START(csum_partial_copy_generic)
 .Lhandle_7:
 	movl %r10d, %ecx
 	andl $7, %ecx
+.L1:				/* .Lshort rejoins the common path here */
 	shrl $1, %ecx
 	jz   .Lhandle_1
 	movl $2, %edx
@@ -203,26 +192,65 @@ SYM_FUNC_START(csum_partial_copy_generic)
 	adcl %r9d, %eax		/* carry */
 
 .Lende:
-	movq 2*8(%rsp), %rbx
-	movq 3*8(%rsp), %r12
-	movq 4*8(%rsp), %r14
-	movq 5*8(%rsp), %r13
-	movq 6*8(%rsp), %r15
-	addq $7*8, %rsp
+	testq %r10, %r10
+	js  .Lwas_odd
+.Lout:
+	movq 0*8(%rsp), %rbx
+	movq 1*8(%rsp), %r12
+	movq 2*8(%rsp), %r14
+	movq 3*8(%rsp), %r13
+	movq 4*8(%rsp), %r15
+	addq $5*8, %rsp
 	ret
+.Lshort:
+	movl %ecx, %r10d
+	jmp  .L1
+.Lunaligned:
+	xorl %ebx, %ebx
+	testb $1, %sil
+	jne  .Lodd
+1:	testb $2, %sil
+	je   2f
+	source
+	movw (%rdi), %bx
+	dest
+	movw %bx, (%rsi)
+	leaq 2(%rdi), %rdi
+	subq $2, %rcx
+	leaq 2(%rsi), %rsi
+	addq %rbx, %rax
+2:	testb $4, %sil
+	je .Laligned
+	source
+	movl (%rdi), %ebx
+	dest
+	movl %ebx, (%rsi)
+	leaq 4(%rdi), %rdi
+	subq $4, %rcx
+	leaq 4(%rsi), %rsi
+	addq %rbx, %rax
+	jmp .Laligned
+
+.Lodd:
+	source
+	movb (%rdi), %bl
+	dest
+	movb %bl, (%rsi)
+	leaq 1(%rdi), %rdi
+	leaq 1(%rsi), %rsi
+	/* decrement, set MSB */
+	leaq -1(%rcx, %rcx), %rcx
+	rorq $1, %rcx
+	shll $8, %ebx
+	addq %rbx, %rax
+	jmp 1b
+
+.Lwas_odd:
+	roll $8, %eax
+	jmp .Lout
 
-	/* Exception handlers. Very simple, zeroing is done in the wrappers */
-.Lbad_source:
-	movq (%rsp), %rax
-	testq %rax, %rax
-	jz   .Lende
-	movl $-EFAULT, (%rax)
-	jmp  .Lende
-
-.Lbad_dest:
-	movq 8(%rsp), %rax
-	testq %rax, %rax
-	jz   .Lende
-	movl $-EFAULT, (%rax)
-	jmp .Lende
+	/* Exception: just return 0 */
+.Lfault:
+	xorl %eax, %eax
+	jmp  .Lout
 SYM_FUNC_END(csum_partial_copy_generic)
diff --git a/arch/x86/lib/csum-wrappers_64.c b/arch/x86/lib/csum-wrappers_64.c
index ae2fb87e2274..189344924a2b 100644
--- a/arch/x86/lib/csum-wrappers_64.c
+++ b/arch/x86/lib/csum-wrappers_64.c
@@ -21,49 +21,16 @@
  * src and dst are best aligned to 64bits.
  */
 __wsum
-csum_and_copy_from_user(const void __user *src, void *dst,
-			    int len)
+csum_and_copy_from_user(const void __user *src, void *dst, int len)
 {
-	int err = 0;
-	__wsum isum = ~0U;
+	__wsum sum;
 
 	might_sleep();
-
 	if (!user_access_begin(src, len))
 		return 0;
-
-	/*
-	 * Why 6, not 7? To handle odd addresses aligned we
-	 * would need to do considerable complications to fix the
-	 * checksum which is defined as an 16bit accumulator. The
-	 * fix alignment code is primarily for performance
-	 * compatibility with 32bit and that will handle odd
-	 * addresses slowly too.
-	 */
-	if (unlikely((unsigned long)src & 6)) {
-		while (((unsigned long)src & 6) && len >= 2) {
-			__u16 val16;
-
-			unsafe_get_user(val16, (const __u16 __user *)src, out);
-
-			*(__u16 *)dst = val16;
-			isum = (__force __wsum)add32_with_carry(
-					(__force unsigned)isum, val16);
-			src += 2;
-			dst += 2;
-			len -= 2;
-		}
-	}
-	isum = csum_partial_copy_generic((__force const void *)src,
-				dst, len, isum, &err, NULL);
-	user_access_end();
-	if (unlikely(err))
-		isum = 0;
-	return isum;
-
-out:
+	sum = csum_partial_copy_generic((__force const void *)src, dst, len);
 	user_access_end();
-	return 0;
+	return sum;
 }
 EXPORT_SYMBOL(csum_and_copy_from_user);
 
@@ -79,37 +46,16 @@ EXPORT_SYMBOL(csum_and_copy_from_user);
  * src and dst are best aligned to 64bits.
  */
 __wsum
-csum_and_copy_to_user(const void *src, void __user *dst,
-			  int len)
+csum_and_copy_to_user(const void *src, void __user *dst, int len)
 {
-	__wsum ret, isum = ~0U;
-	int err = 0;
+	__wsum sum;
 
 	might_sleep();
-
 	if (!user_access_begin(dst, len))
 		return 0;
-
-	if (unlikely((unsigned long)dst & 6)) {
-		while (((unsigned long)dst & 6) && len >= 2) {
-			__u16 val16 = *(__u16 *)src;
-
-			isum = (__force __wsum)add32_with_carry(
-					(__force unsigned)isum, val16);
-			unsafe_put_user(val16, (__u16 __user *)dst, out);
-			src += 2;
-			dst += 2;
-			len -= 2;
-		}
-	}
-
-	ret = csum_partial_copy_generic(src, (void __force *)dst,
-					len, isum, NULL, &err);
-	user_access_end();
-	return err ? 0 : ret;
-out:
+	sum = csum_partial_copy_generic(src, (void __force *)dst, len);
 	user_access_end();
-	return 0;
+	return sum;
 }
 EXPORT_SYMBOL(csum_and_copy_to_user);
 
@@ -125,7 +71,7 @@ EXPORT_SYMBOL(csum_and_copy_to_user);
 __wsum
 csum_partial_copy_nocheck(const void *src, void *dst, int len)
 {
-	return csum_partial_copy_generic(src, dst, len, 0, NULL, NULL);
+	return csum_partial_copy_generic(src, dst, len);
 }
 EXPORT_SYMBOL(csum_partial_copy_nocheck);
 
-- 
2.11.0

* [PATCH v2 20/20] ppc: propagate the calling conventions change down to csum_partial_copy_generic()
  2020-07-24  1:25   ` [PATCH v2 01/20] xtensa: fix access check in csum_and_copy_from_user Al Viro
                       ` (17 preceding siblings ...)
  2020-07-24  1:25     ` [PATCH v2 19/20] amd64: switch csum_partial_copy_generic() to new calling conventions Al Viro
@ 2020-07-24  1:25     ` Al Viro
  2020-07-24  1:25       ` Al Viro
  2020-10-14 22:26       ` Jason A. Donenfeld
  18 siblings, 2 replies; 102+ messages in thread
From: Al Viro @ 2020-07-24  1:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-arch

From: Al Viro <viro@zeniv.linux.org.uk>

... and get rid of the pointless fallback in the wrappers.  On error it used
to zero the unwritten area and calculate the csum of the entire thing.  Not
wanting to do it in assembler part had been very reasonable; doing that in
the first place, OTOH...  In case of an error the caller discards the data
we'd copied, along with whatever checksum it might've had.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
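For anyone reading this patch without the rest of the series: going by the
comments changed below, the sum is now seeded with 0xffffffff and a fault is
reported by returning 0.  A small userspace model of that convention follows.
csum_copy_model() and its fault_at parameter are invented here to simulate a
faulting access; they are not kernel interfaces, and the byte-at-a-time loop
is only a stand-in for the real assembler.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit add with end-around carry. */
static uint32_t add32_carry(uint32_t a, uint32_t b)
{
	uint64_t s = (uint64_t)a + b;

	return (uint32_t)s + (uint32_t)(s >> 32);
}

/*
 * Copy len bytes from src to dst and checksum them, starting from
 * 0xffffffff.  A fault_at value below len simulates an access
 * exception at that offset; in that case the function returns 0,
 * like the "fault" label in the patch.
 */
static uint32_t csum_copy_model(const uint8_t *src, uint8_t *dst,
				size_t len, size_t fault_at)
{
	uint32_t sum = 0xffffffff;
	size_t i;

	for (i = 0; i < len; i++) {
		if (i == fault_at)
			return 0;
		dst[i] = src[i];
		sum = add32_carry(sum, (uint32_t)src[i] << ((i & 1) ? 8 : 0));
	}
	/* Nonzero seed plus end-around carry can never produce 0. */
	return sum;
}
```

Since a successful run cannot return 0, a caller can tell failure from
success by testing the return value alone, which is why the err pointers and
the fallback paths in the wrappers are no longer needed.
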
 arch/powerpc/include/asm/checksum.h  |  6 +--
 arch/powerpc/lib/checksum_32.S       | 74 +++++++++++++-----------------------
 arch/powerpc/lib/checksum_64.S       | 37 ++++++------------
 arch/powerpc/lib/checksum_wrappers.c | 32 +++-------------
 4 files changed, 46 insertions(+), 103 deletions(-)

diff --git a/arch/powerpc/include/asm/checksum.h b/arch/powerpc/include/asm/checksum.h
index dba685d984c0..82f099ba2411 100644
--- a/arch/powerpc/include/asm/checksum.h
+++ b/arch/powerpc/include/asm/checksum.h
@@ -18,9 +18,7 @@
  * Like csum_partial, this must be called with even lengths,
  * except for the last fragment.
  */
-extern __wsum csum_partial_copy_generic(const void *src, void *dst,
-					      int len, __wsum sum,
-					      int *src_err, int *dst_err);
+extern __wsum csum_partial_copy_generic(const void *src, void *dst, int len);
 
 #define _HAVE_ARCH_COPY_AND_CSUM_FROM_USER
 extern __wsum csum_and_copy_from_user(const void __user *src, void *dst,
@@ -31,7 +29,7 @@ extern __wsum csum_and_copy_to_user(const void *src, void __user *dst,
 
 #define _HAVE_ARCH_CSUM_AND_COPY
 #define csum_partial_copy_nocheck(src, dst, len)   \
-        csum_partial_copy_generic((src), (dst), (len), 0, NULL, NULL)
+        csum_partial_copy_generic((src), (dst), (len))
 
 
 /*
diff --git a/arch/powerpc/lib/checksum_32.S b/arch/powerpc/lib/checksum_32.S
index ecd150dc3ed9..ec5cd2dede35 100644
--- a/arch/powerpc/lib/checksum_32.S
+++ b/arch/powerpc/lib/checksum_32.S
@@ -78,12 +78,10 @@ EXPORT_SYMBOL(__csum_partial)
 
 /*
  * Computes the checksum of a memory block at src, length len,
- * and adds in "sum" (32-bit), while copying the block to dst.
- * If an access exception occurs on src or dst, it stores -EFAULT
- * to *src_err or *dst_err respectively, and (for an error on
- * src) zeroes the rest of dst.
+ * and adds in 0xffffffff, while copying the block to dst.
+ * If an access exception occurs it returns zero.
  *
- * csum_partial_copy_generic(src, dst, len, sum, src_err, dst_err)
+ * csum_partial_copy_generic(src, dst, len)
  */
 #define CSUM_COPY_16_BYTES_WITHEX(n)	\
 8 ## n ## 0:			\
@@ -108,14 +106,14 @@ EXPORT_SYMBOL(__csum_partial)
 	adde	r12,r12,r10
 
 #define CSUM_COPY_16_BYTES_EXCODE(n)		\
-	EX_TABLE(8 ## n ## 0b, src_error);	\
-	EX_TABLE(8 ## n ## 1b, src_error);	\
-	EX_TABLE(8 ## n ## 2b, src_error);	\
-	EX_TABLE(8 ## n ## 3b, src_error);	\
-	EX_TABLE(8 ## n ## 4b, dst_error);	\
-	EX_TABLE(8 ## n ## 5b, dst_error);	\
-	EX_TABLE(8 ## n ## 6b, dst_error);	\
-	EX_TABLE(8 ## n ## 7b, dst_error);
+	EX_TABLE(8 ## n ## 0b, fault);	\
+	EX_TABLE(8 ## n ## 1b, fault);	\
+	EX_TABLE(8 ## n ## 2b, fault);	\
+	EX_TABLE(8 ## n ## 3b, fault);	\
+	EX_TABLE(8 ## n ## 4b, fault);	\
+	EX_TABLE(8 ## n ## 5b, fault);	\
+	EX_TABLE(8 ## n ## 6b, fault);	\
+	EX_TABLE(8 ## n ## 7b, fault);
 
 	.text
 	.stabs	"arch/powerpc/lib/",N_SO,0,0,0f
@@ -127,11 +125,8 @@ LG_CACHELINE_BYTES = L1_CACHE_SHIFT
 CACHELINE_MASK = (L1_CACHE_BYTES-1)
 
 _GLOBAL(csum_partial_copy_generic)
-	stwu	r1,-16(r1)
-	stw	r7,12(r1)
-	stw	r8,8(r1)
-
-	addic	r12,r6,0
+	li	r12,-1
+	addic	r0,r0,0			/* clear carry */
 	addi	r6,r4,-4
 	neg	r0,r4
 	addi	r4,r3,-4
@@ -246,34 +241,19 @@ _GLOBAL(csum_partial_copy_generic)
 	rlwinm	r3,r3,8,0,31	/* odd destination address: rotate one byte */
 	blr
 
-/* read fault */
-src_error:
-	lwz	r7,12(r1)
-	addi	r1,r1,16
-	cmpwi	cr0,r7,0
-	beqlr
-	li	r0,-EFAULT
-	stw	r0,0(r7)
-	blr
-/* write fault */
-dst_error:
-	lwz	r8,8(r1)
-	addi	r1,r1,16
-	cmpwi	cr0,r8,0
-	beqlr
-	li	r0,-EFAULT
-	stw	r0,0(r8)
+fault:
+	li	r3,0
 	blr
 
-	EX_TABLE(70b, src_error);
-	EX_TABLE(71b, dst_error);
-	EX_TABLE(72b, src_error);
-	EX_TABLE(73b, dst_error);
-	EX_TABLE(54b, dst_error);
+	EX_TABLE(70b, fault);
+	EX_TABLE(71b, fault);
+	EX_TABLE(72b, fault);
+	EX_TABLE(73b, fault);
+	EX_TABLE(54b, fault);
 
 /*
  * this stuff handles faults in the cacheline loop and branches to either
- * src_error (if in read part) or dst_error (if in write part)
+ * fault (if in read part) or fault (if in write part)
  */
 	CSUM_COPY_16_BYTES_EXCODE(0)
 #if L1_CACHE_BYTES >= 32
@@ -290,12 +270,12 @@ dst_error:
 #endif
 #endif
 
-	EX_TABLE(30b, src_error);
-	EX_TABLE(31b, dst_error);
-	EX_TABLE(40b, src_error);
-	EX_TABLE(41b, dst_error);
-	EX_TABLE(50b, src_error);
-	EX_TABLE(51b, dst_error);
+	EX_TABLE(30b, fault);
+	EX_TABLE(31b, fault);
+	EX_TABLE(40b, fault);
+	EX_TABLE(41b, fault);
+	EX_TABLE(50b, fault);
+	EX_TABLE(51b, fault);
 
 EXPORT_SYMBOL(csum_partial_copy_generic)
 
diff --git a/arch/powerpc/lib/checksum_64.S b/arch/powerpc/lib/checksum_64.S
index 514978f908d4..98ff51bd2f7d 100644
--- a/arch/powerpc/lib/checksum_64.S
+++ b/arch/powerpc/lib/checksum_64.S
@@ -182,34 +182,33 @@ EXPORT_SYMBOL(__csum_partial)
 
 	.macro srcnr
 100:
-	EX_TABLE(100b,.Lsrc_error_nr)
+	EX_TABLE(100b,.Lerror_nr)
 	.endm
 
 	.macro source
 150:
-	EX_TABLE(150b,.Lsrc_error)
+	EX_TABLE(150b,.Lerror)
 	.endm
 
 	.macro dstnr
 200:
-	EX_TABLE(200b,.Ldest_error_nr)
+	EX_TABLE(200b,.Lerror_nr)
 	.endm
 
 	.macro dest
 250:
-	EX_TABLE(250b,.Ldest_error)
+	EX_TABLE(250b,.Lerror)
 	.endm
 
 /*
  * Computes the checksum of a memory block at src, length len,
- * and adds in "sum" (32-bit), while copying the block to dst.
- * If an access exception occurs on src or dst, it stores -EFAULT
- * to *src_err or *dst_err respectively. The caller must take any action
- * required in this case (zeroing memory, recalculating partial checksum etc).
+ * and adds in 0xffffffff (32-bit), while copying the block to dst.
+ * If an access exception occurs, it returns 0.
  *
- * csum_partial_copy_generic(r3=src, r4=dst, r5=len, r6=sum, r7=src_err, r8=dst_err)
+ * csum_partial_copy_generic(r3=src, r4=dst, r5=len)
  */
 _GLOBAL(csum_partial_copy_generic)
+	li	r6,-1
 	addic	r0,r6,0			/* clear carry */
 
 	srdi.	r6,r5,3			/* less than 8 bytes? */
@@ -401,29 +400,15 @@ dstnr;	stb	r6,0(r4)
 	srdi	r3,r3,32
 	blr
 
-.Lsrc_error:
+.Lerror:
 	ld	r14,STK_REG(R14)(r1)
 	ld	r15,STK_REG(R15)(r1)
 	ld	r16,STK_REG(R16)(r1)
 	addi	r1,r1,STACKFRAMESIZE
-.Lsrc_error_nr:
-	cmpdi	0,r7,0
-	beqlr
-	li	r6,-EFAULT
-	stw	r6,0(r7)
+.Lerror_nr:
+	li	r3,0
 	blr
 
-.Ldest_error:
-	ld	r14,STK_REG(R14)(r1)
-	ld	r15,STK_REG(R15)(r1)
-	ld	r16,STK_REG(R16)(r1)
-	addi	r1,r1,STACKFRAMESIZE
-.Ldest_error_nr:
-	cmpdi	0,r8,0
-	beqlr
-	li	r6,-EFAULT
-	stw	r6,0(r8)
-	blr
 EXPORT_SYMBOL(csum_partial_copy_generic)
 
 /*
diff --git a/arch/powerpc/lib/checksum_wrappers.c b/arch/powerpc/lib/checksum_wrappers.c
index b1faa82dd8af..b895166afc82 100644
--- a/arch/powerpc/lib/checksum_wrappers.c
+++ b/arch/powerpc/lib/checksum_wrappers.c
@@ -14,8 +14,7 @@
 __wsum csum_and_copy_from_user(const void __user *src, void *dst,
 			       int len)
 {
-	unsigned int csum;
-	int err = 0;
+	__wsum csum;
 
 	might_sleep();
 
@@ -24,27 +23,16 @@ __wsum csum_and_copy_from_user(const void __user *src, void *dst,
 
 	allow_read_from_user(src, len);
 
-	csum = csum_partial_copy_generic((void __force *)src, dst,
-					 len, ~0U, &err, NULL);
-
-	if (unlikely(err)) {
-		int missing = __copy_from_user(dst, src, len);
-
-		if (missing)
-			csum = 0;
-		else
-			csum = csum_partial(dst, len, ~0U);
-	}
+	csum = csum_partial_copy_generic((void __force *)src, dst, len);
 
 	prevent_read_from_user(src, len);
-	return (__force __wsum)csum;
+	return csum;
 }
 EXPORT_SYMBOL(csum_and_copy_from_user);
 
 __wsum csum_and_copy_to_user(const void *src, void __user *dst, int len)
 {
-	unsigned int csum;
-	int err = 0;
+	__wsum csum;
 
 	might_sleep();
 	if (unlikely(!access_ok(dst, len)))
@@ -52,17 +40,9 @@ __wsum csum_and_copy_to_user(const void *src, void __user *dst, int len)
 
 	allow_write_to_user(dst, len);
 
-	csum = csum_partial_copy_generic(src, (void __force *)dst,
-					 len, ~0U, NULL, &err);
-
-	if (unlikely(err)) {
-		csum = csum_partial(src, len, ~0U);
-
-		if (copy_to_user(dst, src, len))
-			csum = 0;
-	}
+	csum = csum_partial_copy_generic(src, (void __force *)dst, len);
 
 	prevent_write_to_user(dst, len);
-	return (__force __wsum)csum;
+	return csum;
 }
 EXPORT_SYMBOL(csum_and_copy_to_user);
-- 
2.11.0

* Re: [PATCH v2 04/20] unify generic instances of csum_partial_copy_nocheck()
  2020-07-24  1:25     ` [PATCH v2 04/20] unify generic instances of csum_partial_copy_nocheck() Al Viro
@ 2020-07-24  6:41       ` Christoph Hellwig
  2020-07-24 12:19         ` Al Viro
  0 siblings, 1 reply; 102+ messages in thread
From: Christoph Hellwig @ 2020-07-24  6:41 UTC (permalink / raw)
  To: Al Viro; +Cc: Linus Torvalds, linux-kernel, linux-arch

On Fri, Jul 24, 2020 at 02:25:30AM +0100, Al Viro wrote:
> From: Al Viro <viro@zeniv.linux.org.uk>
> 
> quite a few architectures have the same csum_partial_copy_nocheck() -
> simply memcpy() the data and then return the csum of the copy.
> 
> hexagon, parisc, ia64, s390, um: explicitly spelled out that way.
> 
> arc, arm64, csky, h8300, m68k/nommu, microblaze, mips/GENERIC_CSUM, nds32,
> nios2, openrisc, riscv, unicore32: end up picking the same thing spelled
> out in lib/checksum.h (with varying amounts of perversions along the way).
> 
> everybody else (alpha, arm, c6x, m68k/mmu, mips/!GENERIC_CSUM, powerpc,
> sh, sparc, x86, xtensa) have non-generic variants.  For all except c6x
> the declaration is in their asm/checksum.h.  c6x uses the wrapper
> from asm-generic/checksum.h that would normally lead to the lib/checksum.h
> instance, but in case of c6x we end up using an asm function from arch/c6x
> instead.
> 
> Screw that mess - have architectures with private instances define
> _HAVE_ARCH_CSUM_AND_COPY in their asm/checksum.h and have the default
> one right in net/checksum.h conditional on _HAVE_ARCH_CSUM_AND_COPY
> *not* defined.
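
The arrangement described in the quoted text amounts to roughly the
following.  This is a userspace sketch, not the patch itself: csum_partial()
here is a simplified stand-in written for the example, __wsum is modelled as
a plain uint32_t under the made-up name wsum_t, and the three-argument form
of csum_partial_copy_nocheck() is the one that appears later in the series.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t wsum_t;	/* stand-in for the kernel's __wsum */

/* Simplified stand-in for csum_partial(): little-endian 16-bit words. */
static wsum_t csum_partial(const void *buf, int len, wsum_t sum)
{
	const uint8_t *p = buf;
	uint64_t acc = sum;
	int i;

	for (i = 0; i < len; i++)
		acc += (uint64_t)p[i] << ((i & 1) ? 8 : 0);
	/* Fold the carries back in until the value fits in 32 bits. */
	while (acc >> 32)
		acc = (acc & 0xffffffff) + (acc >> 32);
	return (wsum_t)acc;
}

/*
 * Default version: only compiled when the architecture has not
 * announced a private one by defining _HAVE_ARCH_CSUM_AND_COPY.
 */
#ifndef _HAVE_ARCH_CSUM_AND_COPY
static wsum_t csum_partial_copy_nocheck(const void *src, void *dst, int len)
{
	memcpy(dst, src, (size_t)len);
	return csum_partial(dst, len, 0);
}
#endif
```
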

net-next has a patch from me killing off csum_and_copy_from_user
already:

https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=f1bfd71c8662f20d53e71ef4e18bfb0e5677c27f

* Re: [PATCH v2 04/20] unify generic instances of csum_partial_copy_nocheck()
  2020-07-24  6:41       ` Christoph Hellwig
@ 2020-07-24 12:19         ` Al Viro
  2020-07-24 12:23           ` Christoph Hellwig
  0 siblings, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-24 12:19 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Linus Torvalds, linux-kernel, linux-arch

On Fri, Jul 24, 2020 at 07:41:17AM +0100, Christoph Hellwig wrote:
> On Fri, Jul 24, 2020 at 02:25:30AM +0100, Al Viro wrote:
> > From: Al Viro <viro@zeniv.linux.org.uk>
> > 
> > quite a few architectures have the same csum_partial_copy_nocheck() -
> > simply memcpy() the data and then return the csum of the copy.
> > 
> > hexagon, parisc, ia64, s390, um: explicitly spelled out that way.
> > 
> > arc, arm64, csky, h8300, m68k/nommu, microblaze, mips/GENERIC_CSUM, nds32,
> > nios2, openrisc, riscv, unicore32: end up picking the same thing spelled
> > out in lib/checksum.h (with varying amounts of perversions along the way).
> > 
> > everybody else (alpha, arm, c6x, m68k/mmu, mips/!GENERIC_CSUM, powerpc,
> > sh, sparc, x86, xtensa) have non-generic variants.  For all except c6x
> > the declaration is in their asm/checksum.h.  c6x uses the wrapper
> > from asm-generic/checksum.h that would normally lead to the lib/checksum.h
> > instance, but in case of c6x we end up using an asm function from arch/c6x
> > instead.
> > 
> > Screw that mess - have architectures with private instances define
> > _HAVE_ARCH_CSUM_AND_COPY in their asm/checksum.h and have the default
> > one right in net/checksum.h conditional on _HAVE_ARCH_CSUM_AND_COPY
> > *not* defined.
> 
> net-next has a patch from me killing off csum_and_copy_from_user
> already:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=f1bfd71c8662f20d53e71ef4e18bfb0e5677c27f

Nothing in that patch of yours touches csum_and_copy_from_user().  What
are you talking about?

* Re: [PATCH v2 04/20] unify generic instances of csum_partial_copy_nocheck()
  2020-07-24 12:19         ` Al Viro
@ 2020-07-24 12:23           ` Christoph Hellwig
  2020-07-24 12:30             ` Al Viro
  0 siblings, 1 reply; 102+ messages in thread
From: Christoph Hellwig @ 2020-07-24 12:23 UTC (permalink / raw)
  To: Al Viro; +Cc: Christoph Hellwig, Linus Torvalds, linux-kernel, linux-arch

On Fri, Jul 24, 2020 at 01:19:18PM +0100, Al Viro wrote:
> > net-next has a patch from me killing off csum_and_copy_from_user
> > already:
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=f1bfd71c8662f20d53e71ef4e18bfb0e5677c27f
> 
> Nothing in that patch of yours touches csum_and_copy_from_user().  What
> are you talking about?

Sorry, I meant csum_partial_copy_nocheck, just as in this patch.

Merging your branch into the net-next tree will thus conflict in
nios2 and asm-generic/checksum.h as well as lib/checksum.c.

* Re: [PATCH v2 04/20] unify generic instances of csum_partial_copy_nocheck()
  2020-07-24 12:23           ` Christoph Hellwig
@ 2020-07-24 12:30             ` Al Viro
  2020-07-26  7:11               ` Christoph Hellwig
  0 siblings, 1 reply; 102+ messages in thread
From: Al Viro @ 2020-07-24 12:30 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Linus Torvalds, linux-kernel, linux-arch

On Fri, Jul 24, 2020 at 01:23:37PM +0100, Christo