All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/6] fcntl, sockopt, and ioctl options
@ 2020-07-23  0:19 Shu-Chun Weng
  2020-07-23  0:19 ` [PATCH 1/6] linux-user: Support F_ADD_SEALS and F_GET_SEALS fcntls Shu-Chun Weng
                   ` (6 more replies)
  0 siblings, 7 replies; 11+ messages in thread
From: Shu-Chun Weng @ 2020-07-23  0:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: Shu-Chun Weng, laurent

Hi Laurent,

This is a series of 6 patches in 4 groups, putting into a single thread for
easier tracking.

[PATCH 1/6] linux-user: Support F_ADD_SEALS and F_GET_SEALS fcntls
  An incidental follow up on
  https://lists.nongnu.org/archive/html/qemu-devel/2019-09/msg01925.html

[PATCH 2/6] linux-user: add missing UDP and IPv6 get/setsockopt
  Updated https://lists.nongnu.org/archive/html/qemu-devel/2019-09/msg01317.html
  to consistently add them in get/setsockopt

[PATCH 3/6] linux-user: Update SO_TIMESTAMP to SO_TIMESTAMP_OLD/NEW
[PATCH 4/6] linux-user: setsockopt() SO_TIMESTAMPNS and SO_TIMESTAMPING
  Updated https://lists.nongnu.org/archive/html/qemu-devel/2019-09/msg01319.html
  to only use TARGET_SO_*_OLD/NEW

[PATCH 5/6] thunk: supports flexible arrays
[PATCH 6/6] linux-user: Add support for SIOCETHTOOL ioctl
  Updated https://lists.nongnu.org/archive/html/qemu-devel/2019-08/msg05090.html

Shu-Chun Weng (6):
  linux-user: Support F_ADD_SEALS and F_GET_SEALS fcntls
  linux-user: add missing UDP and IPv6 get/setsockopt options
  linux-user: Update SO_TIMESTAMP to SO_TIMESTAMP_OLD/NEW
  linux-user: setsockopt() SO_TIMESTAMPNS and SO_TIMESTAMPING
  thunk: supports flexible arrays
  linux-user: Add support for SIOCETHTOOL ioctl

 include/exec/user/thunk.h              |  20 +
 linux-user/Makefile.objs               |   3 +-
 linux-user/alpha/sockbits.h            |  21 +-
 linux-user/ethtool.c                   | 819 +++++++++++++++++++++++++
 linux-user/ethtool.h                   |  19 +
 linux-user/ethtool_entries.h           | 107 ++++
 linux-user/fd-trans.h                  |  41 +-
 linux-user/generic/sockbits.h          |  17 +-
 linux-user/hppa/sockbits.h             |  20 +-
 linux-user/ioctls.h                    |   2 +
 linux-user/mips/sockbits.h             |  16 +-
 linux-user/qemu.h                      |   1 +
 linux-user/sparc/sockbits.h            |  21 +-
 linux-user/strace.c                    |  19 +-
 linux-user/syscall.c                   | 233 ++++++-
 linux-user/syscall_defs.h              |  26 +-
 linux-user/syscall_types.h             | 277 +++++++++
 tests/tcg/multiarch/ethtool.c          | 417 +++++++++++++
 tests/tcg/multiarch/socket_timestamp.c | 542 ++++++++++++++++
 thunk.c                                | 151 ++++-
 20 files changed, 2706 insertions(+), 66 deletions(-)
 create mode 100644 linux-user/ethtool.c
 create mode 100644 linux-user/ethtool.h
 create mode 100644 linux-user/ethtool_entries.h
 create mode 100644 tests/tcg/multiarch/ethtool.c
 create mode 100644 tests/tcg/multiarch/socket_timestamp.c

-- 
2.28.0.rc0.105.gf9edc3c819-goog



^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/6] linux-user: Support F_ADD_SEALS and F_GET_SEALS fcntls
  2020-07-23  0:19 [PATCH 0/6] fcntl, sockopt, and ioctl options Shu-Chun Weng
@ 2020-07-23  0:19 ` Shu-Chun Weng
  2020-08-07 11:44   ` Laurent Vivier
  2020-07-23  0:19 ` [PATCH 2/6] linux-user: add missing UDP and IPv6 get/setsockopt options Shu-Chun Weng
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 11+ messages in thread
From: Shu-Chun Weng @ 2020-07-23  0:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: Shu-Chun Weng, laurent

Signed-off-by: Shu-Chun Weng <scw@google.com>
---
 linux-user/syscall.c      | 10 ++++++++++
 linux-user/syscall_defs.h | 14 ++++++++------
 2 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 1211e759c2..f97337b0b4 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -6312,6 +6312,14 @@ static int target_to_host_fcntl_cmd(int cmd)
     case TARGET_F_GETPIPE_SZ:
         ret = F_GETPIPE_SZ;
         break;
+#endif
+#ifdef F_ADD_SEALS
+    case TARGET_F_ADD_SEALS:
+        ret = F_ADD_SEALS;
+        break;
+    case TARGET_F_GET_SEALS:
+        ret = F_GET_SEALS;
+        break;
 #endif
     default:
         ret = -TARGET_EINVAL;
@@ -6598,6 +6606,8 @@ static abi_long do_fcntl(int fd, int cmd, abi_ulong arg)
     case TARGET_F_GETLEASE:
     case TARGET_F_SETPIPE_SZ:
     case TARGET_F_GETPIPE_SZ:
+    case TARGET_F_ADD_SEALS:
+    case TARGET_F_GET_SEALS:
         ret = get_errno(safe_fcntl(fd, host_cmd, arg));
         break;
 
diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index 3c261cff0e..70df1a94fb 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -2292,12 +2292,14 @@ struct target_statfs64 {
 #endif
 
 #define TARGET_F_LINUX_SPECIFIC_BASE 1024
-#define TARGET_F_SETLEASE (TARGET_F_LINUX_SPECIFIC_BASE + 0)
-#define TARGET_F_GETLEASE (TARGET_F_LINUX_SPECIFIC_BASE + 1)
-#define TARGET_F_DUPFD_CLOEXEC (TARGET_F_LINUX_SPECIFIC_BASE + 6)
-#define TARGET_F_SETPIPE_SZ (TARGET_F_LINUX_SPECIFIC_BASE + 7)
-#define TARGET_F_GETPIPE_SZ (TARGET_F_LINUX_SPECIFIC_BASE + 8)
-#define TARGET_F_NOTIFY  (TARGET_F_LINUX_SPECIFIC_BASE+2)
+#define TARGET_F_SETLEASE            (TARGET_F_LINUX_SPECIFIC_BASE + 0)
+#define TARGET_F_GETLEASE            (TARGET_F_LINUX_SPECIFIC_BASE + 1)
+#define TARGET_F_DUPFD_CLOEXEC       (TARGET_F_LINUX_SPECIFIC_BASE + 6)
+#define TARGET_F_NOTIFY              (TARGET_F_LINUX_SPECIFIC_BASE + 2)
+#define TARGET_F_SETPIPE_SZ          (TARGET_F_LINUX_SPECIFIC_BASE + 7)
+#define TARGET_F_GETPIPE_SZ          (TARGET_F_LINUX_SPECIFIC_BASE + 8)
+#define TARGET_F_ADD_SEALS           (TARGET_F_LINUX_SPECIFIC_BASE + 9)
+#define TARGET_F_GET_SEALS           (TARGET_F_LINUX_SPECIFIC_BASE + 10)
 
 #include "target_fcntl.h"
 
-- 
2.28.0.rc0.105.gf9edc3c819-goog



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/6] linux-user: add missing UDP and IPv6 get/setsockopt options
  2020-07-23  0:19 [PATCH 0/6] fcntl, sockopt, and ioctl options Shu-Chun Weng
  2020-07-23  0:19 ` [PATCH 1/6] linux-user: Support F_ADD_SEALS and F_GET_SEALS fcntls Shu-Chun Weng
@ 2020-07-23  0:19 ` Shu-Chun Weng
  2020-08-07 11:54   ` Laurent Vivier
  2020-07-23  0:19 ` [PATCH 3/6] linux-user: Update SO_TIMESTAMP to SO_TIMESTAMP_OLD/NEW Shu-Chun Weng
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 11+ messages in thread
From: Shu-Chun Weng @ 2020-07-23  0:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: Shu-Chun Weng, laurent

UDP: SOL_UDP manipulate options at UDP level. All six options currently
defined in linux source include/uapi/linux/udp.h take integer values.

IPv6: IPV6_ADDR_PREFERENCES (RFC5014: Source address selection) was not
supported.

Signed-off-by: Shu-Chun Weng <scw@google.com>
---
 linux-user/syscall.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index f97337b0b4..a53db446d4 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -51,8 +51,10 @@
 #include <sys/sysinfo.h>
 #include <sys/signalfd.h>
 //#include <sys/user.h>
+#include <netinet/in.h>
 #include <netinet/ip.h>
 #include <netinet/tcp.h>
+#include <netinet/udp.h>
 #include <linux/wireless.h>
 #include <linux/icmp.h>
 #include <linux/icmpv6.h>
@@ -1945,7 +1947,8 @@ static abi_long do_setsockopt(int sockfd, int level, int optname,
 
     switch(level) {
     case SOL_TCP:
-        /* TCP options all take an 'int' value.  */
+    case SOL_UDP:
+        /* TCP and UDP options all take an 'int' value.  */
         if (optlen < sizeof(uint32_t))
             return -TARGET_EINVAL;
 
@@ -2031,6 +2034,7 @@ static abi_long do_setsockopt(int sockfd, int level, int optname,
         case IPV6_RECVDSTOPTS:
         case IPV6_2292DSTOPTS:
         case IPV6_TCLASS:
+        case IPV6_ADDR_PREFERENCES:
 #ifdef IPV6_RECVPATHMTU
         case IPV6_RECVPATHMTU:
 #endif
@@ -2593,7 +2597,8 @@ get_timeout:
         }
         break;
     case SOL_TCP:
-        /* TCP options all take an 'int' value.  */
+    case SOL_UDP:
+        /* TCP and UDP options all take an 'int' value.  */
     int_case:
         if (get_user_u32(len, optlen))
             return -TARGET_EFAULT;
@@ -2684,6 +2689,7 @@ get_timeout:
         case IPV6_RECVDSTOPTS:
         case IPV6_2292DSTOPTS:
         case IPV6_TCLASS:
+        case IPV6_ADDR_PREFERENCES:
 #ifdef IPV6_RECVPATHMTU
         case IPV6_RECVPATHMTU:
 #endif
-- 
2.28.0.rc0.105.gf9edc3c819-goog



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/6] linux-user: Update SO_TIMESTAMP to SO_TIMESTAMP_OLD/NEW
  2020-07-23  0:19 [PATCH 0/6] fcntl, sockopt, and ioctl options Shu-Chun Weng
  2020-07-23  0:19 ` [PATCH 1/6] linux-user: Support F_ADD_SEALS and F_GET_SEALS fcntls Shu-Chun Weng
  2020-07-23  0:19 ` [PATCH 2/6] linux-user: add missing UDP and IPv6 get/setsockopt options Shu-Chun Weng
@ 2020-07-23  0:19 ` Shu-Chun Weng
  2020-08-07 12:07   ` Laurent Vivier
  2020-07-23  0:19 ` [PATCH 4/6] linux-user: setsockopt() SO_TIMESTAMPNS and SO_TIMESTAMPING Shu-Chun Weng
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 11+ messages in thread
From: Shu-Chun Weng @ 2020-07-23  0:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: Shu-Chun Weng, laurent, alex.bennee

Both guest options map to host SO_TIMESTAMP while keeping a bit in
fd_trans to remember if the guest expects the old or the new format.

Added a multiarch test to verify.

Signed-off-by: Shu-Chun Weng <scw@google.com>
---
 linux-user/alpha/sockbits.h            |   8 +-
 linux-user/fd-trans.h                  |  41 +++-
 linux-user/generic/sockbits.h          |   9 +-
 linux-user/hppa/sockbits.h             |   8 +-
 linux-user/mips/sockbits.h             |   8 +-
 linux-user/sparc/sockbits.h            |   8 +-
 linux-user/strace.c                    |   7 +-
 linux-user/syscall.c                   |  69 ++++--
 tests/tcg/multiarch/socket_timestamp.c | 292 +++++++++++++++++++++++++
 9 files changed, 419 insertions(+), 31 deletions(-)
 create mode 100644 tests/tcg/multiarch/socket_timestamp.c

diff --git a/linux-user/alpha/sockbits.h b/linux-user/alpha/sockbits.h
index d54dc98c09..40f0644df0 100644
--- a/linux-user/alpha/sockbits.h
+++ b/linux-user/alpha/sockbits.h
@@ -48,8 +48,6 @@
 #define TARGET_SO_DETACH_FILTER        27
 
 #define TARGET_SO_PEERNAME      28
-#define TARGET_SO_TIMESTAMP     29
-#define TARGET_SCM_TIMESTAMP        TARGET_SO_TIMESTAMP
 
 #define TARGET_SO_PEERSEC       30
 #define TARGET_SO_PASSSEC       34
@@ -75,6 +73,12 @@
 /* Instruct lower device to use last 4-bytes of skb data as FCS */
 #define TARGET_SO_NOFCS     43
 
+#define TARGET_SO_TIMESTAMP_OLD        29
+#define TARGET_SCM_TIMESTAMP_OLD       TARGET_SO_TIMESTAMP_OLD
+
+#define TARGET_SO_TIMESTAMP_NEW        63
+#define TARGET_SCM_TIMESTAMP_NEW       TARGET_SO_TIMESTAMP_NEW
+
 /* TARGET_O_NONBLOCK clashes with the bits used for socket types.  Therefore we
  * have to define SOCK_NONBLOCK to a different value here.
  */
diff --git a/linux-user/fd-trans.h b/linux-user/fd-trans.h
index a3fcdaabc7..8ab650dfd2 100644
--- a/linux-user/fd-trans.h
+++ b/linux-user/fd-trans.h
@@ -22,6 +22,16 @@ typedef struct TargetFdTrans {
     TargetFdDataFunc host_to_target_data;
     TargetFdDataFunc target_to_host_data;
     TargetFdAddrFunc target_to_host_addr;
+
+    /* If `true`, this struct is dynamically allocated and should be
+     * `g_free()`ed when unregistering.
+     */
+    bool free_when_unregister;
+
+    /* The socket's timestamp option (`SO_TIMESTAMP`, `SO_TIMESTAMPNS`, and
+     * `SO_TIMESTAMPING`) is using the `_NEW` version.
+     */
+    bool socket_timestamp_new;
 } TargetFdTrans;
 
 extern TargetFdTrans **target_fd_trans;
@@ -52,6 +62,14 @@ static inline TargetFdAddrFunc fd_trans_target_to_host_addr(int fd)
     return NULL;
 }
 
+static inline bool fd_trans_socket_timestamp_new(int fd)
+{
+    if (fd >= 0 && fd < target_fd_max && target_fd_trans[fd]) {
+        return target_fd_trans[fd]->socket_timestamp_new;
+    }
+    return false;
+}
+
 static inline void fd_trans_register(int fd, TargetFdTrans *trans)
 {
     unsigned int oldmax;
@@ -70,6 +88,9 @@ static inline void fd_trans_register(int fd, TargetFdTrans *trans)
 static inline void fd_trans_unregister(int fd)
 {
     if (fd >= 0 && fd < target_fd_max) {
+        if (target_fd_trans[fd] && target_fd_trans[fd]->free_when_unregister) {
+            g_free(target_fd_trans[fd]);
+        }
         target_fd_trans[fd] = NULL;
     }
 }
@@ -78,8 +99,26 @@ static inline void fd_trans_dup(int oldfd, int newfd)
 {
     fd_trans_unregister(newfd);
     if (oldfd < target_fd_max && target_fd_trans[oldfd]) {
-        fd_trans_register(newfd, target_fd_trans[oldfd]);
+        TargetFdTrans *trans = target_fd_trans[oldfd];
+        if (trans->free_when_unregister) {
+            trans = g_new(TargetFdTrans, 1);
+            *trans = *target_fd_trans[oldfd];
+        }
+        fd_trans_register(newfd, trans);
+    }
+}
+
+static inline void fd_trans_mark_socket_timestamp_new(int fd, bool value)
+{
+    if (fd < 0) return;
+    if (fd >= target_fd_max || target_fd_trans[fd] == NULL) {
+        if (!value) return; /* default is false */
+
+        TargetFdTrans* trans = g_new0(TargetFdTrans, 1);
+        trans->free_when_unregister = true;
+        fd_trans_register(fd, trans);
     }
+    target_fd_trans[fd]->socket_timestamp_new = value;
 }
 
 extern TargetFdTrans target_packet_trans;
diff --git a/linux-user/generic/sockbits.h b/linux-user/generic/sockbits.h
index e44733c601..532cf2d3dc 100644
--- a/linux-user/generic/sockbits.h
+++ b/linux-user/generic/sockbits.h
@@ -49,10 +49,15 @@
 #define TARGET_SO_DETACH_FILTER        27
 
 #define TARGET_SO_PEERNAME             28
-#define TARGET_SO_TIMESTAMP            29
-#define TARGET_SCM_TIMESTAMP           TARGET_SO_TIMESTAMP
 
 #define TARGET_SO_ACCEPTCONN           30
 
 #define TARGET_SO_PEERSEC              31
+
+#define TARGET_SO_TIMESTAMP_OLD        29
+#define TARGET_SCM_TIMESTAMP_OLD       TARGET_SO_TIMESTAMP_OLD
+
+#define TARGET_SO_TIMESTAMP_NEW        63
+#define TARGET_SCM_TIMESTAMP_NEW       TARGET_SO_TIMESTAMP_NEW
+
 #endif
diff --git a/linux-user/hppa/sockbits.h b/linux-user/hppa/sockbits.h
index 23f69a3293..284a47e74e 100644
--- a/linux-user/hppa/sockbits.h
+++ b/linux-user/hppa/sockbits.h
@@ -29,8 +29,6 @@
 #define TARGET_SO_BSDCOMPAT    0x400e
 #define TARGET_SO_PASSCRED     0x4010
 #define TARGET_SO_PEERCRED     0x4011
-#define TARGET_SO_TIMESTAMP    0x4012
-#define TARGET_SCM_TIMESTAMP   TARGET_SO_TIMESTAMP
 #define TARGET_SO_TIMESTAMPNS  0x4013
 #define TARGET_SCM_TIMESTAMPNS TARGET_SO_TIMESTAMPNS
 
@@ -67,6 +65,12 @@
 
 #define TARGET_SO_CNX_ADVICE           0x402E
 
+#define TARGET_SO_TIMESTAMP_OLD        0x4012
+#define TARGET_SCM_TIMESTAMP_OLD       TARGET_SO_TIMESTAMP_OLD
+
+#define TARGET_SO_TIMESTAMP_NEW        0x4038
+#define TARGET_SCM_TIMESTAMP_NEW       TARGET_SO_TIMESTAMP_NEW
+
 /* TARGET_O_NONBLOCK clashes with the bits used for socket types.  Therefore we
  * have to define SOCK_NONBLOCK to a different value here.
  */
diff --git a/linux-user/mips/sockbits.h b/linux-user/mips/sockbits.h
index 0f022cd598..b4c39d9588 100644
--- a/linux-user/mips/sockbits.h
+++ b/linux-user/mips/sockbits.h
@@ -61,14 +61,18 @@
 #define TARGET_SO_DETACH_FILTER        27
 
 #define TARGET_SO_PEERNAME             28
-#define TARGET_SO_TIMESTAMP            29
-#define SCM_TIMESTAMP          SO_TIMESTAMP
 
 #define TARGET_SO_PEERSEC              30
 #define TARGET_SO_SNDBUFFORCE          31
 #define TARGET_SO_RCVBUFFORCE          33
 #define TARGET_SO_PASSSEC              34
 
+#define TARGET_SO_TIMESTAMP_OLD        29
+#define TARGET_SCM_TIMESTAMP_OLD       TARGET_SO_TIMESTAMP_OLD
+
+#define TARGET_SO_TIMESTAMP_NEW        63
+#define TARGET_SCM_TIMESTAMP_NEW       TARGET_SO_TIMESTAMP_NEW
+
 /** sock_type - Socket types
  *
  * Please notice that for binary compat reasons MIPS has to
diff --git a/linux-user/sparc/sockbits.h b/linux-user/sparc/sockbits.h
index 0a822e3e1f..07440efd14 100644
--- a/linux-user/sparc/sockbits.h
+++ b/linux-user/sparc/sockbits.h
@@ -48,8 +48,6 @@
 #define TARGET_SO_GET_FILTER           TARGET_SO_ATTACH_FILTER
 
 #define TARGET_SO_PEERNAME             0x001c
-#define TARGET_SO_TIMESTAMP            0x001d
-#define TARGET_SCM_TIMESTAMP           TARGET_SO_TIMESTAMP
 
 #define TARGET_SO_PEERSEC              0x001e
 #define TARGET_SO_PASSSEC              0x001f
@@ -104,6 +102,12 @@
 
 #define TARGET_SO_ZEROCOPY             0x003e
 
+#define TARGET_SO_TIMESTAMP_OLD        0x001d
+#define TARGET_SCM_TIMESTAMP_OLD       TARGET_SO_TIMESTAMP_OLD
+
+#define TARGET_SO_TIMESTAMP_NEW        0x0046
+#define TARGET_SCM_TIMESTAMP_NEW       TARGET_SO_TIMESTAMP_NEW
+
 /* Security levels - as per NRL IPv6 - don't actually do anything */
 #define TARGET_SO_SECURITY_AUTHENTICATION              0x5001
 #define TARGET_SO_SECURITY_ENCRYPTION_TRANSPORT        0x5002
diff --git a/linux-user/strace.c b/linux-user/strace.c
index 13981341b3..369cfcd7bd 100644
--- a/linux-user/strace.c
+++ b/linux-user/strace.c
@@ -2213,8 +2213,11 @@ print_optint:
         case TARGET_SO_PASSCRED:
             qemu_log("SO_PASSCRED,");
             goto print_optint;
-        case TARGET_SO_TIMESTAMP:
-            qemu_log("SO_TIMESTAMP,");
+        case TARGET_SO_TIMESTAMP_OLD:
+            qemu_log("SO_TIMESTAMP_OLD,");
+            goto print_optint;
+        case TARGET_SO_TIMESTAMP_NEW:
+            qemu_log("SO_TIMESTAMP_NEW,");
             goto print_optint;
         case TARGET_SO_RCVLOWAT:
             qemu_log("SO_RCVLOWAT,");
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index a53db446d4..7f95ccda7f 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -1705,7 +1705,8 @@ static inline abi_long target_to_host_cmsg(struct msghdr *msgh,
 }
 
 static inline abi_long host_to_target_cmsg(struct target_msghdr *target_msgh,
-                                           struct msghdr *msgh)
+                                           struct msghdr *msgh,
+                                           bool target_uses_timeval64)
 {
     struct cmsghdr *cmsg = CMSG_FIRSTHDR(msgh);
     abi_long msg_controllen;
@@ -1754,8 +1755,14 @@ static inline abi_long host_to_target_cmsg(struct target_msghdr *target_msgh,
         switch (cmsg->cmsg_level) {
         case SOL_SOCKET:
             switch (cmsg->cmsg_type) {
-            case SO_TIMESTAMP:
-                tgt_len = sizeof(struct target_timeval);
+            case SCM_TIMESTAMP:
+                if (target_uses_timeval64) {
+                    tgt_len = sizeof(struct target__kernel_sock_timeval);
+                    target_cmsg->cmsg_type = tswap32(TARGET_SCM_TIMESTAMP_NEW);
+                } else {
+                    tgt_len = sizeof(struct target_timeval);
+                    target_cmsg->cmsg_type = tswap32(TARGET_SCM_TIMESTAMP_OLD);
+                }
                 break;
             default:
                 break;
@@ -1789,20 +1796,32 @@ static inline abi_long host_to_target_cmsg(struct target_msghdr *target_msgh,
                 }
                 break;
             }
-            case SO_TIMESTAMP:
+            case SCM_TIMESTAMP:
             {
                 struct timeval *tv = (struct timeval *)data;
-                struct target_timeval *target_tv =
-                    (struct target_timeval *)target_data;
-
-                if (len != sizeof(struct timeval) ||
-                    tgt_len != sizeof(struct target_timeval)) {
+                if (len != sizeof(struct timeval)) {
                     goto unimplemented;
                 }
 
-                /* copy struct timeval to target */
-                __put_user(tv->tv_sec, &target_tv->tv_sec);
-                __put_user(tv->tv_usec, &target_tv->tv_usec);
+                if (target_uses_timeval64) {
+                    struct target__kernel_sock_timeval *target_tv =
+                        (struct target__kernel_sock_timeval *)target_data;
+                    if (tgt_len != sizeof(struct target__kernel_sock_timeval)) {
+                        goto unimplemented;
+                    }
+
+                    __put_user(tv->tv_sec, &target_tv->tv_sec);
+                    __put_user(tv->tv_usec, &target_tv->tv_usec);
+                } else {
+                    struct target_timeval *target_tv =
+                        (struct target_timeval *)target_data;
+                    if (tgt_len != sizeof(struct target_timeval)) {
+                        goto unimplemented;
+                    }
+
+                    __put_user(tv->tv_sec, &target_tv->tv_sec);
+                    __put_user(tv->tv_usec, &target_tv->tv_usec);
+                }
                 break;
             }
             case SCM_CREDENTIALS:
@@ -1941,7 +1960,7 @@ static abi_long do_setsockopt(int sockfd, int level, int optname,
                               abi_ulong optval_addr, socklen_t optlen)
 {
     abi_long ret;
-    int val;
+    int val, timestamp_format_is_new;
     struct ip_mreqn *ip_mreq;
     struct ip_mreq_source *ip_mreq_source;
 
@@ -2338,9 +2357,11 @@ set_timeout:
         case TARGET_SO_PASSSEC:
                 optname = SO_PASSSEC;
                 break;
-        case TARGET_SO_TIMESTAMP:
-		optname = SO_TIMESTAMP;
-		break;
+        case TARGET_SO_TIMESTAMP_OLD:
+        case TARGET_SO_TIMESTAMP_NEW:
+                timestamp_format_is_new = (optname == TARGET_SO_TIMESTAMP_NEW);
+                optname = SO_TIMESTAMP;
+                break;
         case TARGET_SO_RCVLOWAT:
 		optname = SO_RCVLOWAT;
 		break;
@@ -2353,6 +2374,9 @@ set_timeout:
 	if (get_user_u32(val, optval_addr))
             return -TARGET_EFAULT;
 	ret = get_errno(setsockopt(sockfd, SOL_SOCKET, optname, &val, sizeof(val)));
+        if (!is_error(ret) && optname == SO_TIMESTAMP) {
+            fd_trans_mark_socket_timestamp_new(sockfd, timestamp_format_is_new);
+        }
         break;
 #ifdef SOL_NETLINK
     case SOL_NETLINK:
@@ -2403,6 +2427,7 @@ static abi_long do_getsockopt(int sockfd, int level, int optname,
     abi_long ret;
     int len, val;
     socklen_t lv;
+    int timestamp_format_matches = 0;
 
     switch(level) {
     case TARGET_SOL_SOCKET:
@@ -2583,7 +2608,11 @@ get_timeout:
         case TARGET_SO_PASSCRED:
             optname = SO_PASSCRED;
             goto int_case;
-        case TARGET_SO_TIMESTAMP:
+        case TARGET_SO_TIMESTAMP_OLD:
+        case TARGET_SO_TIMESTAMP_NEW:
+            timestamp_format_matches =
+                (fd_trans_socket_timestamp_new(sockfd) ==
+                 (optname == TARGET_SO_TIMESTAMP_NEW));
             optname = SO_TIMESTAMP;
             goto int_case;
         case TARGET_SO_RCVLOWAT:
@@ -2611,6 +2640,9 @@ get_timeout:
         if (optname == SO_TYPE) {
             val = host_to_target_sock_type(val);
         }
+        if (optname == SO_TIMESTAMP) {
+            val = val && timestamp_format_matches;
+        }
         if (len > lv)
             len = lv;
         if (len == 4) {
@@ -3162,7 +3194,8 @@ static abi_long do_sendrecvmsg_locked(int fd, struct target_msghdr *msgp,
                 ret = fd_trans_host_to_target_data(fd)(msg.msg_iov->iov_base,
                                                MIN(msg.msg_iov->iov_len, len));
             } else {
-                ret = host_to_target_cmsg(msgp, &msg);
+                ret = host_to_target_cmsg(msgp, &msg,
+                                          fd_trans_socket_timestamp_new(fd));
             }
             if (!is_error(ret)) {
                 msgp->msg_namelen = tswap32(msg.msg_namelen);
diff --git a/tests/tcg/multiarch/socket_timestamp.c b/tests/tcg/multiarch/socket_timestamp.c
new file mode 100644
index 0000000000..fd2833e5c8
--- /dev/null
+++ b/tests/tcg/multiarch/socket_timestamp.c
@@ -0,0 +1,292 @@
+#include <assert.h>
+#include <errno.h>
+#include <linux/types.h>
+#include <netinet/in.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/ioctl.h>
+#include <sys/socket.h>
+#include <sys/time.h>
+#include <sys/types.h>
+#include <sys/wait.h>
+#include <unistd.h>
+
+#ifdef __kernel_old_timeval
+#define kernel_old_timeval __kernel_old_timeval
+#else
+struct kernel_old_timeval
+{
+    __kernel_long_t tv_sec;
+    __kernel_long_t tv_usec;
+};
+#endif
+
+struct kernel_sock_timeval
+{
+    int64_t tv_sec;
+    int64_t tv_usec;
+};
+
+int create_udp_socket(struct sockaddr_in *sockaddr)
+{
+    socklen_t sockaddr_len;
+    int sock = socket(AF_INET, SOCK_DGRAM, 0);
+    if (sock < 0) {
+        int err = errno;
+        fprintf(stderr, "Failed to create server socket: %s\n", strerror(err));
+        exit(err);
+    }
+
+    memset(sockaddr, 0, sizeof(*sockaddr));
+    sockaddr->sin_family = AF_INET;
+    sockaddr->sin_port = htons(0);  /* let kernel select a port for us */
+    sockaddr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
+
+    if (bind(sock, (struct sockaddr *)sockaddr, sizeof(*sockaddr)) < 0) {
+        int err = errno;
+        fprintf(stderr, "Failed to bind server socket: %s\n", strerror(err));
+        exit(err);
+    }
+
+    sockaddr_len = sizeof(*sockaddr);
+    if (getsockname(sock, (struct sockaddr *)sockaddr, &sockaddr_len) < 0) {
+        int err = errno;
+        fprintf(stderr, "Failed to get socket name: %s\n", strerror(err));
+        exit(err);
+    }
+    return sock;
+}
+
+/* Checks that the timestamp in the message is not after the reception timestamp
+ * as well as the reception time is within 10 seconds of the message time.
+ */
+void check_timestamp_difference(const struct timeval *msg_tv,
+                                const struct timeval *pkt_tv)
+{
+    if (pkt_tv->tv_sec < msg_tv->tv_sec ||
+        (pkt_tv->tv_sec == msg_tv->tv_sec && pkt_tv->tv_usec < msg_tv->tv_usec))
+    {
+        fprintf(stderr,
+                "Packet received before sent: %lld.%06lld < %lld.%06lld\n",
+                (long long)pkt_tv->tv_sec, (long long)pkt_tv->tv_usec,
+                (long long)msg_tv->tv_sec, (long long)msg_tv->tv_usec);
+        exit(-1);
+    }
+
+    if (pkt_tv->tv_sec > msg_tv->tv_sec + 10 ||
+        (pkt_tv->tv_sec == msg_tv->tv_sec + 10 &&
+         pkt_tv->tv_usec > msg_tv->tv_usec)) {
+        fprintf(stderr,
+                "Packet received more than 10 seconds after sent: "
+                "%lld.%06lld > %lld.%06lld + 10\n",
+                (long long)pkt_tv->tv_sec, (long long)pkt_tv->tv_usec,
+                (long long)msg_tv->tv_sec, (long long)msg_tv->tv_usec);
+        exit(-1);
+    }
+}
+
+void send_current_time(int sock, struct sockaddr_in server_sockaddr)
+{
+    struct timeval tv = {0, 0};
+    gettimeofday(&tv, NULL);
+    sendto(sock, &tv, sizeof(tv), 0, (struct sockaddr *)&server_sockaddr,
+           sizeof(server_sockaddr));
+}
+
+typedef void (*get_timeval_t)(const struct cmsghdr *cmsg, struct timeval *tv);
+
+
+void receive_packet(int sock, get_timeval_t get_timeval)
+{
+    struct msghdr msg = {0};
+
+    char iobuf[1024];
+    struct iovec iov;
+
+    union {
+        /* 128 bytes should cover all imaginable timeval/timespec*/
+        char cmsg_buf[CMSG_SPACE(128)];
+        struct cmsghdr align;
+    } u;
+    struct cmsghdr *cmsg;
+    struct timeval msg_tv, pkt_tv;
+
+    int res;
+
+    iov.iov_base = iobuf;
+    iov.iov_len = sizeof(iobuf);
+
+    msg.msg_iov = &iov;
+    msg.msg_iovlen = 1;
+    msg.msg_control = (caddr_t)u.cmsg_buf;
+    msg.msg_controllen = sizeof(u.cmsg_buf);
+
+    if ((res = recvmsg(sock, &msg, 0)) < 0) {
+        int err = errno;
+        fprintf(stderr, "Failed to receive packet: %s\n", strerror(err));
+        exit(-err);
+    }
+
+    assert(res == sizeof(struct timeval));
+    assert(iov.iov_base == iobuf);
+    memcpy(&msg_tv, iov.iov_base, sizeof(msg_tv));
+    printf("Message timestamp: %lld.%06lld\n",
+           (long long)msg_tv.tv_sec, (long long)msg_tv.tv_usec);
+
+    cmsg = CMSG_FIRSTHDR(&msg);
+    assert(cmsg);
+    (*get_timeval)(cmsg, &pkt_tv);
+    printf("Packet timestamp: %lld.%06lld\n",
+           (long long)pkt_tv.tv_sec, (long long)pkt_tv.tv_usec);
+
+    check_timestamp_difference(&msg_tv, &pkt_tv);
+}
+
+void get_timeval_from_so_timestamp(const struct cmsghdr *cmsg,
+                                   struct timeval *tv)
+{
+    assert(cmsg->cmsg_level == SOL_SOCKET);
+    assert(cmsg->cmsg_type == SCM_TIMESTAMP);
+    assert(cmsg->cmsg_len >= CMSG_LEN(sizeof(struct timeval)));
+    memcpy(tv, CMSG_DATA(cmsg), sizeof(*tv));
+}
+
+#ifdef SO_TIMESTAMP_OLD
+void get_timeval_from_so_timestamp_old(const struct cmsghdr *cmsg,
+                                       struct timeval *tv)
+{
+    struct kernel_old_timeval old_tv;
+    assert(cmsg->cmsg_level == SOL_SOCKET);
+    assert(cmsg->cmsg_type == SO_TIMESTAMP_OLD);
+    assert(cmsg->cmsg_len >= CMSG_LEN(sizeof(old_tv)));
+
+    memcpy(&old_tv, CMSG_DATA(cmsg), sizeof(old_tv));
+    tv->tv_sec = old_tv.tv_sec;
+    tv->tv_usec = old_tv.tv_usec;
+}
+
+#ifdef SO_TIMESTAMP_NEW
+void get_timeval_from_so_timestamp_new(const struct cmsghdr *cmsg,
+                                       struct timeval *tv)
+{
+    struct kernel_sock_timeval sock_tv;
+    assert(cmsg->cmsg_level == SOL_SOCKET);
+    assert(cmsg->cmsg_type == SO_TIMESTAMP_NEW);
+    assert(cmsg->cmsg_len >= CMSG_LEN(sizeof(sock_tv)));
+
+    memcpy(&sock_tv, CMSG_DATA(cmsg), sizeof(sock_tv));
+    tv->tv_sec = sock_tv.tv_sec;
+    tv->tv_usec = sock_tv.tv_usec;
+}
+#endif /* defined(SO_TIMESTAMP_NEW) */
+#endif /* defined(SO_TIMESTAMP_OLD) */
+
+void set_socket_option(int sock, int sockopt, int on)
+{
+    socklen_t len;
+    int val = on;
+    if (setsockopt(sock, SOL_SOCKET, sockopt, &val, sizeof(val)) < 0) {
+        int err = errno;
+        fprintf(stderr, "Failed to setsockopt %d (%s): %s\n",
+                sockopt, on ? "on" : "off", strerror(err));
+        exit(err);
+    }
+
+    len = sizeof(val);
+    val = -1;
+    if (getsockopt(sock, SOL_SOCKET, sockopt, &val, &len) < 0) {
+        int err = errno;
+        fprintf(stderr, "Failed to getsockopt (%d): %s\n", sock, strerror(err));
+        exit(err);
+    }
+    assert(len == sizeof(val));
+    assert(val == on);
+}
+
+int main(int argc, char **argv)
+{
+    int parent_sock, child_sock;
+    struct sockaddr_in parent_sockaddr, child_sockaddr;
+    int pid;
+    struct timeval tv = {0, 0};
+    gettimeofday(&tv, NULL);
+
+    parent_sock = create_udp_socket(&parent_sockaddr);
+    child_sock = create_udp_socket(&child_sockaddr);
+
+    printf("Parent sock bound to port %d\nChild sock bound to port %d\n",
+           parent_sockaddr.sin_port, child_sockaddr.sin_port);
+
+    if ((pid = fork()) == -2) {
+        fprintf(stderr, "SKIPPED. Failed to fork: %s\n", strerror(errno));
+    } else if (pid == 0) {
+        close(child_sock);
+
+        /* Test 1: SO_TIMESTAMP */
+        send_current_time(parent_sock, child_sockaddr);
+
+        if (tv.tv_sec > 0x7fffff00) {
+            /* Too close to y2038 problem, old system may not work. */
+            close(parent_sock);
+            return 0;
+        }
+
+#ifdef SO_TIMESTAMP_OLD
+        if (SO_TIMESTAMP_OLD != SO_TIMESTAMP) {
+            /* Test 2a: SO_TIMESTAMP_OLD */
+            set_socket_option(parent_sock, SO_TIMESTAMP_OLD, 1);
+            receive_packet(parent_sock, &get_timeval_from_so_timestamp_old);
+            set_socket_option(parent_sock, SO_TIMESTAMP_OLD, 0);
+        }
+#ifdef SO_TIMESTAMP_NEW
+        else {
+            /* Test 2b: SO_TIMESTAMP_NEW */
+            set_socket_option(parent_sock, SO_TIMESTAMP_NEW, 1);
+            receive_packet(parent_sock, &get_timeval_from_so_timestamp_new);
+            set_socket_option(parent_sock, SO_TIMESTAMP_NEW, 0);
+        }
+#endif /* defined(SO_TIMESTAMP_NEW) */
+#endif /* defined(SO_TIMESTAMP_OLD) */
+
+        close(parent_sock);
+    } else {
+        int child_status;
+        close(parent_sock);
+
+        /* Test 1: SO_TIMESTAMP */
+        set_socket_option(child_sock, SO_TIMESTAMP, 1);
+        receive_packet(child_sock, &get_timeval_from_so_timestamp);
+        set_socket_option(child_sock, SO_TIMESTAMP, 0);
+
+        if (tv.tv_sec > 0x7fffff00) {
+            /* Too close to y2038 problem, old system may not work. */
+            close(child_sock);
+            return 0;
+        }
+
+#ifdef SO_TIMESTAMP_OLD
+        if (SO_TIMESTAMP_OLD != SO_TIMESTAMP) {
+            /* Test 2a: SO_TIMESTAMP_OLD */
+            send_current_time(child_sock, parent_sockaddr);
+        }
+#ifdef SO_TIMESTAMP_NEW
+        else {
+            /* Test 2b: SO_TIMESTAMP_NEW */
+            send_current_time(child_sock, parent_sockaddr);
+        }
+#endif /* defined(SO_TIMESTAMP_NEW) */
+#endif /* defined(SO_TIMESTAMP_OLD) */
+
+        close(child_sock);
+
+        if (waitpid(pid, &child_status, 0) < 0) {
+            int err = errno;
+            fprintf(stderr, "Final wait() failed: %s\n", strerror(err));
+            return err;
+        }
+        return child_status;
+    }
+    return 0;
+}
-- 
2.28.0.rc0.105.gf9edc3c819-goog



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 4/6] linux-user: setsockopt() SO_TIMESTAMPNS and SO_TIMESTAMPING
  2020-07-23  0:19 [PATCH 0/6] fcntl, sockopt, and ioctl options Shu-Chun Weng
                   ` (2 preceding siblings ...)
  2020-07-23  0:19 ` [PATCH 3/6] linux-user: Update SO_TIMESTAMP to SO_TIMESTAMP_OLD/NEW Shu-Chun Weng
@ 2020-07-23  0:19 ` Shu-Chun Weng
  2020-07-23  0:19 ` [PATCH 5/6] thunk: supports flexible arrays Shu-Chun Weng
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Shu-Chun Weng @ 2020-07-23  0:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: Shu-Chun Weng, laurent, alex.bennee

This change supports SO_TIMESTAMPNS_OLD/NEW and SO_TIMESTAMPING_OLD/NEW
for setsocketopt() with SOL_SOCKET. Based on the SO_TIMESTAMP_OLD/NEW
framework. The three pairs share the same flag `SOCK_TSTAMP_NEW` in
linux kernel for deciding if the old or the new format is used.

Signed-off-by: Shu-Chun Weng <scw@google.com>
---
 linux-user/alpha/sockbits.h            |  13 +-
 linux-user/generic/sockbits.h          |   8 +
 linux-user/hppa/sockbits.h             |  12 +-
 linux-user/mips/sockbits.h             |   8 +
 linux-user/sparc/sockbits.h            |  13 +-
 linux-user/strace.c                    |  12 +
 linux-user/syscall.c                   | 119 ++++++-
 tests/tcg/multiarch/socket_timestamp.c | 458 +++++++++++++++++++------
 8 files changed, 521 insertions(+), 122 deletions(-)

diff --git a/linux-user/alpha/sockbits.h b/linux-user/alpha/sockbits.h
index 40f0644df0..c2c88f432b 100644
--- a/linux-user/alpha/sockbits.h
+++ b/linux-user/alpha/sockbits.h
@@ -51,8 +51,6 @@
 
 #define TARGET_SO_PEERSEC       30
 #define TARGET_SO_PASSSEC       34
-#define TARGET_SO_TIMESTAMPNS       35
-#define TARGET_SCM_TIMESTAMPNS      TARGET_SO_TIMESTAMPNS
 
 /* Security levels - as per NRL IPv6 - don't actually do anything */
 #define TARGET_SO_SECURITY_AUTHENTICATION       19
@@ -61,9 +59,6 @@
 
 #define TARGET_SO_MARK          36
 
-#define TARGET_SO_TIMESTAMPING      37
-#define TARGET_SCM_TIMESTAMPING TARGET_SO_TIMESTAMPING
-
 #define TARGET_SO_RXQ_OVFL             40
 
 #define TARGET_SO_WIFI_STATUS       41
@@ -75,9 +70,17 @@
 
 #define TARGET_SO_TIMESTAMP_OLD        29
 #define TARGET_SCM_TIMESTAMP_OLD       TARGET_SO_TIMESTAMP_OLD
+#define TARGET_SO_TIMESTAMPNS_OLD      35
+#define TARGET_SCM_TIMESTAMPNS_OLD     TARGET_SO_TIMESTAMPNS_OLD
+#define TARGET_SO_TIMESTAMPING_OLD     37
+#define TARGET_SCM_TIMESTAMPING_OLD    TARGET_SO_TIMESTAMPING_OLD
 
 #define TARGET_SO_TIMESTAMP_NEW        63
 #define TARGET_SCM_TIMESTAMP_NEW       TARGET_SO_TIMESTAMP_NEW
+#define TARGET_SO_TIMESTAMPNS_NEW      64
+#define TARGET_SCM_TIMESTAMPNS_NEW     TARGET_SO_TIMESTAMPNS_NEW
+#define TARGET_SO_TIMESTAMPING_NEW     65
+#define TARGET_SCM_TIMESTAMPING_NEW    TARGET_SO_TIMESTAMPING_NEW
 
 /* TARGET_O_NONBLOCK clashes with the bits used for socket types.  Therefore we
  * have to define SOCK_NONBLOCK to a different value here.
diff --git a/linux-user/generic/sockbits.h b/linux-user/generic/sockbits.h
index 532cf2d3dc..a0496d8751 100644
--- a/linux-user/generic/sockbits.h
+++ b/linux-user/generic/sockbits.h
@@ -56,8 +56,16 @@
 
 #define TARGET_SO_TIMESTAMP_OLD        29
 #define TARGET_SCM_TIMESTAMP_OLD       TARGET_SO_TIMESTAMP_OLD
+#define TARGET_SO_TIMESTAMPNS_OLD      35
+#define TARGET_SCM_TIMESTAMPNS_OLD     TARGET_SO_TIMESTAMPNS_OLD
+#define TARGET_SO_TIMESTAMPING_OLD     37
+#define TARGET_SCM_TIMESTAMPING_OLD    TARGET_SO_TIMESTAMPING_OLD
 
 #define TARGET_SO_TIMESTAMP_NEW        63
 #define TARGET_SCM_TIMESTAMP_NEW       TARGET_SO_TIMESTAMP_NEW
+#define TARGET_SO_TIMESTAMPNS_NEW      64
+#define TARGET_SCM_TIMESTAMPNS_NEW     TARGET_SO_TIMESTAMPNS_NEW
+#define TARGET_SO_TIMESTAMPING_NEW     65
+#define TARGET_SCM_TIMESTAMPING_NEW    TARGET_SO_TIMESTAMPING_NEW
 
 #endif
diff --git a/linux-user/hppa/sockbits.h b/linux-user/hppa/sockbits.h
index 284a47e74e..d7e9aa340d 100644
--- a/linux-user/hppa/sockbits.h
+++ b/linux-user/hppa/sockbits.h
@@ -29,8 +29,6 @@
 #define TARGET_SO_BSDCOMPAT    0x400e
 #define TARGET_SO_PASSCRED     0x4010
 #define TARGET_SO_PEERCRED     0x4011
-#define TARGET_SO_TIMESTAMPNS  0x4013
-#define TARGET_SCM_TIMESTAMPNS TARGET_SO_TIMESTAMPNS
 
 #define TARGET_SO_SECURITY_AUTHENTICATION              0x4016
 #define TARGET_SO_SECURITY_ENCRYPTION_TRANSPORT        0x4017
@@ -44,8 +42,6 @@
 #define TARGET_SO_PEERSEC              0x401d
 #define TARGET_SO_PASSSEC              0x401e
 #define TARGET_SO_MARK                 0x401f
-#define TARGET_SO_TIMESTAMPING         0x4020
-#define TARGET_SCM_TIMESTAMPING        TARGET_SO_TIMESTAMPING
 #define TARGET_SO_RXQ_OVFL             0x4021
 #define TARGET_SO_WIFI_STATUS          0x4022
 #define TARGET_SCM_WIFI_STATUS         TARGET_SO_WIFI_STATUS
@@ -67,9 +63,17 @@
 
 #define TARGET_SO_TIMESTAMP_OLD        0x4012
 #define TARGET_SCM_TIMESTAMP_OLD       TARGET_SO_TIMESTAMP_OLD
+#define TARGET_SO_TIMESTAMPNS_OLD      0x4013
+#define TARGET_SCM_TIMESTAMPNS_OLD     TARGET_SO_TIMESTAMPNS_OLD
+#define TARGET_SO_TIMESTAMPING_OLD     0x4020
+#define TARGET_SCM_TIMESTAMPING_OLD    TARGET_SO_TIMESTAMPING_OLD
 
 #define TARGET_SO_TIMESTAMP_NEW        0x4038
 #define TARGET_SCM_TIMESTAMP_NEW       TARGET_SO_TIMESTAMP_NEW
+#define TARGET_SO_TIMESTAMPNS_NEW      0x4039
+#define TARGET_SCM_TIMESTAMPNS_NEW     TARGET_SO_TIMESTAMPNS_NEW
+#define TARGET_SO_TIMESTAMPING_NEW     0x403A
+#define TARGET_SCM_TIMESTAMPING_NEW    TARGET_SO_TIMESTAMPING_NEW
 
 /* TARGET_O_NONBLOCK clashes with the bits used for socket types.  Therefore we
  * have to define SOCK_NONBLOCK to a different value here.
diff --git a/linux-user/mips/sockbits.h b/linux-user/mips/sockbits.h
index b4c39d9588..49524e23ac 100644
--- a/linux-user/mips/sockbits.h
+++ b/linux-user/mips/sockbits.h
@@ -69,9 +69,17 @@
 
 #define TARGET_SO_TIMESTAMP_OLD        29
 #define TARGET_SCM_TIMESTAMP_OLD       TARGET_SO_TIMESTAMP_OLD
+#define TARGET_SO_TIMESTAMPNS_OLD      35
+#define TARGET_SCM_TIMESTAMPNS_OLD     TARGET_SO_TIMESTAMPNS_OLD
+#define TARGET_SO_TIMESTAMPING_OLD     37
+#define TARGET_SCM_TIMESTAMPING_OLD    TARGET_SO_TIMESTAMPING_OLD
 
 #define TARGET_SO_TIMESTAMP_NEW        63
 #define TARGET_SCM_TIMESTAMP_NEW       TARGET_SO_TIMESTAMP_NEW
+#define TARGET_SO_TIMESTAMPNS_NEW      64
+#define TARGET_SCM_TIMESTAMPNS_NEW     TARGET_SO_TIMESTAMPNS_NEW
+#define TARGET_SO_TIMESTAMPING_NEW     65
+#define TARGET_SCM_TIMESTAMPING_NEW    TARGET_SO_TIMESTAMPING_NEW
 
 /** sock_type - Socket types
  *
diff --git a/linux-user/sparc/sockbits.h b/linux-user/sparc/sockbits.h
index 07440efd14..c5fade3ad1 100644
--- a/linux-user/sparc/sockbits.h
+++ b/linux-user/sparc/sockbits.h
@@ -51,14 +51,9 @@
 
 #define TARGET_SO_PEERSEC              0x001e
 #define TARGET_SO_PASSSEC              0x001f
-#define TARGET_SO_TIMESTAMPNS          0x0021
-#define TARGET_SCM_TIMESTAMPNS         TARGET_SO_TIMESTAMPNS
 
 #define TARGET_SO_MARK                 0x0022
 
-#define TARGET_SO_TIMESTAMPING         0x0023
-#define TARGET_SCM_TIMESTAMPING        TARGET_SO_TIMESTAMPING
-
 #define TARGET_SO_RXQ_OVFL             0x0024
 
 #define TARGET_SO_WIFI_STATUS          0x0025
@@ -104,9 +99,17 @@
 
 #define TARGET_SO_TIMESTAMP_OLD        0x001d
 #define TARGET_SCM_TIMESTAMP_OLD       TARGET_SO_TIMESTAMP_OLD
+#define TARGET_SO_TIMESTAMPNS_OLD      0x0021
+#define TARGET_SCM_TIMESTAMPNS_OLD     TARGET_SO_TIMESTAMPNS_OLD
+#define TARGET_SO_TIMESTAMPING_OLD     0x0023
+#define TARGET_SCM_TIMESTAMPING_OLD    TARGET_SO_TIMESTAMPING_OLD
 
 #define TARGET_SO_TIMESTAMP_NEW        0x0046
 #define TARGET_SCM_TIMESTAMP_NEW       TARGET_SO_TIMESTAMP_NEW
+#define TARGET_SO_TIMESTAMPNS_NEW      0x0042
+#define TARGET_SCM_TIMESTAMPNS_NEW     TARGET_SO_TIMESTAMPNS_NEW
+#define TARGET_SO_TIMESTAMPING_NEW     0x0043
+#define TARGET_SCM_TIMESTAMPING_NEW    TARGET_SO_TIMESTAMPING_NEW
 
 /* Security levels - as per NRL IPv6 - don't actually do anything */
 #define TARGET_SO_SECURITY_AUTHENTICATION              0x5001
diff --git a/linux-user/strace.c b/linux-user/strace.c
index 369cfcd7bd..603723d6c8 100644
--- a/linux-user/strace.c
+++ b/linux-user/strace.c
@@ -2216,9 +2216,21 @@ print_optint:
         case TARGET_SO_TIMESTAMP_OLD:
             qemu_log("SO_TIMESTAMP_OLD,");
             goto print_optint;
+        case TARGET_SO_TIMESTAMPNS_OLD:
+            qemu_log("SO_TIMESTAMPNS_OLD,");
+            goto print_optint;
+        case TARGET_SO_TIMESTAMPING_OLD:
+            qemu_log("SO_TIMESTAMPING_OLD,");
+            goto print_optint;
         case TARGET_SO_TIMESTAMP_NEW:
             qemu_log("SO_TIMESTAMP_NEW,");
             goto print_optint;
+        case TARGET_SO_TIMESTAMPNS_NEW:
+            qemu_log("SO_TIMESTAMPNS_NEW,");
+            goto print_optint;
+        case TARGET_SO_TIMESTAMPING_NEW:
+            qemu_log("SO_TIMESTAMPING_NEW,");
+            goto print_optint;
         case TARGET_SO_RCVLOWAT:
             qemu_log("SO_RCVLOWAT,");
             goto print_optint;
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 7f95ccda7f..483f8ce48f 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -1764,6 +1764,28 @@ static inline abi_long host_to_target_cmsg(struct target_msghdr *target_msgh,
                     target_cmsg->cmsg_type = tswap32(TARGET_SCM_TIMESTAMP_OLD);
                 }
                 break;
+            case SCM_TIMESTAMPNS:
+                if (target_uses_timeval64) {
+                    tgt_len = sizeof(struct target__kernel_timespec);
+                    target_cmsg->cmsg_type =
+                        tswap32(TARGET_SCM_TIMESTAMPNS_NEW);
+                } else {
+                    tgt_len = sizeof(struct target_timespec);
+                    target_cmsg->cmsg_type =
+                        tswap32(TARGET_SCM_TIMESTAMPNS_OLD);
+                }
+                break;
+            case SCM_TIMESTAMPING:
+                if (target_uses_timeval64) {
+                    tgt_len = sizeof(struct target__kernel_timespec[3]);
+                    target_cmsg->cmsg_type =
+                        tswap32(TARGET_SCM_TIMESTAMPING_NEW);
+                } else {
+                    tgt_len = sizeof(struct target_timespec[3]);
+                    target_cmsg->cmsg_type =
+                        tswap32(TARGET_SCM_TIMESTAMPING_OLD);
+                }
+                break;
             default:
                 break;
             }
@@ -1824,6 +1846,67 @@ static inline abi_long host_to_target_cmsg(struct target_msghdr *target_msgh,
                 }
                 break;
             }
+            case SCM_TIMESTAMPNS:
+            {
+                struct timespec *ts = (struct timespec *)data;
+                if (len != sizeof(struct timespec)) {
+                    goto unimplemented;
+                }
+
+                if (target_uses_timeval64) {
+                    struct target__kernel_timespec *target_ts =
+                        (struct target__kernel_timespec *)target_data;
+                    if (tgt_len != sizeof(struct target__kernel_timespec)) {
+                        goto unimplemented;
+                    }
+
+                    __put_user(ts->tv_sec, &target_ts->tv_sec);
+                    __put_user(ts->tv_nsec, &target_ts->tv_nsec);
+                } else {
+                    struct target_timespec *target_ts =
+                        (struct target_timespec *)target_data;
+                    if (tgt_len != sizeof(struct target_timespec)) {
+                        goto unimplemented;
+                    }
+
+                    __put_user(ts->tv_sec, &target_ts->tv_sec);
+                    __put_user(ts->tv_nsec, &target_ts->tv_nsec);
+                }
+                break;
+            }
+            case SCM_TIMESTAMPING:
+            {
+                int i;
+                struct timespec *ts = (struct timespec *)data;
+                if (len != sizeof(struct timespec[3])) {
+                    goto unimplemented;
+                }
+
+                if (target_uses_timeval64) {
+                    struct target__kernel_timespec *target_ts =
+                        (struct target__kernel_timespec *)target_data;
+                    if (tgt_len != sizeof(struct target__kernel_timespec[3])) {
+                        goto unimplemented;
+                    }
+
+                    for (i = 0; i < 3; ++i) {
+                        __put_user(ts[i].tv_sec, &target_ts[i].tv_sec);
+                        __put_user(ts[i].tv_nsec, &target_ts[i].tv_nsec);
+                    }
+                } else {
+                    struct target_timespec *target_ts =
+                        (struct target_timespec *)target_data;
+                    if (tgt_len != sizeof(struct target_timespec[3])) {
+                        goto unimplemented;
+                    }
+
+                    for (i = 0; i < 3; ++i) {
+                        __put_user(ts[i].tv_sec, &target_ts[i].tv_sec);
+                        __put_user(ts[i].tv_nsec, &target_ts[i].tv_nsec);
+                    }
+                }
+                break;
+            }
             case SCM_CREDENTIALS:
             {
                 struct ucred *cred = (struct ucred *)data;
@@ -2362,6 +2445,18 @@ set_timeout:
                 timestamp_format_is_new = (optname == TARGET_SO_TIMESTAMP_NEW);
                 optname = SO_TIMESTAMP;
                 break;
+        case TARGET_SO_TIMESTAMPNS_OLD:
+        case TARGET_SO_TIMESTAMPNS_NEW:
+                timestamp_format_is_new =
+                    (optname == TARGET_SO_TIMESTAMPNS_NEW);
+                optname = SO_TIMESTAMPNS;
+                break;
+        case TARGET_SO_TIMESTAMPING_OLD:
+        case TARGET_SO_TIMESTAMPING_NEW:
+                timestamp_format_is_new =
+                    (optname == TARGET_SO_TIMESTAMPING_NEW);
+                optname = SO_TIMESTAMPING;
+                break;
         case TARGET_SO_RCVLOWAT:
 		optname = SO_RCVLOWAT;
 		break;
@@ -2374,7 +2469,9 @@ set_timeout:
 	if (get_user_u32(val, optval_addr))
             return -TARGET_EFAULT;
 	ret = get_errno(setsockopt(sockfd, SOL_SOCKET, optname, &val, sizeof(val)));
-        if (!is_error(ret) && optname == SO_TIMESTAMP) {
+        if (!is_error(ret) &&
+            (optname == SO_TIMESTAMP || optname == SO_TIMESTAMPNS ||
+             optname == SO_TIMESTAMPING)) {
             fd_trans_mark_socket_timestamp_new(sockfd, timestamp_format_is_new);
         }
         break;
@@ -2615,6 +2712,20 @@ get_timeout:
                  (optname == TARGET_SO_TIMESTAMP_NEW));
             optname = SO_TIMESTAMP;
             goto int_case;
+        case TARGET_SO_TIMESTAMPNS_OLD:
+        case TARGET_SO_TIMESTAMPNS_NEW:
+            timestamp_format_matches =
+                (fd_trans_socket_timestamp_new(sockfd) ==
+                 (optname == TARGET_SO_TIMESTAMPNS_NEW));
+            optname = SO_TIMESTAMPNS;
+            goto int_case;
+        case TARGET_SO_TIMESTAMPING_OLD:
+        case TARGET_SO_TIMESTAMPING_NEW:
+            timestamp_format_matches =
+                (fd_trans_socket_timestamp_new(sockfd) ==
+                 (optname == TARGET_SO_TIMESTAMPING_NEW));
+            optname = SO_TIMESTAMPING;
+            goto int_case;
         case TARGET_SO_RCVLOWAT:
             optname = SO_RCVLOWAT;
             goto int_case;
@@ -2639,9 +2750,9 @@ get_timeout:
             return ret;
         if (optname == SO_TYPE) {
             val = host_to_target_sock_type(val);
-        }
-        if (optname == SO_TIMESTAMP) {
-            val = val && timestamp_format_matches;
+        } else if ((optname == SO_TIMESTAMP || optname == SO_TIMESTAMPNS ||
+                    optname == SO_TIMESTAMPING) && !timestamp_format_matches) {
+            val = 0;
         }
         if (len > lv)
             len = lv;
diff --git a/tests/tcg/multiarch/socket_timestamp.c b/tests/tcg/multiarch/socket_timestamp.c
index fd2833e5c8..fe6059226b 100644
--- a/tests/tcg/multiarch/socket_timestamp.c
+++ b/tests/tcg/multiarch/socket_timestamp.c
@@ -1,5 +1,6 @@
 #include <assert.h>
 #include <errno.h>
+#include <linux/net_tstamp.h>
 #include <linux/types.h>
 #include <netinet/in.h>
 #include <stdint.h>
@@ -11,6 +12,7 @@
 #include <sys/time.h>
 #include <sys/types.h>
 #include <sys/wait.h>
+#include <time.h>
 #include <unistd.h>
 
 #ifdef __kernel_old_timeval
@@ -29,6 +31,38 @@ struct kernel_sock_timeval
     int64_t tv_usec;
 };
 
+struct kernel_old_timespec
+{
+    __kernel_long_t tv_sec;
+    long            tv_nsec;
+};
+
+struct kernel_timespec
+{
+    int64_t   tv_sec;
+    long long tv_nsec;
+};
+
+struct scm_timestamping
+{
+    struct timespec ts[3];
+};
+
+struct scm_old_timestamping
+{
+    struct kernel_old_timespec ts[3];
+};
+
+struct scm_timestamping64
+{
+    struct kernel_timespec ts[3];
+};
+
+const int so_timestamping_flags =
+    SOF_TIMESTAMPING_RX_HARDWARE |
+    SOF_TIMESTAMPING_RX_SOFTWARE |
+    SOF_TIMESTAMPING_SOFTWARE;
+
 int create_udp_socket(struct sockaddr_in *sockaddr)
 {
     socklen_t sockaddr_len;
@@ -62,43 +96,47 @@ int create_udp_socket(struct sockaddr_in *sockaddr)
 /* Checks that the timestamp in the message is not after the reception timestamp
  * as well as the reception time is within 10 seconds of the message time.
  */
-void check_timestamp_difference(const struct timeval *msg_tv,
-                                const struct timeval *pkt_tv)
+void check_timestamp_difference(const struct timespec *msg_ts,
+                                const struct timespec *pkt_ts)
 {
-    if (pkt_tv->tv_sec < msg_tv->tv_sec ||
-        (pkt_tv->tv_sec == msg_tv->tv_sec && pkt_tv->tv_usec < msg_tv->tv_usec))
+    if (pkt_ts->tv_sec < msg_ts->tv_sec ||
+        (pkt_ts->tv_sec == msg_ts->tv_sec && pkt_ts->tv_nsec < msg_ts->tv_nsec))
     {
         fprintf(stderr,
-                "Packet received before sent: %lld.%06lld < %lld.%06lld\n",
-                (long long)pkt_tv->tv_sec, (long long)pkt_tv->tv_usec,
-                (long long)msg_tv->tv_sec, (long long)msg_tv->tv_usec);
+                "Packet received before sent: %lld.%06lld < %lld.%09lld\n",
+                (long long)pkt_ts->tv_sec, (long long)pkt_ts->tv_nsec,
+                (long long)msg_ts->tv_sec, (long long)msg_ts->tv_nsec);
         exit(-1);
     }
 
-    if (pkt_tv->tv_sec > msg_tv->tv_sec + 10 ||
-        (pkt_tv->tv_sec == msg_tv->tv_sec + 10 &&
-         pkt_tv->tv_usec > msg_tv->tv_usec)) {
+    if (pkt_ts->tv_sec > msg_ts->tv_sec + 10 ||
+        (pkt_ts->tv_sec == msg_ts->tv_sec + 10 &&
+         pkt_ts->tv_nsec > msg_ts->tv_nsec)) {
         fprintf(stderr,
                 "Packet received more than 10 seconds after sent: "
-                "%lld.%06lld > %lld.%06lld + 10\n",
-                (long long)pkt_tv->tv_sec, (long long)pkt_tv->tv_usec,
-                (long long)msg_tv->tv_sec, (long long)msg_tv->tv_usec);
+                "%lld.%06lld > %lld.%09lld + 10\n",
+                (long long)pkt_ts->tv_sec, (long long)pkt_ts->tv_nsec,
+                (long long)msg_ts->tv_sec, (long long)msg_ts->tv_nsec);
         exit(-1);
     }
 }
 
 void send_current_time(int sock, struct sockaddr_in server_sockaddr)
 {
-    struct timeval tv = {0, 0};
-    gettimeofday(&tv, NULL);
-    sendto(sock, &tv, sizeof(tv), 0, (struct sockaddr *)&server_sockaddr,
+    struct timespec ts = {0, 0};
+    clock_gettime(CLOCK_REALTIME, &ts);
+#ifdef MSG_CONFIRM
+    const int flags = MSG_CONFIRM;
+#else
+    const int flags = 0;
+#endif
+    sendto(sock, &ts, sizeof(ts), flags, (struct sockaddr *)&server_sockaddr,
            sizeof(server_sockaddr));
 }
 
-typedef void (*get_timeval_t)(const struct cmsghdr *cmsg, struct timeval *tv);
-
+typedef void (*get_timespec_t)(const struct cmsghdr *cmsg, struct timespec *tv);
 
-void receive_packet(int sock, get_timeval_t get_timeval)
+void receive_packet(int sock, get_timespec_t get_timespec)
 {
     struct msghdr msg = {0};
 
@@ -106,12 +144,14 @@ void receive_packet(int sock, get_timeval_t get_timeval)
     struct iovec iov;
 
     union {
-        /* 128 bytes should cover all imaginable timeval/timespec*/
+        /* 128 bytes are enough for all existing
+         * timeval/timespec/scm_timestamping structures.
+         */
         char cmsg_buf[CMSG_SPACE(128)];
         struct cmsghdr align;
     } u;
     struct cmsghdr *cmsg;
-    struct timeval msg_tv, pkt_tv;
+    struct timespec msg_ts, pkt_ts;
 
     int res;
 
@@ -126,160 +166,370 @@ void receive_packet(int sock, get_timeval_t get_timeval)
     if ((res = recvmsg(sock, &msg, 0)) < 0) {
         int err = errno;
         fprintf(stderr, "Failed to receive packet: %s\n", strerror(err));
-        exit(-err);
+        exit(err);
     }
 
     assert(res == sizeof(struct timeval));
     assert(iov.iov_base == iobuf);
-    memcpy(&msg_tv, iov.iov_base, sizeof(msg_tv));
-    printf("Message timestamp: %lld.%06lld\n",
-           (long long)msg_tv.tv_sec, (long long)msg_tv.tv_usec);
+    memcpy(&msg_ts, iov.iov_base, sizeof(msg_ts));
+    printf("Message timestamp: %lld.%09lld\n",
+           (long long)msg_ts.tv_sec, (long long)msg_ts.tv_nsec);
 
     cmsg = CMSG_FIRSTHDR(&msg);
     assert(cmsg);
-    (*get_timeval)(cmsg, &pkt_tv);
-    printf("Packet timestamp: %lld.%06lld\n",
-           (long long)pkt_tv.tv_sec, (long long)pkt_tv.tv_usec);
+    (*get_timespec)(cmsg, &pkt_ts);
+    printf("Packet timestamp: %lld.%09lld\n",
+           (long long)pkt_ts.tv_sec, (long long)pkt_ts.tv_nsec);
 
-    check_timestamp_difference(&msg_tv, &pkt_tv);
+    check_timestamp_difference(&msg_ts, &pkt_ts);
 }
 
-void get_timeval_from_so_timestamp(const struct cmsghdr *cmsg,
-                                   struct timeval *tv)
+void get_timespec_from_so_timestamp(const struct cmsghdr *cmsg,
+                                    struct timespec *ts)
 {
+    struct timeval tv;
     assert(cmsg->cmsg_level == SOL_SOCKET);
     assert(cmsg->cmsg_type == SCM_TIMESTAMP);
-    assert(cmsg->cmsg_len >= CMSG_LEN(sizeof(struct timeval)));
-    memcpy(tv, CMSG_DATA(cmsg), sizeof(*tv));
+    assert(cmsg->cmsg_len == CMSG_LEN(sizeof(tv)));
+
+    memcpy(&tv, CMSG_DATA(cmsg), sizeof(tv));
+    ts->tv_sec = tv.tv_sec;
+    ts->tv_nsec = tv.tv_usec * 1000LL;
 }
 
 #ifdef SO_TIMESTAMP_OLD
-void get_timeval_from_so_timestamp_old(const struct cmsghdr *cmsg,
-                                       struct timeval *tv)
+void get_timespec_from_so_timestamp_old(const struct cmsghdr *cmsg,
+                                        struct timespec *ts)
 {
     struct kernel_old_timeval old_tv;
     assert(cmsg->cmsg_level == SOL_SOCKET);
     assert(cmsg->cmsg_type == SO_TIMESTAMP_OLD);
-    assert(cmsg->cmsg_len >= CMSG_LEN(sizeof(old_tv)));
+    assert(cmsg->cmsg_len == CMSG_LEN(sizeof(old_tv)));
 
     memcpy(&old_tv, CMSG_DATA(cmsg), sizeof(old_tv));
-    tv->tv_sec = old_tv.tv_sec;
-    tv->tv_usec = old_tv.tv_usec;
+    ts->tv_sec = old_tv.tv_sec;
+    ts->tv_nsec = old_tv.tv_usec * 1000LL;
 }
 
 #ifdef SO_TIMESTAMP_NEW
-void get_timeval_from_so_timestamp_new(const struct cmsghdr *cmsg,
-                                       struct timeval *tv)
+void get_timespec_from_so_timestamp_new(const struct cmsghdr *cmsg,
+                                        struct timespec *ts)
 {
     struct kernel_sock_timeval sock_tv;
     assert(cmsg->cmsg_level == SOL_SOCKET);
     assert(cmsg->cmsg_type == SO_TIMESTAMP_NEW);
-    assert(cmsg->cmsg_len >= CMSG_LEN(sizeof(sock_tv)));
+    assert(cmsg->cmsg_len == CMSG_LEN(sizeof(sock_tv)));
 
     memcpy(&sock_tv, CMSG_DATA(cmsg), sizeof(sock_tv));
-    tv->tv_sec = sock_tv.tv_sec;
-    tv->tv_usec = sock_tv.tv_usec;
+    ts->tv_sec = sock_tv.tv_sec;
+    ts->tv_nsec = sock_tv.tv_usec * 1000LL;
 }
 #endif /* defined(SO_TIMESTAMP_NEW) */
 #endif /* defined(SO_TIMESTAMP_OLD) */
 
-void set_socket_option(int sock, int sockopt, int on)
+void get_timespec_from_so_timestampns(const struct cmsghdr *cmsg,
+                                      struct timespec *ts)
+{
+    assert(cmsg->cmsg_level == SOL_SOCKET);
+    assert(cmsg->cmsg_type == SCM_TIMESTAMPNS);
+    assert(cmsg->cmsg_len == CMSG_LEN(sizeof(*ts)));
+
+    memcpy(ts, CMSG_DATA(cmsg), sizeof(*ts));
+}
+
+#ifdef SO_TIMESTAMPNS_OLD
+void get_timespec_from_so_timestampns_old(const struct cmsghdr *cmsg,
+                                          struct timespec *ts)
+{
+    struct kernel_old_timespec old_ts;
+    assert(cmsg->cmsg_level == SOL_SOCKET);
+    assert(cmsg->cmsg_type == SO_TIMESTAMPNS_OLD);
+    assert(cmsg->cmsg_len == CMSG_LEN(sizeof(old_ts)));
+
+    memcpy(&old_ts, CMSG_DATA(cmsg), sizeof(old_ts));
+    ts->tv_sec = old_ts.tv_sec;
+    ts->tv_nsec = old_ts.tv_nsec;
+}
+
+#ifdef SO_TIMESTAMPNS_NEW
+void get_timespec_from_so_timestampns_new(const struct cmsghdr *cmsg,
+                                          struct timespec *ts)
+{
+    struct kernel_timespec sock_ts;
+    assert(cmsg->cmsg_level == SOL_SOCKET);
+    assert(cmsg->cmsg_type == SO_TIMESTAMPNS_NEW);
+    assert(cmsg->cmsg_len == CMSG_LEN(sizeof(sock_ts)));
+
+    memcpy(&sock_ts, CMSG_DATA(cmsg), sizeof(sock_ts));
+    ts->tv_sec = sock_ts.tv_sec;
+    ts->tv_nsec = sock_ts.tv_nsec;
+}
+#endif /* defined(SO_TIMESTAMPNS_NEW) */
+#endif /* defined(SO_TIMESTAMPNS_OLD) */
+
+void get_timespec_from_so_timestamping(const struct cmsghdr *cmsg,
+                                       struct timespec *ts)
+{
+    int i;
+    struct scm_timestamping tss;
+    assert(cmsg->cmsg_level == SOL_SOCKET);
+    assert(cmsg->cmsg_type == SCM_TIMESTAMPING);
+    assert(cmsg->cmsg_len == CMSG_LEN(sizeof(tss)));
+
+    memcpy(&tss, CMSG_DATA(cmsg), sizeof(tss));
+
+    for (i = 0; i < 3; ++i) {
+        if (tss.ts[i].tv_sec || tss.ts[i].tv_nsec) {
+            *ts = tss.ts[i];
+            return;
+        }
+    }
+    assert(!"All three entries in scm_timestamping are empty");
+}
+
+#ifdef SO_TIMESTAMPING_OLD
+void get_timespec_from_so_timestamping_old(const struct cmsghdr *cmsg,
+                                           struct timespec *ts)
+{
+    int i;
+    struct scm_old_timestamping tss;
+    assert(cmsg->cmsg_level == SOL_SOCKET);
+    assert(cmsg->cmsg_type == SO_TIMESTAMPING_OLD);
+    assert(cmsg->cmsg_len == CMSG_LEN(sizeof(tss)));
+
+    memcpy(&tss, CMSG_DATA(cmsg), sizeof(tss));
+
+    for (i = 0; i < 3; ++i) {
+        if (tss.ts[i].tv_sec || tss.ts[i].tv_nsec) {
+            ts->tv_sec = tss.ts[i].tv_sec;
+            ts->tv_nsec = tss.ts[i].tv_nsec;
+            return;
+        }
+    }
+    assert(!"All three entries in scm_old_timestamping are empty");
+}
+
+#ifdef SO_TIMESTAMPING_NEW
+void get_timespec_from_so_timestamping_new(const struct cmsghdr *cmsg,
+                                           struct timespec *ts)
+{
+    int i;
+    struct scm_timestamping64 tss;
+    assert(cmsg->cmsg_level == SOL_SOCKET);
+    assert(cmsg->cmsg_type == SO_TIMESTAMPING_NEW);
+    assert(cmsg->cmsg_len == CMSG_LEN(sizeof(tss)));
+
+    memcpy(&tss, CMSG_DATA(cmsg), sizeof(tss));
+    for (i = 0; i < 3; ++i) {
+        if (tss.ts[i].tv_sec || tss.ts[i].tv_nsec) {
+            ts->tv_sec = tss.ts[i].tv_sec;
+            ts->tv_nsec = tss.ts[i].tv_nsec;
+            return;
+        }
+    }
+    assert(!"All three entries in scm_timestamp64 are empty");
+}
+#endif /* defined(SO_TIMESTAMPING_NEW) */
+#endif /* defined(SO_TIMESTAMPING_OLD) */
+
+void set_socket_option(int sock, int sockopt, int set_to)
 {
     socklen_t len;
-    int val = on;
+    int val = set_to;
     if (setsockopt(sock, SOL_SOCKET, sockopt, &val, sizeof(val)) < 0) {
         int err = errno;
-        fprintf(stderr, "Failed to setsockopt %d (%s): %s\n",
-                sockopt, on ? "on" : "off", strerror(err));
+        fprintf(stderr, "Failed at setsockopt(%d, SOL_SOCKET, %d, %d): %s\n",
+                sock, sockopt, set_to, strerror(err));
         exit(err);
     }
 
+#ifdef SO_TIMESTAMPING_NEW
+    if (sockopt == SO_TIMESTAMPING_NEW) {
+        /* `getsockopt(sock, SOL_SOCKET, SO_TIMESTAMPING_NEW)` not implemented
+         * as of linux kernel v5.8-rc4.
+         */
+        return;
+    }
+#endif
+
     len = sizeof(val);
     val = -1;
     if (getsockopt(sock, SOL_SOCKET, sockopt, &val, &len) < 0) {
         int err = errno;
-        fprintf(stderr, "Failed to getsockopt (%d): %s\n", sock, strerror(err));
+        fprintf(stderr, "Failed at getsockopt(%d, SOL_SOCKET, %d): %s\n",
+                sock, sockopt, strerror(err));
         exit(err);
     }
     assert(len == sizeof(val));
-    assert(val == on);
+    assert(val == set_to);
+}
+
+void child_steps(int sock, struct sockaddr_in addr, int run_old)
+{
+    /* Test 1: SO_TIMESTAMP */
+    send_current_time(sock, addr);
+
+    /* Test 2: SO_TIMESTAMPNS */
+    printf("Test 2: SO_TIMESTAMPNS\n");
+    set_socket_option(sock, SO_TIMESTAMPNS, 1);
+    receive_packet(sock, &get_timespec_from_so_timestampns);
+    set_socket_option(sock, SO_TIMESTAMPNS, 0);
+
+    /* Test 3: SO_TIMESTAMPING */
+    send_current_time(sock, addr);
+
+    if (!run_old) {
+        return;
+    }
+
+#ifdef SO_TIMESTAMP_OLD
+    if (SO_TIMESTAMP_OLD != SO_TIMESTAMP) {
+        /* Test 4a: SO_TIMESTAMP_OLD */
+        printf("Test 4a: SO_TIMESTAMP_OLD\n");
+        set_socket_option(sock, SO_TIMESTAMP_OLD, 1);
+        receive_packet(sock, &get_timespec_from_so_timestamp_old);
+        set_socket_option(sock, SO_TIMESTAMP_OLD, 0);
+    }
+#ifdef SO_TIMESTAMP_NEW
+    else {
+        /* Test 4b: SO_TIMESTAMP_NEW */
+        printf("Test 4b: SO_TIMESTAMP_NEW\n");
+        set_socket_option(sock, SO_TIMESTAMP_NEW, 1);
+        receive_packet(sock, &get_timespec_from_so_timestamp_new);
+        set_socket_option(sock, SO_TIMESTAMP_NEW, 0);
+    }
+#endif /* defined(SO_TIMESTAMP_NEW) */
+#endif /* defined(SO_TIMESTAMP_OLD) */
+
+#ifdef SO_TIMESTAMPNS_OLD
+    if (SO_TIMESTAMPNS_OLD != SO_TIMESTAMPNS) {
+        /* Test 5a: SO_TIMESTAMPNS_OLD */
+        send_current_time(sock, addr);
+    }
+#ifdef SO_TIMESTAMPNS_NEW
+    else {
+        /* Test 5b: SO_TIMESTAMPNS_NEW */
+        send_current_time(sock, addr);
+    }
+#endif /* defined(SO_TIMESTAMPNS_NEW) */
+#endif /* defined(SO_TIMESTAMPNS_OLD) */
+
+#ifdef SO_TIMESTAMPING_OLD
+    if (SO_TIMESTAMPING_OLD != SO_TIMESTAMPING) {
+        /* Test 6a: SO_TIMESTAMPING_OLD */
+        printf("Test 6a: SO_TIMESTAMPING_OLD\n");
+        set_socket_option(sock, SO_TIMESTAMPING_OLD, so_timestamping_flags);
+        receive_packet(sock, &get_timespec_from_so_timestamping_old);
+        set_socket_option(sock, SO_TIMESTAMPING_OLD, 0);
+    }
+#ifdef SO_TIMESTAMPING_NEW
+    else {
+        /* Test 6b: SO_TIMESTAMPING_NEW */
+        printf("Test 6b: SO_TIMESTAMPING_NEW\n");
+        set_socket_option(sock, SO_TIMESTAMPING_NEW, so_timestamping_flags);
+        receive_packet(sock, &get_timespec_from_so_timestamping_new);
+        set_socket_option(sock, SO_TIMESTAMPING_NEW, 0);
+    }
+#endif /* defined(SO_TIMESTAMPING_NEW) */
+#endif /* defined(SO_TIMESTAMPING_OLD) */
+}
+
+void parent_steps(int sock, struct sockaddr_in addr, int run_old)
+{
+    /* Test 1: SO_TIMESTAMP */
+    printf("Test 1: SO_TIMESTAMP\n");
+    set_socket_option(sock, SO_TIMESTAMP, 1);
+    receive_packet(sock, &get_timespec_from_so_timestamp);
+    set_socket_option(sock, SO_TIMESTAMP, 0);
+
+    /* Test 2: SO_TIMESTAMPNS */
+    send_current_time(sock, addr);
+
+    /* Test 3: SO_TIMESTAMPING */
+    printf("Test 3: SO_TIMESTAMPING\n");
+    set_socket_option(sock, SO_TIMESTAMPING, so_timestamping_flags);
+    receive_packet(sock, &get_timespec_from_so_timestamping);
+    set_socket_option(sock, SO_TIMESTAMPING, 0);
+
+    if (!run_old) {
+        return;
+    }
+
+#ifdef SO_TIMESTAMP_OLD
+    if (SO_TIMESTAMP_OLD != SO_TIMESTAMP) {
+        /* Test 4a: SO_TIMESTAMP_OLD */
+        send_current_time(sock, addr);
+    }
+#ifdef SO_TIMESTAMP_NEW
+    else {
+        /* Test 4b: SO_TIMESTAMP_NEW */
+        send_current_time(sock, addr);
+    }
+#endif /* defined(SO_TIMESTAMP_NEW) */
+#endif /* defined(SO_TIMESTAMP_OLD) */
+
+#ifdef SO_TIMESTAMPNS_OLD
+    if (SO_TIMESTAMPNS_OLD != SO_TIMESTAMPNS) {
+        /* Test 5a: SO_TIMESTAMPNS_OLD */
+        printf("Test 5a: SO_TIMESTAMPNS_OLD\n");
+        set_socket_option(sock, SO_TIMESTAMPNS_OLD, 1);
+        receive_packet(sock, &get_timespec_from_so_timestampns_old);
+        set_socket_option(sock, SO_TIMESTAMPNS_OLD, 0);
+    }
+#ifdef SO_TIMESTAMPNS_NEW
+    else {
+        /* Test 5b: SO_TIMESTAMPNS_NEW */
+        printf("Test 5b: SO_TIMESTAMPNS_NEW\n");
+        set_socket_option(sock, SO_TIMESTAMPNS_NEW, 1);
+        receive_packet(sock, &get_timespec_from_so_timestampns_new);
+        set_socket_option(sock, SO_TIMESTAMPNS_NEW, 0);
+    }
+#endif /* defined(SO_TIMESTAMPNS_NEW) */
+#endif /* defined(SO_TIMESTAMPNS_OLD) */
+
+#ifdef SO_TIMESTAMPING_OLD
+    if (SO_TIMESTAMPING_OLD != SO_TIMESTAMPING) {
+        /* Test 6a: SO_TIMESTAMPING_OLD */
+        send_current_time(sock, addr);
+    }
+#ifdef SO_TIMESTAMPING_NEW
+    else {
+        /* Test 6b: SO_TIMESTAMPING_NEW */
+        send_current_time(sock, addr);
+    }
+#endif /* defined(SO_TIMESTAMPING_NEW) */
+#endif /* defined(SO_TIMESTAMPING_OLD) */
 }
 
 int main(int argc, char **argv)
 {
     int parent_sock, child_sock;
     struct sockaddr_in parent_sockaddr, child_sockaddr;
-    int pid;
+    int pid, run_old;
     struct timeval tv = {0, 0};
     gettimeofday(&tv, NULL);
 
+    /* Too close to y2038 old systems may not work. */
+    run_old = tv.tv_sec < 0x7fffff00;
+
     parent_sock = create_udp_socket(&parent_sockaddr);
     child_sock = create_udp_socket(&child_sockaddr);
 
     printf("Parent sock bound to port %d\nChild sock bound to port %d\n",
            parent_sockaddr.sin_port, child_sockaddr.sin_port);
 
-    if ((pid = fork()) == -2) {
+    if ((pid = fork()) < 0) {
         fprintf(stderr, "SKIPPED. Failed to fork: %s\n", strerror(errno));
     } else if (pid == 0) {
-        close(child_sock);
-
-        /* Test 1: SO_TIMESTAMP */
-        send_current_time(parent_sock, child_sockaddr);
-
-        if (tv.tv_sec > 0x7fffff00) {
-            /* Too close to y2038 problem, old system may not work. */
-            close(parent_sock);
-            return 0;
-        }
-
-#ifdef SO_TIMESTAMP_OLD
-        if (SO_TIMESTAMP_OLD != SO_TIMESTAMP) {
-            /* Test 2a: SO_TIMESTAMP_OLD */
-            set_socket_option(parent_sock, SO_TIMESTAMP_OLD, 1);
-            receive_packet(parent_sock, &get_timeval_from_so_timestamp_old);
-            set_socket_option(parent_sock, SO_TIMESTAMP_OLD, 0);
-        }
-#ifdef SO_TIMESTAMP_NEW
-        else {
-            /* Test 2b: SO_TIMESTAMP_NEW */
-            set_socket_option(parent_sock, SO_TIMESTAMP_NEW, 1);
-            receive_packet(parent_sock, &get_timeval_from_so_timestamp_new);
-            set_socket_option(parent_sock, SO_TIMESTAMP_NEW, 0);
-        }
-#endif /* defined(SO_TIMESTAMP_NEW) */
-#endif /* defined(SO_TIMESTAMP_OLD) */
-
         close(parent_sock);
+        child_steps(child_sock, parent_sockaddr, run_old);
+        close(child_sock);
     } else {
         int child_status;
-        close(parent_sock);
-
-        /* Test 1: SO_TIMESTAMP */
-        set_socket_option(child_sock, SO_TIMESTAMP, 1);
-        receive_packet(child_sock, &get_timeval_from_so_timestamp);
-        set_socket_option(child_sock, SO_TIMESTAMP, 0);
-
-        if (tv.tv_sec > 0x7fffff00) {
-            /* Too close to y2038 problem, old system may not work. */
-            close(child_sock);
-            return 0;
-        }
-
-#ifdef SO_TIMESTAMP_OLD
-        if (SO_TIMESTAMP_OLD != SO_TIMESTAMP) {
-            /* Test 2a: SO_TIMESTAMP_OLD */
-            send_current_time(child_sock, parent_sockaddr);
-        }
-#ifdef SO_TIMESTAMP_NEW
-        else {
-            /* Test 2b: SO_TIMESTAMP_NEW */
-            send_current_time(child_sock, parent_sockaddr);
-        }
-#endif /* defined(SO_TIMESTAMP_NEW) */
-#endif /* defined(SO_TIMESTAMP_OLD) */
 
         close(child_sock);
+        parent_steps(parent_sock, child_sockaddr, run_old);
+        close(parent_sock);
 
         if (waitpid(pid, &child_status, 0) < 0) {
             int err = errno;
-- 
2.28.0.rc0.105.gf9edc3c819-goog



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 5/6] thunk: supports flexible arrays
  2020-07-23  0:19 [PATCH 0/6] fcntl, sockopt, and ioctl options Shu-Chun Weng
                   ` (3 preceding siblings ...)
  2020-07-23  0:19 ` [PATCH 4/6] linux-user: setsockopt() SO_TIMESTAMPNS and SO_TIMESTAMPING Shu-Chun Weng
@ 2020-07-23  0:19 ` Shu-Chun Weng
  2020-07-23  0:19 ` [PATCH 6/6] linux-user: Add support for SIOCETHTOOL ioctl Shu-Chun Weng
  2020-08-05 23:22 ` [PATCH 0/6] fcntl, sockopt, and ioctl options Shu-Chun Weng
  6 siblings, 0 replies; 11+ messages in thread
From: Shu-Chun Weng @ 2020-07-23  0:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: Shu-Chun Weng, laurent, riku.voipio

Flexible arrays may appear in the last field of a struct and are heavily
used in the ioctl(SIOCETHTOOL) system call on Linux. E.g.

  struct ethtool_regs {
      __u32   cmd;
      __u32   version; /* driver-specific, indicates different chips/revs */
      __u32   len; /* bytes */
      __u8    data[0];
  };

where number of elements in `data` is specified in `len`. It is translated
into:

  STRUCT(ethtool_regs,
         TYPE_INT, /* cmd */
         TYPE_INT, /* version */
         TYPE_INT, /* len */
         MK_FLEXIBLE_ARRAY(TYPE_CHAR, 2)) /* data[0]: len */

where the "2" passed to `MK_FLEXIBLE_ARRAY` means the number of element
is specified by field number 2 (0-index).

Signed-off-by: Shu-Chun Weng <scw@google.com>
---
 include/exec/user/thunk.h |  20 +++++
 thunk.c                   | 151 +++++++++++++++++++++++++++++++++++++-
 2 files changed, 169 insertions(+), 2 deletions(-)

diff --git a/include/exec/user/thunk.h b/include/exec/user/thunk.h
index 7992475c9f..080d84e806 100644
--- a/include/exec/user/thunk.h
+++ b/include/exec/user/thunk.h
@@ -39,12 +39,19 @@ typedef enum argtype {
     TYPE_ARRAY,
     TYPE_STRUCT,
     TYPE_OLDDEVT,
+    TYPE_FLEXIBLE_ARRAY,
 } argtype;
 
 #define MK_PTR(type) TYPE_PTR, type
 #define MK_ARRAY(type, size) TYPE_ARRAY, size, type
 #define MK_STRUCT(id) TYPE_STRUCT, id
 
+/* Should only appear as the last element of a TYPE_STRUCT. `len_field_idx` is
+ * the index into the fields in the enclosing struct that specify the length of
+ * the flexibly array. The length field MUST be a TYPE_INT field. */
+#define MK_FLEXIBLE_ARRAY(type, len_field_idx) \
+    TYPE_FLEXIBLE_ARRAY, len_field_idx, type
+
 #define THUNK_TARGET 0
 #define THUNK_HOST   1
 
@@ -55,6 +62,8 @@ typedef struct {
     int *field_offsets[2];
     /* special handling */
     void (*convert[2])(void *dst, const void *src);
+    int (*thunk_size[2])(const void *src);
+
     int size[2];
     int align[2];
     const char *name;
@@ -75,6 +84,11 @@ const argtype *thunk_convert(void *dst, const void *src,
                              const argtype *type_ptr, int to_host);
 const argtype *thunk_print(void *arg, const argtype *type_ptr);
 
+bool thunk_type_has_flexible_array(const argtype *type_ptr);
+/* thunk_type_size but can handle TYPE_FLEXIBLE_ARRAY */
+int thunk_type_size_with_src(const void *src, const argtype *type_ptr,
+                             int is_host);
+
 extern StructEntry *struct_entries;
 
 int thunk_type_size_array(const argtype *type_ptr, int is_host);
@@ -137,6 +151,10 @@ static inline int thunk_type_size(const argtype *type_ptr, int is_host)
     case TYPE_STRUCT:
         se = struct_entries + type_ptr[1];
         return se->size[is_host];
+    case TYPE_FLEXIBLE_ARRAY:
+        /* Flexible arrays do not count toward sizeof(). Users of structures
+         * containing them need to calculate it themselves. */
+        return 0;
     default:
         g_assert_not_reached();
     }
@@ -187,6 +205,8 @@ static inline int thunk_type_align(const argtype *type_ptr, int is_host)
     case TYPE_STRUCT:
         se = struct_entries + type_ptr[1];
         return se->align[is_host];
+    case TYPE_FLEXIBLE_ARRAY:
+        return thunk_type_align_array(type_ptr + 2, is_host);
     default:
         g_assert_not_reached();
     }
diff --git a/thunk.c b/thunk.c
index c5d9719747..7b89332712 100644
--- a/thunk.c
+++ b/thunk.c
@@ -50,6 +50,8 @@ static inline const argtype *thunk_type_next(const argtype *type_ptr)
         return thunk_type_next_ptr(type_ptr + 1);
     case TYPE_STRUCT:
         return type_ptr + 1;
+    case TYPE_FLEXIBLE_ARRAY:
+        return thunk_type_next_ptr(type_ptr + 1);
     default:
         return NULL;
     }
@@ -122,6 +124,34 @@ void thunk_register_struct_direct(int id, const char *name,
     se->name = name;
 }
 
+static const argtype *
+thunk_convert_flexible_array(void *dst, const void *src,
+                             const uint8_t *dst_struct,
+                             const uint8_t *src_struct, const argtype *type_ptr,
+                             const StructEntry *se, int to_host) {
+    int len_field_idx, dst_size, src_size, i;
+    uint32_t array_length;
+    uint8_t *d;
+    const uint8_t *s;
+
+    assert(*type_ptr == TYPE_FLEXIBLE_ARRAY);
+    type_ptr++;
+    len_field_idx = *type_ptr++;
+    array_length =
+        *(const uint32_t *)(to_host ?
+                            dst_struct + se->field_offsets[1][len_field_idx] :
+                            src_struct + se->field_offsets[0][len_field_idx]);
+    dst_size = thunk_type_size(type_ptr, to_host);
+    src_size = thunk_type_size(type_ptr, to_host);
+    d = dst;
+    s = src;
+    for (i = 0; i < array_length; i++) {
+        thunk_convert(d, s, type_ptr, to_host);
+        d += dst_size;
+        s += src_size;
+    }
+    return thunk_type_next(type_ptr);
+}
 
 /* now we can define the main conversion functions */
 const argtype *thunk_convert(void *dst, const void *src,
@@ -246,7 +276,7 @@ const argtype *thunk_convert(void *dst, const void *src,
 
             assert(*type_ptr < max_struct_entries);
             se = struct_entries + *type_ptr++;
-            if (se->convert[0] != NULL) {
+            if (se->convert[to_host] != NULL) {
                 /* specific conversion is needed */
                 (*se->convert[to_host])(dst, src);
             } else {
@@ -256,7 +286,18 @@ const argtype *thunk_convert(void *dst, const void *src,
                 src_offsets = se->field_offsets[1 - to_host];
                 d = dst;
                 s = src;
-                for(i = 0;i < se->nb_fields; i++) {
+                for(i = 0; i < se->nb_fields; i++) {
+                    if (*field_types == TYPE_FLEXIBLE_ARRAY) {
+                        field_types = thunk_convert_flexible_array(
+                            d + dst_offsets[i],
+                            s + src_offsets[i],
+                            d,
+                            s,
+                            field_types,
+                            se,
+                            to_host);
+                        continue;
+                    }
                     field_types = thunk_convert(d + dst_offsets[i],
                                                 s + src_offsets[i],
                                                 field_types, to_host);
@@ -264,6 +305,11 @@ const argtype *thunk_convert(void *dst, const void *src,
             }
         }
         break;
+    case TYPE_FLEXIBLE_ARRAY:
+        fprintf(stderr,
+                "Invalid flexible array (type 0x%x) outside of a structure\n",
+                type);
+        break;
     default:
         fprintf(stderr, "Invalid type 0x%x\n", type);
         break;
@@ -271,6 +317,45 @@ const argtype *thunk_convert(void *dst, const void *src,
     return type_ptr;
 }
 
+static const argtype *
+thunk_print_flexible_array(void *arg, const uint8_t *arg_struct,
+                           const argtype *type_ptr, const StructEntry *se) {
+    int array_length, len_field_idx, arg_size, i;
+    uint8_t *a;
+    int is_string = 0;
+
+    assert(*type_ptr == TYPE_FLEXIBLE_ARRAY);
+    type_ptr++;
+    len_field_idx = *type_ptr++;
+
+    array_length = tswap32(
+        *(const uint32_t *)(arg_struct + se->field_offsets[0][len_field_idx]));
+    arg_size = thunk_type_size(type_ptr, 0);
+    a = arg;
+
+    if (*type_ptr == TYPE_CHAR) {
+        qemu_log("\"");
+        is_string = 1;
+    } else {
+        qemu_log("[");
+    }
+
+    for (i = 0; i < array_length; i++) {
+        if (i > 0 && !is_string) {
+            qemu_log(",");
+        }
+        thunk_print(a, type_ptr);
+        a += arg_size;
+    }
+
+    if (is_string) {
+        qemu_log("\"");
+    } else {
+        qemu_log("]");
+    }
+    return thunk_type_next(type_ptr);
+}
+
 const argtype *thunk_print(void *arg, const argtype *type_ptr)
 {
     int type;
@@ -414,17 +499,79 @@ const argtype *thunk_print(void *arg, const argtype *type_ptr)
                 if (i > 0) {
                     qemu_log(",");
                 }
+                if (*field_types == TYPE_FLEXIBLE_ARRAY) {
+                    field_types = thunk_print_flexible_array(
+                        a + arg_offsets[i], a, field_types, se);
+                    continue;
+                }
                 field_types = thunk_print(a + arg_offsets[i], field_types);
             }
             qemu_log("}");
         }
         break;
+    case TYPE_FLEXIBLE_ARRAY:
+        fprintf(stderr,
+                "Invalid flexible array (type 0x%x) outside of a structure\n",
+                type);
+        break;
     default:
         g_assert_not_reached();
     }
     return type_ptr;
 }
 
+bool thunk_type_has_flexible_array(const argtype *type_ptr) {
+  int i;
+  const StructEntry *se;
+  const argtype *field_types;
+    if (*type_ptr != TYPE_STRUCT) {
+        return false;
+    }
+    se = struct_entries + type_ptr[1];
+    field_types = se->field_types;
+    for (i = 0; i < se->nb_fields; i++) {
+        if (*field_types == TYPE_FLEXIBLE_ARRAY) {
+            return true;
+        }
+        field_types = thunk_type_next(type_ptr);
+    }
+    return false;
+}
+
+int thunk_type_size_with_src(const void *src, const argtype *type_ptr,
+                             int is_host)
+{
+    switch(*type_ptr) {
+    case TYPE_STRUCT: {
+        int i;
+        const StructEntry *se = struct_entries + type_ptr[1];
+        const argtype *field_types;
+        if (se->thunk_size[is_host] != NULL) {
+            return (*se->thunk_size[is_host])(src);
+        }
+
+        field_types = se->field_types;
+        for (i = 0; i < se->nb_fields; i++) {
+            if (*field_types == TYPE_FLEXIBLE_ARRAY) {
+                uint32_t array_length = *(const uint32_t *)(
+                    (const uint8_t*) src +
+                    se->field_offsets[is_host][field_types[1]]);
+                if (!is_host) {
+                    array_length = tswap32(array_length);
+                }
+                return se->size[is_host] +
+                    array_length *
+                    thunk_type_size(field_types + 2, is_host);
+            }
+            field_types = thunk_type_next(type_ptr);
+        }
+        return se->size[is_host];
+    }
+    default:
+        return thunk_type_size(type_ptr, is_host);
+    }
+}
+
 /* from em86 */
 
 /* Utility function: Table-driven functions to translate bitmasks
-- 
2.28.0.rc0.105.gf9edc3c819-goog



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 6/6] linux-user: Add support for SIOCETHTOOL ioctl
  2020-07-23  0:19 [PATCH 0/6] fcntl, sockopt, and ioctl options Shu-Chun Weng
                   ` (4 preceding siblings ...)
  2020-07-23  0:19 ` [PATCH 5/6] thunk: supports flexible arrays Shu-Chun Weng
@ 2020-07-23  0:19 ` Shu-Chun Weng
  2020-08-05 23:22 ` [PATCH 0/6] fcntl, sockopt, and ioctl options Shu-Chun Weng
  6 siblings, 0 replies; 11+ messages in thread
From: Shu-Chun Weng @ 2020-07-23  0:19 UTC (permalink / raw)
  To: qemu-devel; +Cc: Shu-Chun Weng, laurent, riku.voipio, alex.bennee

The ioctl numeric values are platform-independent and determined by
the file include/uapi/linux/sockios.h in Linux kernel source code:

  #define SIOCETHTOOL   0x8946

These ioctls get (or set) various structures pointed by the field
ifr_data in the structure ifreq depending on the first 4 bytes of the
memory region.

This change clones the ioctl framework into ethtool-specific dispatch
logic in its own file. A number of definitions previously only visible
in syscall.c are thus exported to syscall_defs.h to be used in the new
files.

Signed-off-by: Shu-Chun Weng <scw@google.com>
---
 linux-user/Makefile.objs      |   3 +-
 linux-user/ethtool.c          | 819 ++++++++++++++++++++++++++++++++++
 linux-user/ethtool.h          |  19 +
 linux-user/ethtool_entries.h  | 107 +++++
 linux-user/ioctls.h           |   2 +
 linux-user/qemu.h             |   1 +
 linux-user/syscall.c          |  35 +-
 linux-user/syscall_defs.h     |  12 +
 linux-user/syscall_types.h    | 277 ++++++++++++
 tests/tcg/multiarch/ethtool.c | 417 +++++++++++++++++
 10 files changed, 1680 insertions(+), 12 deletions(-)
 create mode 100644 linux-user/ethtool.c
 create mode 100644 linux-user/ethtool.h
 create mode 100644 linux-user/ethtool_entries.h
 create mode 100644 tests/tcg/multiarch/ethtool.c

diff --git a/linux-user/Makefile.objs b/linux-user/Makefile.objs
index 1940910a73..971d43173a 100644
--- a/linux-user/Makefile.objs
+++ b/linux-user/Makefile.objs
@@ -1,7 +1,8 @@
 obj-y = main.o syscall.o strace.o mmap.o signal.o \
 	elfload.o linuxload.o uaccess.o uname.o \
 	safe-syscall.o $(TARGET_ABI_DIR)/signal.o \
-        $(TARGET_ABI_DIR)/cpu_loop.o exit.o fd-trans.o
+	$(TARGET_ABI_DIR)/cpu_loop.o exit.o fd-trans.o \
+	ethtool.o
 
 obj-$(TARGET_HAS_BFLT) += flatload.o
 obj-$(TARGET_I386) += vm86.o
diff --git a/linux-user/ethtool.c b/linux-user/ethtool.c
new file mode 100644
index 0000000000..cb134e7c9b
--- /dev/null
+++ b/linux-user/ethtool.c
@@ -0,0 +1,819 @@
+/*
+ *  Linux ioctl system call SIOCETHTOOL requests
+ *
+ *  Copyright (c) 2020 Shu-Chun Weng
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+#include "qemu/osdep.h"
+#include <stdio.h>
+#include <linux/ethtool.h>
+#include <linux/if.h>
+#include <linux/sockios.h>
+#include <linux/unistd.h>
+#include "ethtool.h"
+#include "qemu.h"
+#include "syscall_defs.h"
+
+/* Non-standard ethtool structure definitions. */
+/* struct ethtool_rxnfc {
+ *     __u32 cmd;
+ *     __u32 flow_type;
+ *     __u64 data;
+ *     struct ethtool_rx_flow_spec fs;
+ *     union {
+ *         __u32 rule_cnt;
+ *         __u32 rss_context;
+ *     };
+ *     __u32 rule_locs[0];
+ * };
+ *
+ * Originally defined for ETHTOOL_{G,S}RXFH with only the cmd, flow_type and
+ * data members. For other commands, dedicated standard structure definitions
+ * are listed in syscall_types.h.
+ */
+static void host_to_target_ethtool_rxnfc_get_set_rxfh(void *dst,
+                                                      const void *src)
+{
+    static const argtype ethtool_rx_flow_spec_argtype[] = {
+        MK_STRUCT(STRUCT_ethtool_rx_flow_spec), TYPE_NULL };
+    struct ethtool_rxnfc *target = dst;
+    const struct ethtool_rxnfc *host = src;
+
+    target->cmd = tswap32(host->cmd);
+    target->flow_type = tswap32(host->flow_type);
+    target->data = tswap64(host->data);
+
+    if (host->cmd == ETHTOOL_SRXFH) {
+        /* struct ethtool_rxnfc was originally defined for ETHTOOL_{G,S}RXFH
+         * with only the cmd, flow_type and data members. Guest program might
+         * still be using that definition.
+         */
+        return;
+    }
+    if (host->cmd != ETHTOOL_GRXFH) {
+        fprintf(stderr, "host_to_target_ethtool_rxnfc_get_set_rxfh called with "
+                "command 0x%x which is not ETHTOOL_SRXFH or ETHTOOL_GRXFH\n",
+                host->cmd);
+    }
+    if ((host->flow_type & FLOW_RSS) == 0) {
+        return;
+    }
+    /* If `FLOW_RSS` was requested then guest program must be using the new
+     * definition.
+     */
+    thunk_convert(&target->fs, &host->fs, ethtool_rx_flow_spec_argtype,
+                  THUNK_TARGET);
+    target->rule_cnt = tswap32(host->rule_cnt);
+}
+
+static void target_to_host_ethtool_rxnfc_get_set_rxfh(void *dst,
+                                                      const void *src)
+{
+    static const argtype ethtool_rx_flow_spec_argtype[] = {
+        MK_STRUCT(STRUCT_ethtool_rx_flow_spec), TYPE_NULL };
+    struct ethtool_rxnfc *host = dst;
+    const struct ethtool_rxnfc *target = src;
+
+    host->cmd = tswap32(target->cmd);
+    host->flow_type = tswap32(target->flow_type);
+    host->data = tswap64(target->data);
+
+    if (host->cmd == ETHTOOL_SRXFH) {
+        /* struct ethtool_rxnfc was originally defined for ETHTOOL_{G,S}RXFH
+         * with only the cmd, flow_type and data members. Guest program might
+         * still be using that definition.
+         */
+        return;
+    }
+    if (host->cmd != ETHTOOL_GRXFH) {
+        fprintf(stderr, "target_to_host_ethtool_rxnfc_get_set_rxfh called with "
+                "command 0x%x which is not ETHTOOL_SRXFH or ETHTOOL_GRXFH\n",
+                host->cmd);
+    }
+    if ((host->flow_type & FLOW_RSS) == 0) {
+        return;
+    }
+    /* If `FLOW_RSS` was requested then guest program must be using the new
+     * definition.
+     */
+    thunk_convert(&host->fs, &target->fs, ethtool_rx_flow_spec_argtype,
+                  THUNK_HOST);
+    host->rule_cnt = tswap32(target->rule_cnt);
+}
+
+static int target_ethtool_rxnfc_get_set_rxfh_size(const void *src)
+{
+    const struct ethtool_rxnfc *target = src;
+    int cmd = tswap32(target->cmd);
+    if (cmd == ETHTOOL_SRXFH ||
+        (cmd == ETHTOOL_GRXFH &&
+         (tswap32(target->flow_type) & FLOW_RSS) == 0)) {
+        return 16;
+    }
+    return sizeof(struct ethtool_rxnfc);
+}
+
+static int host_ethtool_rxnfc_get_set_rxfh_size(const void *src)
+{
+    const struct ethtool_rxnfc *host = src;
+    if (host->cmd == ETHTOOL_SRXFH ||
+        (host->cmd == ETHTOOL_GRXFH && (host->flow_type & FLOW_RSS) == 0)) {
+        return 16;
+    }
+    return sizeof(struct ethtool_rxnfc);
+}
+
+const StructEntry struct_ethtool_rxnfc_get_set_rxfh_def = {
+    .convert = {
+        host_to_target_ethtool_rxnfc_get_set_rxfh,
+        target_to_host_ethtool_rxnfc_get_set_rxfh },
+    .thunk_size = {
+        target_ethtool_rxnfc_get_set_rxfh_size,
+        host_ethtool_rxnfc_get_set_rxfh_size },
+    .size = { 16, 16 },
+    .align = {
+        __alignof__(struct ethtool_rxnfc),
+        __alignof__(struct ethtool_rxnfc) },
+};
+
+/* struct ethtool_sset_info {
+ *     __u32 cmd;
+ *     __u32 reserved;
+ *     __u64 sset_mask;
+ *     __u32 data[0];
+ * };
+ *
+ * `sset_mask` is a bitmask of string sets. `data` is the buffer for string set
+ * sizes, containing number of 1s in `sset_mask`'s binary representation number
+ * of 4-byte entries.
+ *
+ * Since all fields are fixed-width and number of 1s in `sset_mask` does not
+ * change between architectures, host-to-target and target-to-host are
+ * identical.
+ */
+static void convert_ethtool_sset_info(void *dst, const void *src)
+{
+    int i, set_count;
+    struct ethtool_sset_info *dst_sset_info = dst;
+    const struct ethtool_sset_info *src_sset_info = src;
+
+    dst_sset_info->cmd = tswap32(src_sset_info->cmd);
+    dst_sset_info->sset_mask = tswap64(src_sset_info->sset_mask);
+
+    set_count = ctpop64(src_sset_info->sset_mask);
+    for (i = 0; i < set_count; ++i) {
+        dst_sset_info->data[i] = tswap32(src_sset_info->data[i]);
+    }
+}
+
+static int ethtool_sset_info_size(const void *src)
+{
+    const struct ethtool_sset_info *src_sset_info = src;
+    return sizeof(struct ethtool_sset_info) +
+        ctpop64(src_sset_info->sset_mask) * sizeof(src_sset_info->data[0]);
+}
+
+const StructEntry struct_ethtool_sset_info_def = {
+    .convert = {
+        convert_ethtool_sset_info, convert_ethtool_sset_info },
+    .thunk_size = {
+        ethtool_sset_info_size, ethtool_sset_info_size },
+    .size = {
+        sizeof(struct ethtool_sset_info),
+        sizeof(struct ethtool_sset_info) },
+    .align = {
+        __alignof__(struct ethtool_sset_info),
+        __alignof__(struct ethtool_sset_info) },
+};
+
+/* struct ethtool_rxfh {
+ *     __u32 cmd;
+ *     __u32 rss_context;
+ *     __u32 indir_size;
+ *     __u32 key_size;
+ *     __u8  hfunc;
+ *     __u8  rsvd8[3];
+ *     __u32 rsvd32;
+ *     __u32 rss_config[0];
+ * };
+ *
+ * `rss_config`: indirection table of `indir_size` __u32 elements, followed by
+ * hash key of `key_size` bytes.
+ *
+ * `indir_size` could be ETH_RXFH_INDIR_NO_CHANGE when `cmd` is ETHTOOL_SRSSH
+ * and there would be no indircetion table in `rss_config`.
+ */
+static void convert_ethtool_rxfh_header(void *dst, const void *src)
+{
+    struct ethtool_rxfh *dst_rxfh = dst;
+    const struct ethtool_rxfh *src_rxfh = src;
+
+    dst_rxfh->cmd = tswap32(src_rxfh->cmd);
+    dst_rxfh->rss_context = tswap32(src_rxfh->rss_context);
+    dst_rxfh->indir_size = tswap32(src_rxfh->indir_size);
+    dst_rxfh->key_size = tswap32(src_rxfh->key_size);
+    dst_rxfh->hfunc = src_rxfh->hfunc;
+    dst_rxfh->rsvd8[0] = src_rxfh->rsvd8[0];
+    dst_rxfh->rsvd8[1] = src_rxfh->rsvd8[1];
+    dst_rxfh->rsvd8[2] = src_rxfh->rsvd8[2];
+    dst_rxfh->rsvd32 = tswap32(src_rxfh->rsvd32);
+}
+
+static void convert_ethtool_rxfh_rss_config(
+    void *dst, const void *src, uint32_t indir_size, uint32_t key_size) {
+    uint32_t *dst_rss_config = (uint32_t *)dst;
+    const uint32_t *src_rss_config = (const uint32_t *)src;
+    int i;
+    for (i = 0; i < indir_size; ++i) {
+        dst_rss_config[i] = tswap32(src_rss_config[i]);
+    }
+    if (key_size > 0) {
+        memcpy(dst_rss_config + indir_size,
+               src_rss_config + indir_size,
+               key_size);
+    }
+}
+
+static void host_to_target_ethtool_rxfh(void *dst, const void *src)
+{
+    struct ethtool_rxfh *target = dst;
+    const struct ethtool_rxfh *host = src;
+
+    convert_ethtool_rxfh_header(dst, src);
+
+    const uint32_t indir_size =
+        host->cmd == ETHTOOL_SRSSH &&
+        host->indir_size == ETH_RXFH_INDIR_NO_CHANGE ?
+        0 :
+        host->indir_size;
+    convert_ethtool_rxfh_rss_config(target->rss_config, host->rss_config,
+                                    indir_size, host->key_size);
+}
+
+static void target_to_host_ethtool_rxfh(void *dst, const void *src)
+{
+    struct ethtool_rxfh *host = dst;
+    const struct ethtool_rxfh *target = src;
+
+    convert_ethtool_rxfh_header(dst, src);
+
+    const uint32_t indir_size =
+        host->cmd == ETHTOOL_SRSSH &&
+        host->indir_size == ETH_RXFH_INDIR_NO_CHANGE ?
+        0 :
+        host->indir_size;
+    convert_ethtool_rxfh_rss_config(host->rss_config, target->rss_config,
+                                    indir_size, host->key_size);
+}
+
+static int target_ethtool_rxfh_size(const void *src)
+{
+    const struct ethtool_rxfh *target = src;
+    if (tswap32(target->cmd) == ETHTOOL_SRSSH &&
+        tswap32(target->indir_size) == ETH_RXFH_INDIR_NO_CHANGE) {
+        return sizeof(struct ethtool_rxfh) + tswap32(target->key_size);
+    }
+    return sizeof(struct ethtool_rxfh) +
+        tswap32(target->indir_size) * sizeof(target->rss_config[0]) +
+        tswap32(target->key_size);
+}
+
+static int host_ethtool_rxfh_size(const void *src)
+{
+    const struct ethtool_rxfh *host = src;
+    if (host->cmd == ETHTOOL_SRSSH &&
+        host->indir_size == ETH_RXFH_INDIR_NO_CHANGE) {
+        return sizeof(struct ethtool_rxfh) + host->key_size;
+    }
+    return sizeof(struct ethtool_rxfh) +
+        host->indir_size * sizeof(host->rss_config[0]) +
+        host->key_size;
+}
+
+const StructEntry struct_ethtool_rxfh_def = {
+    .convert = {
+        host_to_target_ethtool_rxfh, target_to_host_ethtool_rxfh },
+    .thunk_size = {
+        target_ethtool_rxfh_size, host_ethtool_rxfh_size },
+    .size = {
+        sizeof(struct ethtool_rxfh), sizeof(struct ethtool_rxfh) },
+    .align = {
+        __alignof__(struct ethtool_rxfh), __alignof__(struct ethtool_rxfh) },
+};
+
+/* struct ethtool_link_settings {
+ *     __u32 cmd;
+ *     __u32 speed;
+ *     __u8  duplex;
+ *     __u8  port;
+ *     __u8  phy_address;
+ *     __u8  autoneg;
+ *     __u8  mdio_support;
+ *     __u8  eth_tp_mdix;
+ *     __u8  eth_tp_mdix_ctrl;
+ *     __s8  link_mode_masks_nwords;
+ *     __u8  transceiver;
+ *     __u8  reserved1[3];
+ *     __u32 reserved[7];
+ *     __u32 link_mode_masks[0];
+ * };
+ *
+ * layout of link_mode_masks fields:
+ * __u32 map_supported[link_mode_masks_nwords];
+ * __u32 map_advertising[link_mode_masks_nwords];
+ * __u32 map_lp_advertising[link_mode_masks_nwords];
+ *
+ * `link_mode_masks_nwords` can be negative when returning from kernel if the
+ * provided request size is not supported.
+ */
+
+static void host_to_target_ethtool_link_settings(void *dst, const void *src)
+{
+    int i;
+    struct ethtool_link_settings *target = dst;
+    const struct ethtool_link_settings *host = src;
+
+    target->cmd = tswap32(host->cmd);
+    target->speed = tswap32(host->speed);
+    target->duplex = host->duplex;
+    target->port = host->port;
+    target->phy_address = host->phy_address;
+    target->autoneg = host->autoneg;
+    target->mdio_support = host->mdio_support;
+    target->eth_tp_mdix = host->eth_tp_mdix;
+    target->eth_tp_mdix_ctrl = host->eth_tp_mdix_ctrl;
+    target->link_mode_masks_nwords = host->link_mode_masks_nwords;
+    target->transceiver = host->transceiver;
+    for (i = 0; i < 3; ++i) {
+        target->reserved1[i] = host->reserved1[i];
+    }
+    for (i = 0; i < 7; ++i) {
+        target->reserved[i] = tswap32(host->reserved[i]);
+    }
+
+    if (host->link_mode_masks_nwords > 0) {
+        for (i = 0; i < host->link_mode_masks_nwords * 3; ++i) {
+            target->link_mode_masks[i] = tswap32(host->link_mode_masks[i]);
+        }
+    }
+}
+
+static void target_to_host_ethtool_link_settings(void *dst, const void *src)
+{
+    int i;
+    struct ethtool_link_settings *host = dst;
+    const struct ethtool_link_settings *target = src;
+
+    host->cmd = tswap32(target->cmd);
+    host->speed = tswap32(target->speed);
+    host->duplex = target->duplex;
+    host->port = target->port;
+    host->phy_address = target->phy_address;
+    host->autoneg = target->autoneg;
+    host->mdio_support = target->mdio_support;
+    host->eth_tp_mdix = target->eth_tp_mdix;
+    host->eth_tp_mdix_ctrl = target->eth_tp_mdix_ctrl;
+    host->link_mode_masks_nwords = target->link_mode_masks_nwords;
+    host->transceiver = target->transceiver;
+    for (i = 0; i < 3; ++i) {
+        host->reserved1[i] = target->reserved1[i];
+    }
+    for (i = 0; i < 7; ++i) {
+        host->reserved[i] = tswap32(target->reserved[i]);
+    }
+
+    if (host->link_mode_masks_nwords > 0) {
+        for (i = 0; i < host->link_mode_masks_nwords * 3; ++i) {
+            host->link_mode_masks[i] = tswap32(target->link_mode_masks[i]);
+        }
+    }
+}
+
+static int target_ethtool_link_settings_size(const void *src)
+{
+    const struct ethtool_link_settings *target = src;
+    if (target->link_mode_masks_nwords > 0) {
+        return sizeof(struct ethtool_link_settings) +
+            3 * target->link_mode_masks_nwords *
+            sizeof(target->link_mode_masks[0]);
+    } else {
+        return sizeof(struct ethtool_link_settings);
+    }
+}
+
+static int host_ethtool_link_settings_size(const void *src)
+{
+    const struct ethtool_link_settings *host = src;
+    if (host->link_mode_masks_nwords > 0) {
+        return sizeof(struct ethtool_link_settings) +
+            3 * host->link_mode_masks_nwords *
+            sizeof(host->link_mode_masks[0]);
+    } else {
+        return sizeof(struct ethtool_link_settings);
+    }
+}
+
+const StructEntry struct_ethtool_link_settings_def = {
+    .convert = {
+        host_to_target_ethtool_link_settings,
+        target_to_host_ethtool_link_settings
+    },
+    .thunk_size = {
+        target_ethtool_link_settings_size, host_ethtool_link_settings_size },
+    .size = {
+        sizeof(struct ethtool_link_settings),
+        sizeof(struct ethtool_link_settings) },
+    .align = {
+        __alignof__(struct ethtool_link_settings),
+        __alignof__(struct ethtool_link_settings) },
+};
+
+/* struct ethtool_per_queue_op {
+ *     __u32 cmd;
+ *     __u32 sub_command;
+ *     __u32 queue_mask[__KERNEL_DIV_ROUND_UP(MAX_NUM_QUEUE, 32)];
+ *     char  data[];
+ * };
+ *
+ * `queue_mask` are a series of bitmasks of the queues. `data` is a complete
+ * command structure for each of the queues addressed.
+ *
+ * When `cmd` is `ETHTOOL_PERQUEUE` and `sub_command` is `ETHTOOL_GCOALESCE` or
+ * `ETHTOOL_SCOALESCE`, the command structure is `struct ethtool_coalesce`.
+ */
+static void host_to_target_ethtool_per_queue_op(void *dst, const void *src)
+{
+    static const argtype ethtool_coalesce_argtype[] = {
+        MK_STRUCT(STRUCT_ethtool_coalesce), TYPE_NULL };
+    int i, queue_count;
+    struct ethtool_per_queue_op *target = dst;
+    const struct ethtool_per_queue_op *host = src;
+
+    target->cmd = tswap32(host->cmd);
+    target->sub_command = tswap32(host->sub_command);
+
+    queue_count = 0;
+    for (i = 0; i < __KERNEL_DIV_ROUND_UP(MAX_NUM_QUEUE, 32); ++i) {
+        target->queue_mask[i] = tswap32(host->queue_mask[i]);
+        queue_count += ctpop32(host->queue_mask[i]);
+    }
+
+    if (host->cmd != ETHTOOL_PERQUEUE ||
+        (host->sub_command != ETHTOOL_GCOALESCE &&
+         host->sub_command != ETHTOOL_SCOALESCE)) {
+        fprintf(stderr,
+                "Unknown command 0x%x sub_command 0x%x for "
+                "ethtool_per_queue_op, unable to convert the `data` field "
+                "(host-to-target)\n",
+                host->cmd, host->sub_command);
+        return;
+    }
+
+    for (i = 0; i < queue_count; ++i) {
+        thunk_convert(target->data + i * sizeof(struct ethtool_coalesce),
+                      host->data + i * sizeof(struct ethtool_coalesce),
+                      ethtool_coalesce_argtype, THUNK_TARGET);
+    }
+}
+
+static void target_to_host_ethtool_per_queue_op(void *dst, const void *src)
+{
+    static const argtype ethtool_coalesce_argtype[] = {
+        MK_STRUCT(STRUCT_ethtool_coalesce), TYPE_NULL };
+    int i, queue_count;
+    struct ethtool_per_queue_op *host = dst;
+    const struct ethtool_per_queue_op *target = src;
+
+    host->cmd = tswap32(target->cmd);
+    host->sub_command = tswap32(target->sub_command);
+
+    queue_count = 0;
+    for (i = 0; i < __KERNEL_DIV_ROUND_UP(MAX_NUM_QUEUE, 32); ++i) {
+        host->queue_mask[i] = tswap32(target->queue_mask[i]);
+        queue_count += ctpop32(host->queue_mask[i]);
+    }
+
+    if (host->cmd != ETHTOOL_PERQUEUE ||
+        (host->sub_command != ETHTOOL_GCOALESCE &&
+         host->sub_command != ETHTOOL_SCOALESCE)) {
+        fprintf(stderr,
+                "Unknown command 0x%x sub_command 0x%x for "
+                "ethtool_per_queue_op, unable to convert the `data` field "
+                "(target-to-host)\n",
+                host->cmd, host->sub_command);
+        return;
+    }
+
+    for (i = 0; i < queue_count; ++i) {
+        thunk_convert(host->data + i * sizeof(struct ethtool_coalesce),
+                      target->data + i * sizeof(struct ethtool_coalesce),
+                      ethtool_coalesce_argtype, THUNK_HOST);
+    }
+}
+
+static int target_ethtool_per_queue_op_size(const void *src)
+{
+    int i, queue_count;
+    const struct ethtool_per_queue_op *target = src;
+
+    if (tswap32(target->cmd) != ETHTOOL_PERQUEUE ||
+        (tswap32(target->sub_command) != ETHTOOL_GCOALESCE &&
+         tswap32(target->sub_command) != ETHTOOL_SCOALESCE)) {
+        fprintf(stderr,
+                "Unknown command 0x%x sub_command 0x%x for "
+                "ethtool_per_queue_op, unable to compute the size of the "
+                "`data` field (target)\n",
+                tswap32(target->cmd), tswap32(target->sub_command));
+        return sizeof(struct ethtool_per_queue_op);
+    }
+
+    queue_count = 0;
+    for (i = 0; i < __KERNEL_DIV_ROUND_UP(MAX_NUM_QUEUE, 32); ++i) {
+        queue_count += ctpop32(target->queue_mask[i]);
+    }
+    return sizeof(struct ethtool_per_queue_op) +
+        queue_count * sizeof(struct ethtool_coalesce);
+}
+
+static int host_ethtool_per_queue_op_size(const void *src)
+{
+    int i, queue_count;
+    const struct ethtool_per_queue_op *host = src;
+
+    if (host->cmd != ETHTOOL_PERQUEUE ||
+        (host->sub_command != ETHTOOL_GCOALESCE &&
+         host->sub_command != ETHTOOL_SCOALESCE)) {
+        fprintf(stderr,
+                "Unknown command 0x%x sub_command 0x%x for "
+                "ethtool_per_queue_op, unable to compute the size of the "
+                "`data` field (host)\n",
+                host->cmd, host->sub_command);
+        return sizeof(struct ethtool_per_queue_op);
+    }
+
+    queue_count = 0;
+    for (i = 0; i < __KERNEL_DIV_ROUND_UP(MAX_NUM_QUEUE, 32); ++i) {
+        queue_count += ctpop32(host->queue_mask[i]);
+    }
+    return sizeof(struct ethtool_per_queue_op) +
+        queue_count * sizeof(struct ethtool_coalesce);
+}
+
+const StructEntry struct_ethtool_per_queue_op_def = {
+    .convert = {
+        host_to_target_ethtool_per_queue_op,
+        target_to_host_ethtool_per_queue_op
+    },
+    .thunk_size = {
+        target_ethtool_per_queue_op_size, host_ethtool_per_queue_op_size },
+    .size = {
+        sizeof(struct ethtool_per_queue_op),
+        sizeof(struct ethtool_per_queue_op) },
+    .align = {
+        __alignof__(struct ethtool_per_queue_op),
+        __alignof__(struct ethtool_per_queue_op) },
+};
+
+#define safe_dev_ethtool(fd,...) \
+    safe_syscall(__NR_ioctl, (fd), SIOCETHTOOL, __VA_ARGS__)
+
+typedef struct EthtoolEntry EthtoolEntry;
+
+typedef abi_long do_ethtool_fn(const EthtoolEntry *ee, uint8_t *buf_temp,
+                               int fd, struct ifreq *host_ifreq);
+
+struct EthtoolEntry {
+    uint32_t cmd;
+    int access;
+    do_ethtool_fn *do_ethtool;
+    const argtype arg_type[3];
+};
+
+#define ETHT_R 0x0001
+#define ETHT_W 0x0002
+#define ETHT_RW (ETHT_R | ETHT_W)
+
+static do_ethtool_fn do_ethtool_get_rxfh;
+
+static EthtoolEntry ethtool_entries[] = {
+#define ETHTOOL(cmd, access, ...) \
+    { cmd, access, 0, { __VA_ARGS__ } },
+#define ETHTOOL_SPECIAL(cmd, access, dofn, ...) \
+    { cmd, access, dofn, { __VA_ARGS__ } },
+#include "ethtool_entries.h"
+#undef ETHTOOL
+#undef ETHTOOL_SPECIAL
+    { 0, 0 },
+};
+
+/* ETHTOOL_GRSSH has two modes of operations: querying the sizes of the indir
+ * and key as well as actually querying the indir and key. When either
+ * `indir_size` or `key_size` is zero, the size of the corresponding entry is
+ * retrieved and updated into the `ethtool_rxfh` struct. When either of them is
+ * non-zero, the actually indir or key is written to `rss_config`.
+ *
+ * This causes a problem for the generic framework which converts between host
+ * and target structures without the context. When the convertion function sees
+ * an `ethtool_rxfh` struct with non-zero `indir_size` or `key_size`, it has to
+ * assume that there are entries in `rss_config` and needs to convert them.
+ * Unfortunately, when converting the returned `ethtool_rxfh` struct from host
+ * to target after an ETHTOOL_GRSSH call with the first mode, the `indir_size`
+ * and `key_size` fields are populated but there is no actual data to be
+ * converted. More importantly, user programs would not have prepared enough
+ * memory for the convertion to take place safely.
+ *
+ * ETHTOOL_GRSSH thus needs a special implementation which is aware of the two
+ * modes of operations and converts the structure accordingly.
+ */
+abi_long do_ethtool_get_rxfh(const EthtoolEntry *ee, uint8_t *buf_temp,
+                             int fd, struct ifreq *host_ifreq)
+{
+    const argtype *arg_type = ee->arg_type;
+    const abi_long ifreq_data = (abi_long)(unsigned long)host_ifreq->ifr_data;
+    struct ethtool_rxfh *rxfh = (struct ethtool_rxfh *)buf_temp;
+    uint32_t user_indir_size, user_key_size;
+    abi_long ret;
+    void *argptr;
+
+    assert(arg_type[0] == TYPE_PTR);
+    assert(ee->access == IOC_RW);
+    arg_type++;
+
+    /* As of Linux kernel v5.8-rc4, ETHTOOL_GRSSH calls never read the
+     * `rss_config` part. Converting only the "header" part suffices.
+     */
+    argptr = lock_user(VERIFY_READ, ifreq_data, sizeof(*rxfh), 1);
+    if (!argptr)
+        return -TARGET_EFAULT;
+    convert_ethtool_rxfh_header(rxfh, argptr);
+    unlock_user(argptr, ifreq_data, sizeof(*rxfh));
+
+    if (rxfh->cmd != ETHTOOL_GRSSH)
+        return -TARGET_EINVAL;
+    user_indir_size = rxfh->indir_size;
+    user_key_size = rxfh->key_size;
+
+    host_ifreq->ifr_data = (void *)rxfh;
+    ret = get_errno(safe_dev_ethtool(fd, host_ifreq));
+
+    /* When a user program supplies `indir_size` or `key_size` but does not
+     * match what the kernel has, the syscall returns EINVAL but the structure
+     * is already updated. Mimicking it here.
+     */
+    argptr = lock_user(VERIFY_WRITE, ifreq_data, sizeof(*rxfh), 0);
+    if (!argptr)
+        return -TARGET_EFAULT;
+    convert_ethtool_rxfh_header(argptr, rxfh);
+    unlock_user(argptr, ifreq_data, 0);
+
+    if (is_error(ret))
+        return ret;
+
+    if (user_indir_size > 0 || user_key_size > 0) {
+        const int rss_config_size =
+            user_indir_size * sizeof(rxfh->rss_config[0]) + user_key_size;
+        argptr = lock_user(VERIFY_WRITE, ifreq_data + sizeof(*rxfh),
+                           rss_config_size, 0);
+        if (!argptr)
+            return -TARGET_EFAULT;
+        convert_ethtool_rxfh_rss_config(argptr, rxfh->rss_config,
+                                        user_indir_size, user_key_size);
+        unlock_user(argptr, ifreq_data + sizeof(*rxfh), rss_config_size);
+    }
+    return ret;
+}
+
+/* Calculates the size of the data type represented by `type_ptr` with
+ * `guest_addr` being the underlying memory. Since `type_ptr` may contain
+ * flexible arrays, we need access to the underlying memory to determine their
+ * sizes.
+ */
+static int thunk_size(abi_long guest_addr, const argtype *type_ptr)
+{
+    /* lock_user based on `thunk_type_size` then call `thunk_type_size_with_src`
+     * on it.
+     */
+    void *src;
+    int type_size = thunk_type_size(type_ptr, /*is_host=*/0);
+    if (!thunk_type_has_flexible_array(type_ptr)) {
+        return type_size;
+    }
+
+    src = lock_user(VERIFY_READ, guest_addr, type_size, 0);
+    type_size = thunk_type_size_with_src(src, type_ptr, /*is_host=*/0);
+    unlock_user(src, guest_addr, 0);
+
+    return type_size;
+}
+
+abi_long dev_ethtool(int fd, uint8_t *buf_temp)
+{
+    uint32_t *cmd;
+    uint32_t host_cmd;
+    const EthtoolEntry *ee;
+    const argtype *arg_type;
+    abi_long ret;
+    int target_size;
+    void *argptr;
+
+    /* Make a copy of `host_ifreq` because we are going to reuse `buf_temp` and
+     * overwrite it. Further, we will overwrite `host_ifreq.ifreq_data`, so
+     * keep a copy in `ifreq_data`.
+     */
+    struct ifreq host_ifreq = *(struct ifreq *)(unsigned long)buf_temp;
+    const abi_long ifreq_data = (abi_long)(unsigned long)host_ifreq.ifr_data;
+
+    cmd = (uint32_t *)lock_user(VERIFY_READ, ifreq_data, sizeof(uint32_t), 0);
+    host_cmd = tswap32(*cmd);
+    unlock_user(cmd, ifreq_data, 0);
+
+    ee = ethtool_entries;
+    for(;;) {
+        if (ee->cmd == 0) {
+            qemu_log_mask(LOG_UNIMP, "Unsupported ethtool cmd=0x%04lx\n",
+                          (long)host_cmd);
+            return -TARGET_ENOSYS;
+        }
+        if (ee->cmd == host_cmd) {
+            break;
+        }
+        ee++;
+    }
+    if (ee->do_ethtool) {
+        return ee->do_ethtool(ee, buf_temp, fd, &host_ifreq);
+    }
+
+    host_ifreq.ifr_data = buf_temp;
+    /* Even for ETHT_R, cmd still needs to be copied. */
+    *(uint32_t *)buf_temp = host_cmd;
+
+    arg_type = ee->arg_type;
+    switch (arg_type[0]) {
+    case TYPE_NULL:
+        /* no argument other than cmd */
+        ret = get_errno(safe_dev_ethtool(fd, &host_ifreq));
+        break;
+    case TYPE_PTR:
+        arg_type++;
+        target_size = thunk_size(ifreq_data, arg_type);
+        switch (ee->access) {
+        case ETHT_R:
+            ret = get_errno(safe_dev_ethtool(fd, &host_ifreq));
+            if (!is_error(ret)) {
+                argptr = lock_user(VERIFY_WRITE, ifreq_data, target_size, 0);
+                if (!argptr) {
+                    return -TARGET_EFAULT;
+                }
+                thunk_convert(argptr, buf_temp, arg_type, THUNK_TARGET);
+                unlock_user(argptr, ifreq_data, target_size);
+            }
+            break;
+        case ETHT_W:
+            argptr = lock_user(VERIFY_READ, ifreq_data, target_size, 1);
+            if (!argptr) {
+                return -TARGET_EFAULT;
+            }
+            thunk_convert(buf_temp, argptr, arg_type, THUNK_HOST);
+            unlock_user(argptr, ifreq_data, 0);
+            ret = get_errno(safe_dev_ethtool(fd, &host_ifreq));
+            break;
+        default:
+        case ETHT_RW:
+            argptr = lock_user(VERIFY_READ, ifreq_data, target_size, 1);
+            if (!argptr)
+                return -TARGET_EFAULT;
+            thunk_convert(buf_temp, argptr, arg_type, THUNK_HOST);
+            unlock_user(argptr, ifreq_data, 0);
+            ret = get_errno(safe_dev_ethtool(fd, &host_ifreq));
+            if (!is_error(ret)) {
+                argptr = lock_user(VERIFY_WRITE, ifreq_data, target_size, 0);
+                if (!argptr) {
+                    return -TARGET_EFAULT;
+                }
+                thunk_convert(argptr, buf_temp, arg_type, THUNK_TARGET);
+                unlock_user(argptr, ifreq_data, target_size);
+            }
+            break;
+        }
+        break;
+    default:
+        qemu_log_mask(LOG_UNIMP,
+                      "Unsupported ethtool type: cmd=0x%04lx type=%d\n",
+                      (long)host_cmd, arg_type[0]);
+        ret = -TARGET_ENOSYS;
+        break;
+    }
+    return ret;
+}
diff --git a/linux-user/ethtool.h b/linux-user/ethtool.h
new file mode 100644
index 0000000000..24de5dccb3
--- /dev/null
+++ b/linux-user/ethtool.h
@@ -0,0 +1,19 @@
+#ifndef ETHTOOL_H
+#define ETHTOOL_H
+
+#include <linux/if.h>
+#include "qemu.h"
+
+extern const StructEntry struct_ethtool_rxnfc_get_set_rxfh_def;
+extern const StructEntry struct_ethtool_sset_info_def;
+extern const StructEntry struct_ethtool_rxfh_def;
+extern const StructEntry struct_ethtool_link_settings_def;
+extern const StructEntry struct_ethtool_per_queue_op_def;
+
+/* Takes the file descriptor and the buffer for temporarily storing data read
+ * from / to be written to guest memory. `buf_temp` must now contain the host
+ * representation of `struct ifreq`.
+ */
+abi_long dev_ethtool(int fd, uint8_t *buf_temp);
+
+#endif /* ETHTOOL_H */
diff --git a/linux-user/ethtool_entries.h b/linux-user/ethtool_entries.h
new file mode 100644
index 0000000000..14f4e80a21
--- /dev/null
+++ b/linux-user/ethtool_entries.h
@@ -0,0 +1,107 @@
+  ETHTOOL(ETHTOOL_GSET, ETHT_R, MK_PTR(MK_STRUCT(STRUCT_ethtool_cmd)))
+  ETHTOOL(ETHTOOL_SSET, ETHT_W, MK_PTR(MK_STRUCT(STRUCT_ethtool_cmd)))
+  ETHTOOL(ETHTOOL_GDRVINFO, ETHT_R, MK_PTR(MK_STRUCT(STRUCT_ethtool_drvinfo)))
+  ETHTOOL(ETHTOOL_GREGS, ETHT_RW, MK_PTR(MK_STRUCT(STRUCT_ethtool_regs)))
+  ETHTOOL(ETHTOOL_GWOL, ETHT_R, MK_PTR(MK_STRUCT(STRUCT_ethtool_wolinfo)))
+  ETHTOOL(ETHTOOL_SWOL, ETHT_W, MK_PTR(MK_STRUCT(STRUCT_ethtool_wolinfo)))
+  ETHTOOL(ETHTOOL_GMSGLVL, ETHT_R, MK_PTR(MK_STRUCT(STRUCT_ethtool_value)))
+  ETHTOOL(ETHTOOL_SMSGLVL, ETHT_W, MK_PTR(MK_STRUCT(STRUCT_ethtool_value)))
+  ETHTOOL(ETHTOOL_GEEE, ETHT_R, MK_PTR(MK_STRUCT(STRUCT_ethtool_eee)))
+  ETHTOOL(ETHTOOL_SEEE, ETHT_W, MK_PTR(MK_STRUCT(STRUCT_ethtool_eee)))
+  ETHTOOL(ETHTOOL_NWAY_RST, 0, TYPE_NULL)
+  ETHTOOL(ETHTOOL_GLINK, ETHT_R, MK_PTR(MK_STRUCT(STRUCT_ethtool_value)))
+  ETHTOOL(ETHTOOL_GEEPROM, ETHT_RW, MK_PTR(MK_STRUCT(STRUCT_ethtool_eeprom)))
+  ETHTOOL(ETHTOOL_SEEPROM, ETHT_RW, MK_PTR(MK_STRUCT(STRUCT_ethtool_eeprom)))
+  ETHTOOL(ETHTOOL_GCOALESCE, ETHT_R, MK_PTR(MK_STRUCT(STRUCT_ethtool_coalesce)))
+  ETHTOOL(ETHTOOL_SCOALESCE, ETHT_W, MK_PTR(MK_STRUCT(STRUCT_ethtool_coalesce)))
+  ETHTOOL(ETHTOOL_GRINGPARAM, ETHT_R,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_ringparam)))
+  ETHTOOL(ETHTOOL_SRINGPARAM, ETHT_W,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_ringparam)))
+  ETHTOOL(ETHTOOL_GPAUSEPARAM, ETHT_R,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_pauseparam)))
+  ETHTOOL(ETHTOOL_SPAUSEPARAM, ETHT_W,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_pauseparam)))
+  ETHTOOL(ETHTOOL_TEST, ETHT_RW, MK_PTR(MK_STRUCT(STRUCT_ethtool_test)))
+  ETHTOOL(ETHTOOL_GSTRINGS, ETHT_RW, MK_PTR(MK_STRUCT(STRUCT_ethtool_gstrings)))
+  ETHTOOL(ETHTOOL_PHYS_ID, ETHT_W, MK_PTR(MK_STRUCT(STRUCT_ethtool_value)))
+  ETHTOOL(ETHTOOL_GSTATS, ETHT_RW, MK_PTR(MK_STRUCT(STRUCT_ethtool_stats)))
+  ETHTOOL(ETHTOOL_GPERMADDR, ETHT_RW,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_perm_addr)))
+  ETHTOOL(ETHTOOL_GFLAGS, ETHT_R, MK_PTR(MK_STRUCT(STRUCT_ethtool_value)))
+  ETHTOOL(ETHTOOL_SFLAGS, ETHT_W, MK_PTR(MK_STRUCT(STRUCT_ethtool_value)))
+  ETHTOOL(ETHTOOL_GPFLAGS, ETHT_R, MK_PTR(MK_STRUCT(STRUCT_ethtool_value)))
+  ETHTOOL(ETHTOOL_SPFLAGS, ETHT_W, MK_PTR(MK_STRUCT(STRUCT_ethtool_value)))
+  ETHTOOL(ETHTOOL_GRXFH, ETHT_RW,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_rxnfc_get_set_rxfh)))
+  ETHTOOL(ETHTOOL_GRXRINGS, ETHT_RW,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_rxnfc_rss_context)))
+  ETHTOOL(ETHTOOL_GRXCLSRLCNT, ETHT_RW,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_rxnfc_rule_cnt)))
+  ETHTOOL(ETHTOOL_GRXCLSRULE, ETHT_RW,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_rxnfc_rss_context)))
+  ETHTOOL(ETHTOOL_GRXCLSRLALL, ETHT_RW,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_rxnfc_rule_locs)))
+  ETHTOOL(ETHTOOL_SRXFH, ETHT_W,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_rxnfc_get_set_rxfh)))
+  ETHTOOL(ETHTOOL_SRXCLSRLDEL, ETHT_W,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_rxnfc_rss_context)))
+  ETHTOOL(ETHTOOL_SRXCLSRLINS, ETHT_W,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_rxnfc_rss_context)))
+  ETHTOOL(ETHTOOL_FLASHDEV, ETHT_W, MK_PTR(MK_STRUCT(STRUCT_ethtool_flash)))
+  ETHTOOL(ETHTOOL_RESET, ETHT_RW, MK_PTR(MK_STRUCT(STRUCT_ethtool_value)))
+  ETHTOOL(ETHTOOL_GSSET_INFO, ETHT_RW,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_sset_info)))
+  ETHTOOL(ETHTOOL_GRXFHINDIR, ETHT_RW,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_rxfh_indir)))
+  ETHTOOL(ETHTOOL_SRXFHINDIR, ETHT_W,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_rxfh_indir)))
+  ETHTOOL_SPECIAL(ETHTOOL_GRSSH, ETHT_RW, do_ethtool_get_rxfh,
+                  MK_PTR(MK_STRUCT(STRUCT_ethtool_rxfh)))
+  ETHTOOL(ETHTOOL_SRSSH, ETHT_RW,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_rxfh)))
+  ETHTOOL(ETHTOOL_GFEATURES, ETHT_RW,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_gfeatures)))
+  ETHTOOL(ETHTOOL_SFEATURES, ETHT_W,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_sfeatures)))
+  ETHTOOL(ETHTOOL_GTXCSUM, ETHT_R, MK_PTR(MK_STRUCT(STRUCT_ethtool_value)))
+  ETHTOOL(ETHTOOL_GRXCSUM, ETHT_R, MK_PTR(MK_STRUCT(STRUCT_ethtool_value)))
+  ETHTOOL(ETHTOOL_GSG, ETHT_R, MK_PTR(MK_STRUCT(STRUCT_ethtool_value)))
+  ETHTOOL(ETHTOOL_GTSO, ETHT_R, MK_PTR(MK_STRUCT(STRUCT_ethtool_value)))
+  ETHTOOL(ETHTOOL_GGSO, ETHT_R, MK_PTR(MK_STRUCT(STRUCT_ethtool_value)))
+  ETHTOOL(ETHTOOL_GGRO, ETHT_R, MK_PTR(MK_STRUCT(STRUCT_ethtool_value)))
+  ETHTOOL(ETHTOOL_STXCSUM, ETHT_W, MK_PTR(MK_STRUCT(STRUCT_ethtool_value)))
+  ETHTOOL(ETHTOOL_SRXCSUM, ETHT_W, MK_PTR(MK_STRUCT(STRUCT_ethtool_value)))
+  ETHTOOL(ETHTOOL_SSG, ETHT_W, MK_PTR(MK_STRUCT(STRUCT_ethtool_value)))
+  ETHTOOL(ETHTOOL_STSO, ETHT_W, MK_PTR(MK_STRUCT(STRUCT_ethtool_value)))
+  ETHTOOL(ETHTOOL_SGSO, ETHT_W, MK_PTR(MK_STRUCT(STRUCT_ethtool_value)))
+  ETHTOOL(ETHTOOL_SGRO, ETHT_W, MK_PTR(MK_STRUCT(STRUCT_ethtool_value)))
+  ETHTOOL(ETHTOOL_GCHANNELS, ETHT_R, MK_PTR(MK_STRUCT(STRUCT_ethtool_channels)))
+  ETHTOOL(ETHTOOL_SCHANNELS, ETHT_W, MK_PTR(MK_STRUCT(STRUCT_ethtool_channels)))
+  ETHTOOL(ETHTOOL_SET_DUMP, ETHT_W,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_dump_no_data)))
+  ETHTOOL(ETHTOOL_GET_DUMP_FLAG, ETHT_RW,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_dump_no_data)))
+  ETHTOOL(ETHTOOL_GET_DUMP_DATA, ETHT_RW,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_dump)))
+  ETHTOOL(ETHTOOL_GET_TS_INFO, ETHT_R,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_ts_info)))
+  ETHTOOL(ETHTOOL_GMODULEINFO, ETHT_RW,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_modinfo)))
+  ETHTOOL(ETHTOOL_GMODULEEEPROM, ETHT_RW,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_eeprom)))
+  ETHTOOL(ETHTOOL_GTUNABLE, ETHT_RW, MK_PTR(MK_STRUCT(STRUCT_ethtool_tunable)))
+  ETHTOOL(ETHTOOL_STUNABLE, ETHT_W, MK_PTR(MK_STRUCT(STRUCT_ethtool_tunable)))
+  ETHTOOL(ETHTOOL_GPHYSTATS, ETHT_RW, MK_PTR(MK_STRUCT(STRUCT_ethtool_stats)))
+  ETHTOOL(ETHTOOL_PERQUEUE, ETHT_RW,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_per_queue_op)))
+  ETHTOOL(ETHTOOL_GLINKSETTINGS, ETHT_RW,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_link_settings)))
+  ETHTOOL(ETHTOOL_SLINKSETTINGS, ETHT_W,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_link_settings)))
+  ETHTOOL(ETHTOOL_PHY_GTUNABLE, ETHT_RW,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_tunable)))
+  ETHTOOL(ETHTOOL_PHY_STUNABLE, ETHT_W,
+          MK_PTR(MK_STRUCT(STRUCT_ethtool_tunable)))
+  ETHTOOL(ETHTOOL_GFECPARAM, ETHT_R, MK_PTR(MK_STRUCT(STRUCT_ethtool_fecparam)))
+  ETHTOOL(ETHTOOL_GFECPARAM, ETHT_W, MK_PTR(MK_STRUCT(STRUCT_ethtool_fecparam)))
diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h
index 0713ae1311..fd6ac963ec 100644
--- a/linux-user/ioctls.h
+++ b/linux-user/ioctls.h
@@ -238,6 +238,8 @@
   IOCTL(SIOCSIFHWADDR, IOC_W, MK_PTR(MK_STRUCT(STRUCT_sockaddr_ifreq)))
   IOCTL(SIOCGIFTXQLEN, IOC_W | IOC_R, MK_PTR(MK_STRUCT(STRUCT_sockaddr_ifreq)))
   IOCTL(SIOCSIFTXQLEN, IOC_W, MK_PTR(MK_STRUCT(STRUCT_sockaddr_ifreq)))
+  IOCTL_SPECIAL(SIOCETHTOOL, IOC_W | IOC_R, do_ioctl_ethtool,
+                MK_PTR(MK_STRUCT(STRUCT_ptr_ifreq)))
   IOCTL(SIOCGIFMETRIC, IOC_W | IOC_R, MK_PTR(MK_STRUCT(STRUCT_int_ifreq)))
   IOCTL(SIOCSIFMETRIC, IOC_W, MK_PTR(MK_STRUCT(STRUCT_int_ifreq)))
   IOCTL(SIOCGIFMTU, IOC_W | IOC_R, MK_PTR(MK_STRUCT(STRUCT_int_ifreq)))
diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index 5c964389c1..43f00681f8 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -231,6 +231,7 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
                     abi_long arg2, abi_long arg3, abi_long arg4,
                     abi_long arg5, abi_long arg6, abi_long arg7,
                     abi_long arg8);
+abi_long get_errno(abi_long ret);
 extern __thread CPUState *thread_cpu;
 void cpu_loop(CPUArchState *env);
 const char *target_strerror(int err);
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 483f8ce48f..bbdcc5d4db 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -127,6 +127,7 @@
 #include "qapi/error.h"
 #include "fd-trans.h"
 #include "tcg/tcg.h"
+#include "ethtool.h"
 
 #ifndef CLONE_IO
 #define CLONE_IO                0x80000000      /* Clone io context */
@@ -683,7 +684,7 @@ static inline int target_to_host_errno(int err)
     return err;
 }
 
-static inline abi_long get_errno(abi_long ret)
+abi_long get_errno(abi_long ret)
 {
     if (ret == -1)
         return -host_to_target_errno(errno);
@@ -4681,16 +4682,6 @@ static abi_long do_ipc(CPUArchState *cpu_env,
 #endif
 
 /* kernel structure types definitions */
-
-#define STRUCT(name, ...) STRUCT_ ## name,
-#define STRUCT_SPECIAL(name) STRUCT_ ## name,
-enum {
-#include "syscall_types.h"
-STRUCT_MAX
-};
-#undef STRUCT
-#undef STRUCT_SPECIAL
-
 #define STRUCT(name, ...) static const argtype struct_ ## name ## _def[] = {  __VA_ARGS__, TYPE_NULL };
 #define STRUCT_SPECIAL(name)
 #include "syscall_types.h"
@@ -4788,6 +4779,28 @@ static abi_long do_ioctl_fs_ioc_fiemap(const IOCTLEntry *ie, uint8_t *buf_temp,
 }
 #endif
 
+static abi_long do_ioctl_ethtool(const IOCTLEntry *ie, uint8_t *buf_temp,
+                                int fd, int cmd, abi_long arg)
+{
+    const argtype *arg_type = ie->arg_type;
+    int target_size;
+    void *argptr;
+
+    assert(arg_type[0] == TYPE_PTR);
+    assert(ie->access == IOC_RW);
+
+    arg_type++;
+    target_size = thunk_type_size(arg_type, 0);
+
+    argptr = lock_user(VERIFY_READ, arg, target_size, 1);
+    if (!argptr)
+        return -TARGET_EFAULT;
+    thunk_convert(buf_temp, argptr, arg_type, THUNK_HOST);
+    unlock_user(argptr, arg, target_size);
+
+    return dev_ethtool(fd, buf_temp);
+}
+
 static abi_long do_ioctl_ifconf(const IOCTLEntry *ie, uint8_t *buf_temp,
                                 int fd, int cmd, abi_long arg)
 {
diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index 70df1a94fb..5d344d5ffa 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -866,6 +866,8 @@ struct target_rtc_pll_info {
 #define TARGET_SIOCGIFTXQLEN   0x8942          /* Get the tx queue length      */
 #define TARGET_SIOCSIFTXQLEN   0x8943          /* Set the tx queue length      */
 
+#define TARGET_SIOCETHTOOL     0x8946          /* Ethtool interface            */
+
 /* ARP cache control calls. */
 #define TARGET_OLD_SIOCDARP    0x8950          /* old delete ARP table entry   */
 #define TARGET_OLD_SIOCGARP    0x8951          /* old get ARP table entry      */
@@ -2776,4 +2778,14 @@ struct target_statx {
    /* 0x100 */
 };
 
+/* kernel structure types definitions */
+#define STRUCT(name, ...) STRUCT_ ## name,
+#define STRUCT_SPECIAL(name) STRUCT_ ## name,
+enum {
+#include "syscall_types.h"
+STRUCT_MAX
+};
+#undef STRUCT
+#undef STRUCT_SPECIAL
+
 #endif
diff --git a/linux-user/syscall_types.h b/linux-user/syscall_types.h
index 3f1f033464..e920086e66 100644
--- a/linux-user/syscall_types.h
+++ b/linux-user/syscall_types.h
@@ -1,3 +1,4 @@
+
 STRUCT_SPECIAL(termios)
 
 STRUCT(winsize,
@@ -464,3 +465,279 @@ STRUCT(usbdevfs_disconnect_claim,
         TYPE_INT, /* flags */
         MK_ARRAY(TYPE_CHAR, USBDEVFS_MAXDRIVERNAME + 1)) /* driver */
 #endif /* CONFIG_USBFS */
+
+/* ethtool ioctls */
+STRUCT(ethtool_cmd,
+       TYPE_INT,   /* cmd */
+       TYPE_INT,   /* supported */
+       TYPE_INT,   /* advertising */
+       TYPE_SHORT, /* speed */
+       TYPE_CHAR,  /* duplex */
+       TYPE_CHAR,  /* port */
+       TYPE_CHAR,  /* phy_address */
+       TYPE_CHAR,  /* transceiver */
+       TYPE_CHAR,  /* autoneg */
+       TYPE_CHAR,  /* mdio_support */
+       TYPE_INT,   /* maxtxpkt */
+       TYPE_INT,   /* maxrxpkt */
+       TYPE_SHORT, /* speed_hi */
+       TYPE_CHAR,  /* eth_tp_mdix */
+       TYPE_CHAR,  /* eth_tp_mdix_ctrl */
+       TYPE_INT,   /* lp_advertising */
+       MK_ARRAY(TYPE_INT, 2)) /* reserved */
+
+STRUCT(ethtool_drvinfo,
+       TYPE_INT, /* cmd */
+       MK_ARRAY(TYPE_CHAR, 32), /* driver */
+       MK_ARRAY(TYPE_CHAR, 32), /* version */
+       MK_ARRAY(TYPE_CHAR, 32), /* fw_version[ETHTOOL_FWVERS_LEN] */
+       MK_ARRAY(TYPE_CHAR, 32), /* bus_info[ETHTOOL_BUSINFO_LEN] */
+       MK_ARRAY(TYPE_CHAR, 32), /* erom_version[ETHTOOL_EROMVERS_LEN] */
+       MK_ARRAY(TYPE_CHAR, 12), /* reserved2 */
+       TYPE_INT, /* n_priv_flags */
+       TYPE_INT, /* n_stats */
+       TYPE_INT, /* testinfo_len */
+       TYPE_INT, /* eedump_len */
+       TYPE_INT) /* regdump_len */
+
+STRUCT(ethtool_regs,
+       TYPE_INT, /* cmd */
+       TYPE_INT, /* version */
+       TYPE_INT, /* len */
+       MK_FLEXIBLE_ARRAY(TYPE_CHAR, 2)) /* data[0]: len */
+
+STRUCT(ethtool_wolinfo,
+       TYPE_INT, /* cmd */
+       TYPE_INT, /* supported */
+       TYPE_INT, /* wolopts */
+       MK_ARRAY(TYPE_CHAR,  6)) /* sopass[SOPASS_MAX] */
+
+STRUCT(ethtool_value,
+       TYPE_INT, /* cmd */
+       TYPE_INT) /* data */
+
+STRUCT(ethtool_eee,
+       TYPE_INT, /* cmd */
+       TYPE_INT, /* supported */
+       TYPE_INT, /* advertised */
+       TYPE_INT, /* lp_advertised */
+       TYPE_INT, /* eee_active */
+       TYPE_INT, /* eee_enabled */
+       TYPE_INT, /* tx_lpi_enabled */
+       TYPE_INT, /* tx_lpi_timer */
+       MK_ARRAY(TYPE_INT, 2)) /* reserved */
+
+STRUCT(ethtool_eeprom,
+       TYPE_INT, /* cmd */
+       TYPE_INT, /* magic */
+       TYPE_INT, /* offset */
+       TYPE_INT, /* len */
+       MK_FLEXIBLE_ARRAY(TYPE_CHAR, 3)) /* data[0]: len */
+
+STRUCT(ethtool_coalesce,
+       TYPE_INT, /* cmd */
+       TYPE_INT, /* rx_coalesce_usecs */
+       TYPE_INT, /* rx_max_coalesced_frames */
+       TYPE_INT, /* rx_coalesce_usecs_irq */
+       TYPE_INT, /* rx_max_coalesced_frames_irq */
+       TYPE_INT, /* tx_coalesce_usecs */
+       TYPE_INT, /* tx_max_coalesced_frames */
+       TYPE_INT, /* tx_coalesce_usecs_irq */
+       TYPE_INT, /* tx_max_coalesced_frames_irq */
+       TYPE_INT, /* stats_block_coalesce_usecs */
+       TYPE_INT, /* use_adaptive_rx_coalesce */
+       TYPE_INT, /* use_adaptive_tx_coalesce */
+       TYPE_INT, /* pkt_rate_low */
+       TYPE_INT, /* rx_coalesce_usecs_low */
+       TYPE_INT, /* rx_max_coalesced_frames_low */
+       TYPE_INT, /* tx_coalesce_usecs_low */
+       TYPE_INT, /* tx_max_coalesced_frames_low */
+       TYPE_INT, /* pkt_rate_high */
+       TYPE_INT, /* rx_coalesce_usecs_high */
+       TYPE_INT, /* rx_max_coalesced_frames_high */
+       TYPE_INT, /* tx_coalesce_usecs_high */
+       TYPE_INT, /* tx_max_coalesced_frames_high */
+       TYPE_INT) /* rate_sample_interval */
+
+STRUCT(ethtool_ringparam,
+       TYPE_INT, /* cmd */
+       TYPE_INT, /* rx_max_pending */
+       TYPE_INT, /* rx_mini_max_pending */
+       TYPE_INT, /* rx_jumbo_max_pending */
+       TYPE_INT, /* tx_max_pending */
+       TYPE_INT, /* rx_pending */
+       TYPE_INT, /* rx_mini_pending */
+       TYPE_INT, /* rx_jumbo_pending */
+       TYPE_INT) /* tx_pending */
+
+STRUCT(ethtool_pauseparam,
+       TYPE_INT, /* cmd */
+       TYPE_INT, /* autoneg */
+       TYPE_INT, /* rx_pause */
+       TYPE_INT) /* tx_pause */
+
+STRUCT(ethtool_test,
+       TYPE_INT, /* cmd */
+       TYPE_INT, /* flags */
+       TYPE_INT, /* reserved */
+       TYPE_INT, /* len */
+       MK_FLEXIBLE_ARRAY(TYPE_LONGLONG, 3)) /* data[0]: len */
+
+STRUCT(ethtool_gstrings,
+       TYPE_INT, /* cmd */
+       TYPE_INT, /* string_set */
+       TYPE_INT, /* len */
+       MK_FLEXIBLE_ARRAY(MK_ARRAY(TYPE_CHAR, 32), 2))
+       /* data[0]: len * ETH_GSTRING_LEN */
+
+STRUCT(ethtool_stats,
+       TYPE_INT, /* cmd */
+       TYPE_INT, /* n_stats */
+       MK_FLEXIBLE_ARRAY(TYPE_LONGLONG, 1)) /* data[0]: n_stats */
+
+STRUCT(ethtool_perm_addr,
+       TYPE_INT, /* cmd */
+       TYPE_INT, /* size */
+       MK_FLEXIBLE_ARRAY(TYPE_CHAR, 1)) /* data[0]: size */
+
+STRUCT(ethtool_flow_ext,
+       MK_ARRAY(TYPE_CHAR, 2), /* padding */
+       MK_ARRAY(TYPE_CHAR, 6), /* h_dest[ETH_ALEN] */
+       MK_ARRAY(TYPE_CHAR, 2), /* __be16 vlan_etype */
+       MK_ARRAY(TYPE_CHAR, 2), /* __be16 vlan_tci */
+       MK_ARRAY(TYPE_CHAR, 8)) /* __be32 data[2] */
+
+/* Union ethtool_flow_union contains alternatives that are either struct that
+ * only uses __be* types or char/__u8, or "__u8 hdata[52]". We can treat it as
+ * byte array in all cases.
+ */
+STRUCT(ethtool_rx_flow_spec,
+       TYPE_INT,                           /* flow_type */
+       MK_ARRAY(TYPE_CHAR, 52),            /* union ethtool_flow_union h_u */
+       MK_STRUCT(STRUCT_ethtool_flow_ext), /* h_ext */
+       MK_ARRAY(TYPE_CHAR, 52),            /* union ethtool_flow_union m_u */
+       MK_STRUCT(STRUCT_ethtool_flow_ext), /* m_ext */
+       TYPE_LONGLONG,                      /* ring_cookie */
+       TYPE_INT)                           /* location */
+
+STRUCT(ethtool_rxnfc_rss_context,
+       TYPE_INT, /* cmd */
+       TYPE_INT, /* flow_type */
+       TYPE_LONGLONG, /* data */
+       MK_STRUCT(STRUCT_ethtool_rx_flow_spec), /* fs */
+       TYPE_INT) /* rss_context */
+
+STRUCT(ethtool_rxnfc_rule_cnt,
+       TYPE_INT, /* cmd */
+       TYPE_INT, /* flow_type */
+       TYPE_LONGLONG, /* data */
+       MK_STRUCT(STRUCT_ethtool_rx_flow_spec), /* fs */
+       TYPE_INT) /* rss_cnt */
+
+STRUCT(ethtool_rxnfc_rule_locs,
+       TYPE_INT, /* cmd */
+       TYPE_INT, /* flow_type */
+       TYPE_LONGLONG, /* data */
+       MK_STRUCT(STRUCT_ethtool_rx_flow_spec), /* fs */
+       TYPE_INT, /* rss_cnt */
+       MK_FLEXIBLE_ARRAY(TYPE_INT, 4)) /* rule_locs[0]: rss_cnt */
+
+/* For ETHTOOL_{G,S}RXFH, originally only the first three fields are defined,
+ * but with certain options, more fields are used.
+ */
+STRUCT_SPECIAL(ethtool_rxnfc_get_set_rxfh)
+
+STRUCT(ethtool_flash,
+       TYPE_INT, /* cmd */
+       TYPE_INT, /* region */
+       MK_ARRAY(TYPE_CHAR, 128)) /* data[ETHTOOL_FLASH_MAX_FILENAME] */
+
+STRUCT_SPECIAL(ethtool_sset_info)
+
+STRUCT(ethtool_rxfh_indir,
+       TYPE_INT, /* cmd */
+       TYPE_INT, /* size */
+       MK_FLEXIBLE_ARRAY(TYPE_INT, 1)) /* ring_index[0]: size */
+
+STRUCT_SPECIAL(ethtool_rxfh)
+
+STRUCT(ethtool_get_features_block,
+       TYPE_INT, /* available */
+       TYPE_INT, /* requested */
+       TYPE_INT, /* active */
+       TYPE_INT) /* never_changed */
+
+STRUCT(ethtool_gfeatures,
+       TYPE_INT, /* cmd */
+       TYPE_INT, /* size */
+       MK_FLEXIBLE_ARRAY(MK_STRUCT(STRUCT_ethtool_get_features_block), 1))
+       /* features[0]: size */
+
+STRUCT(ethtool_set_features_block,
+       TYPE_INT, /* valid */
+       TYPE_INT) /* requested */
+
+STRUCT(ethtool_sfeatures,
+       TYPE_INT, /* cmd */
+       TYPE_INT, /* size */
+       MK_FLEXIBLE_ARRAY(MK_STRUCT(STRUCT_ethtool_set_features_block), 1))
+       /* features[0]: size */
+
+STRUCT(ethtool_channels,
+       TYPE_INT, /* cmd */
+       TYPE_INT, /* max_rx */
+       TYPE_INT, /* max_tx */
+       TYPE_INT, /* max_other */
+       TYPE_INT, /* max_combined */
+       TYPE_INT, /* rx_count */
+       TYPE_INT, /* tx_count */
+       TYPE_INT, /* other_count */
+       TYPE_INT) /* combined_count */
+
+/* For ETHTOOL_SET_DUMP and ETHTOOL_GET_DUMP_FLAG, the flexible array `data` is
+ * not used.
+ */
+STRUCT(ethtool_dump_no_data,
+       TYPE_INT, /* cmd */
+       TYPE_INT, /* version */
+       TYPE_INT, /* flag */
+       TYPE_INT) /* len */
+
+STRUCT(ethtool_dump,
+       TYPE_INT, /* cmd */
+       TYPE_INT, /* version */
+       TYPE_INT, /* flag */
+       TYPE_INT, /* len */
+       MK_FLEXIBLE_ARRAY(TYPE_CHAR, 3)) /* data[0]: len */
+
+STRUCT(ethtool_ts_info,
+       TYPE_INT, /* cmd */
+       TYPE_INT, /* so_timestamping */
+       TYPE_INT, /* phc_index */
+       TYPE_INT, /* tx_types */
+       MK_ARRAY(TYPE_INT, 3), /* tx_reserved */
+       TYPE_INT, /* rx_filters */
+       MK_ARRAY(TYPE_INT, 3)) /* rx_reserved */
+
+STRUCT(ethtool_modinfo,
+       TYPE_INT, /* cmd */
+       TYPE_INT, /* type */
+       TYPE_INT, /* eeprom_len */
+       MK_ARRAY(TYPE_INT, 8)) /* reserved */
+
+STRUCT(ethtool_tunable,
+       TYPE_INT, /* cmd */
+       TYPE_INT, /* id */
+       TYPE_INT, /* type_id */
+       TYPE_INT, /* len */
+       MK_FLEXIBLE_ARRAY(TYPE_PTRVOID, 3)) /* data[0]: len */
+
+STRUCT_SPECIAL(ethtool_link_settings)
+
+STRUCT(ethtool_fecparam,
+       TYPE_INT, /* cmd */
+       TYPE_INT, /* active_fec */
+       TYPE_INT, /* fec */
+       TYPE_INT) /* reserved */
+
+STRUCT_SPECIAL(ethtool_per_queue_op)
diff --git a/tests/tcg/multiarch/ethtool.c b/tests/tcg/multiarch/ethtool.c
new file mode 100644
index 0000000000..54b2337aa7
--- /dev/null
+++ b/tests/tcg/multiarch/ethtool.c
@@ -0,0 +1,417 @@
+#include <asm-generic/errno.h>
+#include <assert.h>
+#include <errno.h>
+#include <inttypes.h>
+#include <linux/ethtool.h>
+#include <linux/if.h>
+#include <linux/sockios.h>
+#include <netinet/in.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/ioctl.h>
+#include <sys/socket.h>
+
+const int number_of_entries_to_print = 10;
+const uint32_t protected_memory_pattern[] = {
+      0xdeadc0de, 0xb0bb1e, 0xfacade, 0xfeeb1e };
+
+static void fail_with(const char *action, const char *cmd_name, int cmd, int err) {
+    if (errno == EOPNOTSUPP) {
+        printf("Unsupported operation: %s; errno = %d: %s.\n"
+               "TEST SKIPPED (%s = 0x%x).\n",
+               action, err, strerror(err), cmd_name, cmd);
+        return;
+    }
+    if (err) {
+        fprintf(stderr,
+                "Failed to %s (%s = 0x%x): errno = %d: %s\n",
+                action, cmd_name, cmd, err, strerror(err));
+    } else {
+        fprintf(stderr,
+                "Failed to %s (%s = 0x%x): no errno\n",
+                action, cmd_name, cmd);
+    }
+    exit(err);
+}
+#define FAIL(action,cmd) fail_with(action, #cmd, cmd, errno)
+
+/* `calloc_protected` and `protected_memory_changed` can be used to verify that
+ * a system call does not write pass intended memory boundary.
+ *
+ * `ptr = calloc_protected(n)` will allocate extra memory after `n` bytes and
+ * populate it with a memory pattern. The first `n` bytes are still guaranteed
+ * to be zeroed out like `calloc(1, n)`. `protected_memory_changed(ptr, n)`
+ * takes the pointer and the original size `n` and checks that the memory
+ * pattern is intact.
+ */
+uint8_t *calloc_protected(size_t struct_size)
+{
+    uint8_t *buf = (uint8_t *) calloc(
+        1,
+        struct_size + sizeof(protected_memory_pattern));
+    memcpy(buf + struct_size, protected_memory_pattern,
+           sizeof(protected_memory_pattern));
+    return buf;
+}
+
+bool protected_memory_changed(const uint8_t *ptr, size_t struct_size)
+{
+    return memcmp(ptr + struct_size, protected_memory_pattern,
+                  sizeof(protected_memory_pattern)) != 0;
+}
+
+void print_entries(const char *fmt, int len, uint32_t *entries)
+{
+    int i;
+    for (i = 0; i < len && i < number_of_entries_to_print; ++i) {
+        printf(fmt, entries[i]);
+    }
+    if (len > number_of_entries_to_print) {
+        printf(" (%d more omitted)", len - number_of_entries_to_print);
+    }
+}
+
+void basic_test(int socketfd, struct ifreq ifr)
+{
+   struct ethtool_drvinfo drvinfo;
+   drvinfo.cmd = ETHTOOL_GDRVINFO;
+   ifr.ifr_data = (void *)&drvinfo;
+   if (ioctl(socketfd, SIOCETHTOOL, &ifr) == -1) {
+       FAIL("get driver info", ETHTOOL_GDRVINFO);
+       return;
+   }
+   printf("Driver: %s (version %s)\n", drvinfo.driver, drvinfo.version);
+}
+
+/* Test flexible array. */
+void test_get_stats(int socketfd, struct ifreq ifr, int n_stats)
+{
+    int i;
+    struct ethtool_stats *stats = (struct ethtool_stats *)calloc(
+        1, sizeof(*stats) + sizeof(stats->data[0]) * n_stats);
+    stats->cmd = ETHTOOL_GSTATS;
+    stats->n_stats = n_stats;
+    ifr.ifr_data = (void *)stats;
+    if (ioctl(socketfd, SIOCETHTOOL, &ifr) == -1) {
+        FAIL("get statastics", ETHTOOL_GSTATS);
+        free(stats);
+        return;
+    }
+    if (stats->n_stats != n_stats) {
+        FAIL("get consistent number of statistics", ETHTOOL_GSTATS);
+    }
+    for (i = 0; i < stats->n_stats && i < number_of_entries_to_print; ++i) {
+        printf("stats[%d] = %llu\n", i, (unsigned long long)stats->data[i]);
+    }
+    if (stats->n_stats > number_of_entries_to_print) {
+        printf("(%d more omitted)\n",
+               stats->n_stats - number_of_entries_to_print);
+    }
+    free(stats);
+}
+
+/* Test flexible array with char array as elements. */
+void test_get_strings(int socketfd, struct ifreq ifr, int n_stats)
+{
+    int i;
+    struct ethtool_gstrings *gstrings =
+        (struct ethtool_gstrings *)calloc(
+            1, sizeof(*gstrings) + ETH_GSTRING_LEN * n_stats);
+    gstrings->cmd = ETHTOOL_GSTRINGS;
+    gstrings->string_set = ETH_SS_STATS;
+    gstrings->len = n_stats;
+    ifr.ifr_data = (void *)gstrings;
+    if (ioctl(socketfd, SIOCETHTOOL, &ifr) == -1) {
+        FAIL("get string set", ETHTOOL_GSTRINGS);
+        free(gstrings);
+        return;
+    }
+    if (gstrings->len != n_stats) {
+        FAIL("get consistent number of statistics", ETHTOOL_GSTRINGS);
+    }
+    for (i = 0; i < gstrings->len && i < number_of_entries_to_print; ++i) {
+        printf("stat_names[%d] = %.*s\n",
+               i, ETH_GSTRING_LEN, gstrings->data + i * ETH_GSTRING_LEN);
+    }
+    if (gstrings->len > number_of_entries_to_print) {
+        printf("(%d more omitted)\n",
+               gstrings->len - number_of_entries_to_print);
+    }
+    free(gstrings);
+}
+
+/* Testing manual implementation of converting `struct ethtool_sset_info`, also
+ * info for subsequent tests.
+ */
+int test_get_sset_info(int socketfd, struct ifreq ifr)
+{
+    const int n_sset = 2;
+    int n_stats;
+    struct ethtool_sset_info *sset_info =
+        (struct ethtool_sset_info *)calloc(
+            1, sizeof(*sset_info) + sizeof(sset_info->data[0]) * n_sset);
+    sset_info->cmd = ETHTOOL_GSSET_INFO;
+    sset_info->sset_mask = 1 << ETH_SS_TEST | 1 << ETH_SS_STATS;
+    assert(__builtin_popcount(sset_info->sset_mask) == n_sset);
+    ifr.ifr_data = (void *)sset_info;
+    if (ioctl(socketfd, SIOCETHTOOL, &ifr) == -1) {
+        fail_with("get string set info", "ETHTOOL_GSSET_INFO",
+                  ETHTOOL_GSSET_INFO, errno);
+        free(sset_info);
+        return 0;
+    }
+    if ((sset_info->sset_mask & (1 << ETH_SS_STATS)) == 0) {
+        puts("No stats string set info, SKIPPING dependent tests");
+        free(sset_info);
+        return 0;
+    }
+    n_stats = (sset_info->sset_mask & (1 << ETH_SS_TEST)) ?
+        sset_info->data[1] :
+        sset_info->data[0];
+    printf("n_stats = %d\n", n_stats);
+    free(sset_info);
+    return n_stats;
+}
+
+/* Test manual implementation of converting `struct ethtool_rxnfc`, focusing on
+ * the case where only the first three fields are present. (The original struct
+ * definition.)
+ */
+void test_get_rxfh(int socketfd, struct ifreq ifr)
+{
+    struct ethtool_rxnfc *rxnfc;
+    const int rxnfc_first_three_field_size =
+        sizeof(rxnfc->cmd) + sizeof(rxnfc->flow_type) + sizeof(rxnfc->data);
+    rxnfc = (struct ethtool_rxnfc *)calloc_protected(
+        rxnfc_first_three_field_size);
+    rxnfc->cmd = ETHTOOL_GRXFH;
+    rxnfc->flow_type = TCP_V4_FLOW;
+    ifr.ifr_data = (void *)rxnfc;
+    if (ioctl(socketfd, SIOCETHTOOL, &ifr) == -1) {
+        FAIL("get RX flow classification rules", ETHTOOL_GRXFH);
+        free(rxnfc);
+        return;
+    }
+    if (protected_memory_changed((const uint8_t *)rxnfc,
+                                 rxnfc_first_three_field_size)) {
+        FAIL("preserve memory after the first three fields", ETHTOOL_GRXFH);
+    }
+    printf("Flow hash bitmask (flow_type = TCP v4): 0x%llx\n",
+           (unsigned long long)rxnfc->data);
+    free(rxnfc);
+}
+
+/* Test manual implementation of converting `struct ethtool_link_settings`. */
+void test_get_link_settings(int socketfd, struct ifreq ifr)
+{
+    int link_mode_masks_nwords;
+    struct ethtool_link_settings *link_settings_header =
+        (struct ethtool_link_settings *) calloc_protected(
+            sizeof(*link_settings_header));
+    link_settings_header->cmd = ETHTOOL_GLINKSETTINGS;
+    link_settings_header->link_mode_masks_nwords = 0;
+    ifr.ifr_data = (void*)link_settings_header;
+    if (ioctl(socketfd, SIOCETHTOOL, &ifr) == -1) {
+        FAIL("get link settings mask sizes", ETHTOOL_GLINKSETTINGS);
+        free(link_settings_header);
+        return;
+    }
+    if (protected_memory_changed((const uint8_t *)link_settings_header,
+                                 sizeof(*link_settings_header))) {
+        FAIL("preserve link_mode_masks", ETHTOOL_GLINKSETTINGS);
+    }
+    if (link_settings_header->link_mode_masks_nwords >= 0) {
+        FAIL("complete handshake", ETHTOOL_GLINKSETTINGS);
+    }
+    link_mode_masks_nwords = -link_settings_header->link_mode_masks_nwords;
+
+    struct ethtool_link_settings *link_settings =
+        (struct ethtool_link_settings *)calloc(
+            1,
+            sizeof(*link_settings) +
+            sizeof(link_settings_header->link_mode_masks[0]) *
+            link_mode_masks_nwords * 3);
+    link_settings->cmd = ETHTOOL_GLINKSETTINGS;
+    link_settings->link_mode_masks_nwords = link_mode_masks_nwords;
+    ifr.ifr_data = (void *)link_settings;
+    if (ioctl(socketfd, SIOCETHTOOL, &ifr) == -1) {
+        FAIL("get link settings", ETHTOOL_GLINKSETTINGS);
+        free(link_settings_header);
+        free(link_settings);
+        return;
+    }
+    if (link_settings->link_mode_masks_nwords != link_mode_masks_nwords) {
+        FAIL("have consistent number of mode masks", ETHTOOL_GLINKSETTINGS);
+    }
+
+    printf("Link speed: %d MB\n", link_settings->speed);
+    printf("Number of link mode masks: %d\n",
+           link_settings->link_mode_masks_nwords);
+    if (link_settings->link_mode_masks_nwords > 0) {
+        printf("Supported bitmap:");
+        print_entries(" 0x%08x",
+                      link_settings->link_mode_masks_nwords,
+                      link_settings->link_mode_masks);
+        putchar('\n');
+
+        printf("Advertising bitmap:");
+        print_entries(" 0x%08x",
+                      link_settings->link_mode_masks_nwords,
+                      link_settings->link_mode_masks +
+                      link_settings->link_mode_masks_nwords);
+        putchar('\n');
+
+        printf("Lp advertising bitmap:");
+        print_entries(" 0x%08x",
+                      link_settings->link_mode_masks_nwords,
+                      link_settings->link_mode_masks +
+                      2 * link_settings->link_mode_masks_nwords);
+        putchar('\n');
+    }
+
+    free(link_settings_header);
+    free(link_settings);
+}
+
+/* Test manual implementation of converting `struct ethtool_per_queue_op`. */
+void test_perqueue(int socketfd, struct ifreq ifr)
+{
+    const int n_queue = 2;
+    int i;
+    struct ethtool_per_queue_op *per_queue_op =
+        (struct ethtool_per_queue_op *)calloc(
+            1,
+            sizeof(*per_queue_op) + sizeof(struct ethtool_coalesce) * n_queue);
+    per_queue_op->cmd = ETHTOOL_PERQUEUE;
+    per_queue_op->sub_command = ETHTOOL_GCOALESCE;
+    per_queue_op->queue_mask[0] = 0x3;
+    ifr.ifr_data = (void *)per_queue_op;
+    if (ioctl(socketfd, SIOCETHTOOL, &ifr) == -1) {
+      FAIL("get coalesce per queue", ETHTOOL_PERQUEUE);
+      free(per_queue_op);
+      return;
+    }
+    for (i = 0; i < n_queue; ++i) {
+        struct ethtool_coalesce *coalesce = (struct ethtool_coalesce *)(
+            per_queue_op->data + sizeof(*coalesce) * i);
+        if (coalesce->cmd != ETHTOOL_GCOALESCE) {
+            fprintf(stderr,
+                    "ETHTOOL_PERQUEUE (%d) sub_command ETHTOOL_GCOALESCE (%d) "
+                    "fails to set entry %d's cmd to ETHTOOL_GCOALESCE, got %d "
+                    "instead\n",
+                    ETHTOOL_PERQUEUE, ETHTOOL_GCOALESCE, i,
+                    coalesce->cmd);
+            exit(-1);
+        }
+        printf("rx_coalesce_usecs[%d] = %u\nrx_max_coalesced_frames[%d] = %u\n",
+               i, coalesce->rx_coalesce_usecs,
+               i, coalesce->rx_max_coalesced_frames);
+    }
+
+    free(per_queue_op);
+}
+
+/* Test manual implementation of ETHTOOL_GRSSH. */
+void test_get_rssh(int socketfd, struct ifreq ifr)
+{
+    int i;
+    struct ethtool_rxfh *rxfh_header =
+        (struct ethtool_rxfh *)calloc_protected(sizeof(*rxfh_header));
+    rxfh_header->cmd = ETHTOOL_GRSSH;
+    ifr.ifr_data = (void *)rxfh_header;
+    if (ioctl(socketfd, SIOCETHTOOL, &ifr) == -1) {
+        FAIL("get RX flow hash indir and hash key size", ETHTOOL_GRSSH);
+        free(rxfh_header);
+        return;
+    }
+    if (protected_memory_changed((const uint8_t *)rxfh_header,
+                                 sizeof(*rxfh_header))) {
+        FAIL("preserve rss_config", ETHTOOL_GRSSH);
+    }
+    printf("RX flow hash indir size = %d\nRX flow hash key size = %d\n",
+           rxfh_header->indir_size, rxfh_header->key_size);
+
+    struct ethtool_rxfh *rxfh = (struct ethtool_rxfh *)calloc(
+        1,
+        sizeof(*rxfh) + 4 * rxfh_header->indir_size + rxfh_header->key_size);
+    *rxfh = *rxfh_header;
+    ifr.ifr_data = (void*)rxfh;
+    if (ioctl(socketfd, SIOCETHTOOL, &ifr) == -1) {
+        FAIL("get RX flow hash indir and hash key", ETHTOOL_GRSSH);
+        free(rxfh_header);
+        free(rxfh);
+        return;
+    }
+
+    if (rxfh->indir_size == 0) {
+        printf("No RX flow hash indir\n");
+    } else {
+        printf("RX flow hash indir:");
+        print_entries(" 0x%08x", rxfh->indir_size, rxfh->rss_config);
+        putchar('\n');
+    }
+
+    if (rxfh->key_size == 0) {
+        printf("No RX flow hash key\n");
+    } else {
+        char *key = (char *)(rxfh->rss_config + rxfh->indir_size);
+        printf("RX flow hash key:");
+        for (i = 0;  i < rxfh->key_size; ++i) {
+            if (i % 2 == 0)
+                putchar(' ');
+            printf("%02hhx", key[i]);
+        }
+        putchar('\n');
+    }
+    free(rxfh_header);
+    free(rxfh);
+}
+
+int main(int argc, char **argv)
+{
+    int socketfd, n_stats, i;
+    struct ifreq ifr;
+
+    socketfd = socket(AF_INET, SOCK_DGRAM, 0);
+    if (socketfd == -1) {
+        int err = errno;
+        fprintf(stderr,
+                "Failed to open socket: errno = %d: %s\n",
+                err, strerror(err));
+        return err;
+    }
+
+    for (i = 1;; ++i) {
+        ifr.ifr_ifindex = i;
+        if (ioctl(socketfd, SIOCGIFNAME, &ifr) == -1) {
+            puts("Could not find a non-loopback interface, SKIPPING");
+            return 0;
+        }
+        if (strncmp(ifr.ifr_name, "lo", IFNAMSIZ) != 0) {
+            break;
+        }
+    }
+    printf("Interface index: %d\nInterface name: %.*s\n",
+           ifr.ifr_ifindex, IFNAMSIZ, ifr.ifr_name);
+
+    basic_test(socketfd, ifr);
+
+    n_stats = test_get_sset_info(socketfd, ifr);
+    if (n_stats > 0) {
+        /* Testing lexible arrays. */
+        test_get_stats(socketfd, ifr, n_stats);
+        test_get_strings(socketfd, ifr, n_stats);
+    }
+
+    /* Testing manual implementations of structure convertions. */
+    test_get_rxfh(socketfd, ifr);
+    test_get_link_settings(socketfd, ifr);
+    test_perqueue(socketfd, ifr);
+
+    /* Testing manual implementations of operations. */
+    test_get_rssh(socketfd, ifr);
+
+    return 0;
+}
-- 
2.28.0.rc0.105.gf9edc3c819-goog



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/6] fcntl, sockopt, and ioctl options
  2020-07-23  0:19 [PATCH 0/6] fcntl, sockopt, and ioctl options Shu-Chun Weng
                   ` (5 preceding siblings ...)
  2020-07-23  0:19 ` [PATCH 6/6] linux-user: Add support for SIOCETHTOOL ioctl Shu-Chun Weng
@ 2020-08-05 23:22 ` Shu-Chun Weng
  6 siblings, 0 replies; 11+ messages in thread
From: Shu-Chun Weng @ 2020-08-05 23:22 UTC (permalink / raw)
  To: qemu-devel; +Cc: Laurent Vivier


[-- Attachment #1.1: Type: text/plain, Size: 2890 bytes --]

Ping! Patchew: https://patchew.org/QEMU/cover.1595461447.git.scw@google.com/

On Wed, Jul 22, 2020 at 5:19 PM Shu-Chun Weng <scw@google.com> wrote:

> Hi Laurent,
>
> This is a series of 6 patches in 4 groups, putting into a single thread for
> easier tracking.
>
> [PATCH 1/6] linux-user: Support F_ADD_SEALS and F_GET_SEALS fcntls
>   An incidental follow up on
>   https://lists.nongnu.org/archive/html/qemu-devel/2019-09/msg01925.html
>
> [PATCH 2/6] linux-user: add missing UDP and IPv6 get/setsockopt
>   Updated
> https://lists.nongnu.org/archive/html/qemu-devel/2019-09/msg01317.html
>   to consistently add them in get/setsockopt
>
> [PATCH 3/6] linux-user: Update SO_TIMESTAMP to SO_TIMESTAMP_OLD/NEW
> [PATCH 4/6] linux-user: setsockopt() SO_TIMESTAMPNS and SO_TIMESTAMPING
>   Updated
> https://lists.nongnu.org/archive/html/qemu-devel/2019-09/msg01319.html
>   to only use TARGET_SO_*_OLD/NEW
>
> [PATCH 5/6] thunk: supports flexible arrays
> [PATCH 6/6] linux-user: Add support for SIOCETHTOOL ioctl
>   Updated
> https://lists.nongnu.org/archive/html/qemu-devel/2019-08/msg05090.html
>
> Shu-Chun Weng (6):
>   linux-user: Support F_ADD_SEALS and F_GET_SEALS fcntls
>   linux-user: add missing UDP and IPv6 get/setsockopt options
>   linux-user: Update SO_TIMESTAMP to SO_TIMESTAMP_OLD/NEW
>   linux-user: setsockopt() SO_TIMESTAMPNS and SO_TIMESTAMPING
>   thunk: supports flexible arrays
>   linux-user: Add support for SIOCETHTOOL ioctl
>
>  include/exec/user/thunk.h              |  20 +
>  linux-user/Makefile.objs               |   3 +-
>  linux-user/alpha/sockbits.h            |  21 +-
>  linux-user/ethtool.c                   | 819 +++++++++++++++++++++++++
>  linux-user/ethtool.h                   |  19 +
>  linux-user/ethtool_entries.h           | 107 ++++
>  linux-user/fd-trans.h                  |  41 +-
>  linux-user/generic/sockbits.h          |  17 +-
>  linux-user/hppa/sockbits.h             |  20 +-
>  linux-user/ioctls.h                    |   2 +
>  linux-user/mips/sockbits.h             |  16 +-
>  linux-user/qemu.h                      |   1 +
>  linux-user/sparc/sockbits.h            |  21 +-
>  linux-user/strace.c                    |  19 +-
>  linux-user/syscall.c                   | 233 ++++++-
>  linux-user/syscall_defs.h              |  26 +-
>  linux-user/syscall_types.h             | 277 +++++++++
>  tests/tcg/multiarch/ethtool.c          | 417 +++++++++++++
>  tests/tcg/multiarch/socket_timestamp.c | 542 ++++++++++++++++
>  thunk.c                                | 151 ++++-
>  20 files changed, 2706 insertions(+), 66 deletions(-)
>  create mode 100644 linux-user/ethtool.c
>  create mode 100644 linux-user/ethtool.h
>  create mode 100644 linux-user/ethtool_entries.h
>  create mode 100644 tests/tcg/multiarch/ethtool.c
>  create mode 100644 tests/tcg/multiarch/socket_timestamp.c
>
> --
> 2.28.0.rc0.105.gf9edc3c819-goog
>
>

[-- Attachment #1.2: Type: text/html, Size: 4122 bytes --]

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 3844 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/6] linux-user: Support F_ADD_SEALS and F_GET_SEALS fcntls
  2020-07-23  0:19 ` [PATCH 1/6] linux-user: Support F_ADD_SEALS and F_GET_SEALS fcntls Shu-Chun Weng
@ 2020-08-07 11:44   ` Laurent Vivier
  0 siblings, 0 replies; 11+ messages in thread
From: Laurent Vivier @ 2020-08-07 11:44 UTC (permalink / raw)
  To: Shu-Chun Weng, qemu-devel

Le 23/07/2020 à 02:19, Shu-Chun Weng a écrit :
> Signed-off-by: Shu-Chun Weng <scw@google.com>
> ---
>  linux-user/syscall.c      | 10 ++++++++++
>  linux-user/syscall_defs.h | 14 ++++++++------
>  2 files changed, 18 insertions(+), 6 deletions(-)
> 
> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> index 1211e759c2..f97337b0b4 100644
> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -6312,6 +6312,14 @@ static int target_to_host_fcntl_cmd(int cmd)
>      case TARGET_F_GETPIPE_SZ:
>          ret = F_GETPIPE_SZ;
>          break;
> +#endif
> +#ifdef F_ADD_SEALS
> +    case TARGET_F_ADD_SEALS:
> +        ret = F_ADD_SEALS;
> +        break;
> +    case TARGET_F_GET_SEALS:
> +        ret = F_GET_SEALS;
> +        break;
>  #endif
>      default:
>          ret = -TARGET_EINVAL;
> @@ -6598,6 +6606,8 @@ static abi_long do_fcntl(int fd, int cmd, abi_ulong arg)
>      case TARGET_F_GETLEASE:
>      case TARGET_F_SETPIPE_SZ:
>      case TARGET_F_GETPIPE_SZ:
> +    case TARGET_F_ADD_SEALS:
> +    case TARGET_F_GET_SEALS:
>          ret = get_errno(safe_fcntl(fd, host_cmd, arg));
>          break;
>  
> diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
> index 3c261cff0e..70df1a94fb 100644
> --- a/linux-user/syscall_defs.h
> +++ b/linux-user/syscall_defs.h
> @@ -2292,12 +2292,14 @@ struct target_statfs64 {
>  #endif
>  
>  #define TARGET_F_LINUX_SPECIFIC_BASE 1024
> -#define TARGET_F_SETLEASE (TARGET_F_LINUX_SPECIFIC_BASE + 0)
> -#define TARGET_F_GETLEASE (TARGET_F_LINUX_SPECIFIC_BASE + 1)
> -#define TARGET_F_DUPFD_CLOEXEC (TARGET_F_LINUX_SPECIFIC_BASE + 6)
> -#define TARGET_F_SETPIPE_SZ (TARGET_F_LINUX_SPECIFIC_BASE + 7)
> -#define TARGET_F_GETPIPE_SZ (TARGET_F_LINUX_SPECIFIC_BASE + 8)
> -#define TARGET_F_NOTIFY  (TARGET_F_LINUX_SPECIFIC_BASE+2)
> +#define TARGET_F_SETLEASE            (TARGET_F_LINUX_SPECIFIC_BASE + 0)
> +#define TARGET_F_GETLEASE            (TARGET_F_LINUX_SPECIFIC_BASE + 1)
> +#define TARGET_F_DUPFD_CLOEXEC       (TARGET_F_LINUX_SPECIFIC_BASE + 6)
> +#define TARGET_F_NOTIFY              (TARGET_F_LINUX_SPECIFIC_BASE + 2)
> +#define TARGET_F_SETPIPE_SZ          (TARGET_F_LINUX_SPECIFIC_BASE + 7)
> +#define TARGET_F_GETPIPE_SZ          (TARGET_F_LINUX_SPECIFIC_BASE + 8)
> +#define TARGET_F_ADD_SEALS           (TARGET_F_LINUX_SPECIFIC_BASE + 9)
> +#define TARGET_F_GET_SEALS           (TARGET_F_LINUX_SPECIFIC_BASE + 10)
>  
>  #include "target_fcntl.h"
>  
> 

Looks good. Please also update print_fcntl() in linux-user/strace.c

Thanks,
Laurent


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/6] linux-user: add missing UDP and IPv6 get/setsockopt options
  2020-07-23  0:19 ` [PATCH 2/6] linux-user: add missing UDP and IPv6 get/setsockopt options Shu-Chun Weng
@ 2020-08-07 11:54   ` Laurent Vivier
  0 siblings, 0 replies; 11+ messages in thread
From: Laurent Vivier @ 2020-08-07 11:54 UTC (permalink / raw)
  To: Shu-Chun Weng; +Cc: qemu-devel

Le 23/07/2020 à 02:19, Shu-Chun Weng a écrit :
> UDP: SOL_UDP manipulate options at UDP level. All six options currently
> defined in linux source include/uapi/linux/udp.h take integer values.
> 
> IPv6: IPV6_ADDR_PREFERENCES (RFC5014: Source address selection) was not
> supported.
> 
> Signed-off-by: Shu-Chun Weng <scw@google.com>
> ---
>  linux-user/syscall.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)

This might be really clearer if you split this patch in two: one to add
SOL_UDP and one to add IPV6_ADDR_PREFERENCES.

Also update do_print_sockopt() in linux-use/strace.c

With that done, you can add my "Reviewed-by: Laurent Vivier
<laurent@vivier.eu>" to both.

Thanks,
Laurent



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/6] linux-user: Update SO_TIMESTAMP to SO_TIMESTAMP_OLD/NEW
  2020-07-23  0:19 ` [PATCH 3/6] linux-user: Update SO_TIMESTAMP to SO_TIMESTAMP_OLD/NEW Shu-Chun Weng
@ 2020-08-07 12:07   ` Laurent Vivier
  0 siblings, 0 replies; 11+ messages in thread
From: Laurent Vivier @ 2020-08-07 12:07 UTC (permalink / raw)
  To: Shu-Chun Weng, QEMU Developers

Le 23/07/2020 à 02:19, Shu-Chun Weng a écrit :
> Both guest options map to host SO_TIMESTAMP while keeping a bit in
> fd_trans to remember if the guest expects the old or the new format.

I don't think we need to keep this information for each fd.

Once a program has used the _NEW version it will always use the _NEW
version. It's possible to mix, but I don't think we have to support
this. This adds too much complexity.

> Added a multiarch test to verify.
> 
> Signed-off-by: Shu-Chun Weng <scw@google.com>
> ---
>  linux-user/alpha/sockbits.h            |   8 +-
>  linux-user/fd-trans.h                  |  41 +++-
>  linux-user/generic/sockbits.h          |   9 +-
>  linux-user/hppa/sockbits.h             |   8 +-
>  linux-user/mips/sockbits.h             |   8 +-
>  linux-user/sparc/sockbits.h            |   8 +-
>  linux-user/strace.c                    |   7 +-(fd)
>  linux-user/syscall.c                   |  69 ++++--
>  tests/tcg/multiarch/socket_timestamp.c | 292 +++++++++++++++++++++++++
>  9 files changed, 419 insertions(+), 31 deletions(-)
>  create mode 100644 tests/tcg/multiarch/socket_timestamp.c
> 

Thanks,
Laurent


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2020-08-07 12:08 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-23  0:19 [PATCH 0/6] fcntl, sockopt, and ioctl options Shu-Chun Weng
2020-07-23  0:19 ` [PATCH 1/6] linux-user: Support F_ADD_SEALS and F_GET_SEALS fcntls Shu-Chun Weng
2020-08-07 11:44   ` Laurent Vivier
2020-07-23  0:19 ` [PATCH 2/6] linux-user: add missing UDP and IPv6 get/setsockopt options Shu-Chun Weng
2020-08-07 11:54   ` Laurent Vivier
2020-07-23  0:19 ` [PATCH 3/6] linux-user: Update SO_TIMESTAMP to SO_TIMESTAMP_OLD/NEW Shu-Chun Weng
2020-08-07 12:07   ` Laurent Vivier
2020-07-23  0:19 ` [PATCH 4/6] linux-user: setsockopt() SO_TIMESTAMPNS and SO_TIMESTAMPING Shu-Chun Weng
2020-07-23  0:19 ` [PATCH 5/6] thunk: supports flexible arrays Shu-Chun Weng
2020-07-23  0:19 ` [PATCH 6/6] linux-user: Add support for SIOCETHTOOL ioctl Shu-Chun Weng
2020-08-05 23:22 ` [PATCH 0/6] fcntl, sockopt, and ioctl options Shu-Chun Weng

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.