* [PATCH] lseek.2: SYNOPSIS: Use correct types
From: Alejandro Colomar @ 2020-11-21 17:30 UTC
To: mtk.manpages; +Cc: Alejandro Colomar, linux-man, linux-kernel

The Linux kernel uses 'unsigned int' instead of 'int'
for 'fd' and 'whence'.
As glibc provides no wrapper, use the same types the kernel uses.

src/linux$ grep -rn "SYSCALL_DEFINE.*lseek"
fs/read_write.c:322:SYSCALL_DEFINE3(lseek, unsigned int, fd, off_t, offset, unsigned int, whence)
fs/read_write.c:328:COMPAT_SYSCALL_DEFINE3(lseek, unsigned int, fd, compat_off_t, offset, unsigned int, whence)
fs/read_write.c:336:SYSCALL_DEFINE5(llseek, unsigned int, fd, unsigned long, offset_high,
arch/mips/kernel/linux32.c:65:SYSCALL_DEFINE5(32_llseek, unsigned int, fd, unsigned int, offset_high,

src/linux$ sed -n 322,325p fs/read_write.c
SYSCALL_DEFINE3(lseek, unsigned int, fd, off_t, offset, unsigned int, whence)
{
	return ksys_lseek(fd, offset, whence);
}

Signed-off-by: Alejandro Colomar <alx.manpages@gmail.com>
---
 man2/lseek.2 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/man2/lseek.2 b/man2/lseek.2
index e35e410a6..2ff878ffa 100644
--- a/man2/lseek.2
+++ b/man2/lseek.2
@@ -51,7 +51,7 @@ lseek \- reposition read/write file offset
 .br
 .B #include <unistd.h>
 .PP
-.BI "off_t lseek(int " fd ", off_t " offset ", int " whence );
+.BI "off_t lseek(unsigned int " fd ", off_t " offset ", unsigned int " whence );
 .SH DESCRIPTION
 .BR lseek ()
 repositions the file offset of the open file description
-- 
2.29.2

* Re: [PATCH] lseek.2: SYNOPSIS: Use correct types
From: Alejandro Colomar (man-pages) @ 2020-11-21 17:45 UTC
To: mtk.manpages; +Cc: linux-man, linux-kernel

Hi Michael,

I'm a bit lost in all the *lseek* pages.

You had a good read some months ago, so you may know it better.
I don't know which of those functions come from the kernel,
and which come from glibc (if any).
In the kernel I only found the lseek, llseek, and 32_llseek
(as you can see in the patch).
So if any other prototype needs to be updated, please do so.
Especially, have a look at lseek64(3),
which I suspect needs the same changes I propose in that patch.

Thanks,

Alex

On 11/21/20 6:30 PM, Alejandro Colomar wrote:
> The Linux kernel uses 'unsigned int' instead of 'int'
> for 'fd' and 'whence'.
> As glibc provides no wrapper, use the same types the kernel uses.
>
> src/linux$ grep -rn "SYSCALL_DEFINE.*lseek"
> fs/read_write.c:322:SYSCALL_DEFINE3(lseek, unsigned int, fd, off_t, offset, unsigned int, whence)
> fs/read_write.c:328:COMPAT_SYSCALL_DEFINE3(lseek, unsigned int, fd, compat_off_t, offset, unsigned int, whence)
> fs/read_write.c:336:SYSCALL_DEFINE5(llseek, unsigned int, fd, unsigned long, offset_high,
> arch/mips/kernel/linux32.c:65:SYSCALL_DEFINE5(32_llseek, unsigned int, fd, unsigned int, offset_high,
>
> src/linux$ sed -n 322,325p fs/read_write.c
> SYSCALL_DEFINE3(lseek, unsigned int, fd, off_t, offset, unsigned int, whence)
> {
> 	return ksys_lseek(fd, offset, whence);
> }
>
> Signed-off-by: Alejandro Colomar <alx.manpages@gmail.com>
> ---
>  man2/lseek.2 | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/man2/lseek.2 b/man2/lseek.2
> index e35e410a6..2ff878ffa 100644
> --- a/man2/lseek.2
> +++ b/man2/lseek.2
> @@ -51,7 +51,7 @@ lseek \- reposition read/write file offset
>  .br
>  .B #include <unistd.h>
>  .PP
> -.BI "off_t lseek(int " fd ", off_t " offset ", int " whence );
> +.BI "off_t lseek(unsigned int " fd ", off_t " offset ", unsigned int " whence );
>  .SH DESCRIPTION
>  .BR lseek ()
>  repositions the file offset of the open file description
>

* Re: [PATCH] lseek.2: SYNOPSIS: Use correct types
From: Michael Kerrisk (man-pages) @ 2020-11-22 22:37 UTC
To: Alejandro Colomar (man-pages); +Cc: linux-man, lkml, libc-alpha, Florian Weimer

Hi Alex,

On Sat, 21 Nov 2020 at 18:45, Alejandro Colomar (man-pages)
<alx.manpages@gmail.com> wrote:
>
> Hi Michael,
>
> I'm a bit lost in all the *lseek* pages.
>
> You had a good read some months ago, so you may know it better.
> I don't know which of those functions come from the kernel,
> and which come from glibc (if any).

It always takes me too long to remind myself of the details here :-(.
This time, I'll try to write what I (re)learned.

Inside the kernel (5.9 sources), in fs/read_write.c, we have:

[[
SYSCALL_DEFINE3(lseek, unsigned int, fd, off_t, offset, unsigned int, whence)
{
        return ksys_lseek(fd, offset, whence);
}

#ifdef CONFIG_COMPAT
COMPAT_SYSCALL_DEFINE3(lseek, unsigned int, fd, compat_off_t, offset, unsigned int, whence)
{
        return ksys_lseek(fd, offset, whence);
}
#endif

#if !defined(CONFIG_64BIT) || defined(CONFIG_COMPAT) || \
        defined(__ARCH_WANT_SYS_LLSEEK)
SYSCALL_DEFINE5(llseek, unsigned int, fd, unsigned long, offset_high,
                unsigned long, offset_low, loff_t __user *, result,
                unsigned int, whence)
{
        ...
}
#endif
]]

The main pieces of interest here are the first and last
SYSCALL_DEFINEn. The first is the "standard" lseek() system call that
exists on 64-bit and 32-bit architectures. The problem on 32-bit
architectures is that the off_t type is a 32-bit type, but files can
be bigger than 2GB (2**31-1). That's why 32-bit kernels also provide
the llseek() system call. It receives the new offset in two 32-bit
pieces (offset_high, offset_low), and returns the new offset via a
64-bit loff_t argument (result). (I forget the reason why there are
32-bit and 64-bit "offset" args in the syscall.)

One more thing...

In arch/x86/entry/syscalls/syscall_32.tbl, we see the following line:

[[
140 i386 _llseek sys_llseek
]]

This is essentially telling us that 'sys_llseek' (the name generated
by SYSCALL_DEFINE5(llseek...)) is exposed to user-space as system call
number 140, and that system call number will (IIUC) be exposed in
autogenerated headers with the name "__NR__llseek" (i.e., "_llseek").
The "i386" is telling us that this happens in i386 (32-bit Intel).
There is nothing equivalent on x86-64, because 64 bit systems don't
need an _llseek system call.

Now, in ancient times (let's say Linux 2.2), there was a more
transparent situation (but the effect was the same):

#define __NR__llseek 140

and that system call number was tied to the implementation by this
definition

linux-2.2.26/arch/i386/kernel/entry.S:
        .long SYMBOL_NAME(sys_llseek) /* 140 */

==

lseek64() is a C library function. It takes and returns a 64-bit
offset. It exists to support seeking in large (>2GB) files. Its
implementation is in the glibc source file
sysdeps/unix/sysv/linux/lseek64.c, where it calls _llseek(2).

Returning to the <unistd.h> header file, we have:

[[
#ifndef __USE_FILE_OFFSET64
extern __off_t lseek (int __fd, __off_t __offset, int __whence) __THROW;
#else
# ifdef __REDIRECT_NTH
extern __off64_t __REDIRECT_NTH (lseek,
                                 (int __fd, __off64_t __offset, int __whence),
                                 lseek64);
# else
# define lseek lseek64
# endif
#endif
#ifdef __USE_LARGEFILE64
extern __off64_t lseek64 (int __fd, __off64_t __offset, int __whence)
     __THROW;
#endif
]]

The name "lseek64" is exposed if _LARGEFILE64_SOURCE (which triggers
__USE_LARGEFILE64) is defined. That name was part of the so-called
Transitional Large File Summit (LFS) API (see page 105 in my book),
which existed to support the use of 64-bit file offsets on 32 bit
systems. It provided a set of interfaces with names of the form
"xxxxx64()" (e.g., "lseek64") which provided for 64-bit offsets;
those names coexisted with the traditional 32-bit APIs (e.g.,
"lseek").

Alternatively, the LFS specified a macro, _FILE_OFFSET_BITS=64 (which
triggers __USE_FILE_OFFSET64) as another way of exposing
64-bit-offset functionality on 32 bit systems. In this case, the
traditional API names (e.g., "lseek") are redirected to the 64-bit
implementations (e.g., "lseek64").

> In the kernel I only found the lseek, llseek, and 32_llseek

I'd ignore 32_llseek -- I guess that's an arch-specific equivalent of
_llseek/llseek.

> (as you can see in the patch).
> So if any other prototype needs to be updated, please do so.
> Especially, have a look at lseek64(3),
> which I suspect needs the same changes I propose in that patch.

I think that no changes to the types are needed in lseek64(3). But
maybe some of the info in this mail should be captured in that manual
page.

Thanks,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

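To make the offset_high/offset_low description in the message above
concrete, here is a minimal C sketch of how a 64-bit offset can be
split into the two 32-bit pieces that llseek receives, and how the two
pieces recombine into the full offset. The helper names split_offset()
and join_offset() are hypothetical, chosen only for illustration; they
are not kernel or glibc functions.

```c
#include <stdint.h>

/* Hypothetical helper: split a 64-bit file offset into the two 32-bit
 * halves that the llseek system call takes as offset_high and
 * offset_low. */
void split_offset(uint64_t offset, uint32_t *high, uint32_t *low)
{
    *high = (uint32_t) (offset >> 32);          /* upper 32 bits */
    *low  = (uint32_t) (offset & 0xffffffffu);  /* lower 32 bits */
}

/* Hypothetical helper: recombine the two halves into one 64-bit
 * offset. */
uint64_t join_offset(uint32_t high, uint32_t low)
{
    return ((uint64_t) high << 32) | low;
}
```

For example, an offset of 5 GiB (5368709120 bytes) splits into
offset_high = 1 and offset_low = 0x40000000, while any offset below
4 GiB has offset_high = 0.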
* Re: [PATCH] lseek.2: SYNOPSIS: Use correct types
From: Florian Weimer @ 2020-11-22 12:43 UTC
To: Alejandro Colomar; +Cc: mtk.manpages, linux-man, linux-kernel

* Alejandro Colomar:

> The Linux kernel uses 'unsigned int' instead of 'int' for 'fd' and
> 'whence'. As glibc provides no wrapper, use the same types the
> kernel uses.

lseek is a POSIX interface, and glibc provides it. POSIX uses int for
file descriptors (and the whence parameter in case of lseek).

The llseek system call is a different matter, that's indeed
Linux-specific.

* Re: [PATCH] lseek.2: SYNOPSIS: Use correct types
From: Alejandro Colomar (man-pages) @ 2020-11-22 13:14 UTC
To: Florian Weimer; +Cc: mtk.manpages, linux-man, linux-kernel

Hi Florian,

On 11/22/20 1:43 PM, Florian Weimer wrote:
> * Alejandro Colomar:
>
>> The Linux kernel uses 'unsigned int' instead of 'int' for 'fd' and
>> 'whence'. As glibc provides no wrapper, use the same types the
>> kernel uses.
>
> lseek is a POSIX interface, and glibc provides it. POSIX uses int for
> file descriptors (and the whence parameter in case of lseek).
>
> The llseek system call is a different matter, that's indeed
> Linux-specific.
>

Ahhh, true. So many similar functions... :p

Thanks,

Alex

* Re: [PATCH] lseek.2: SYNOPSIS: Use correct types
From: Michael Kerrisk (man-pages) @ 2020-11-22 22:32 UTC
To: Alejandro Colomar; +Cc: linux-man, lkml, libc-alpha, Florian Weimer

[Adding libc-alpha@ here, so someone might correct me if I make a misstep]

Hello Alex,

On Sat, 21 Nov 2020 at 18:34, Alejandro Colomar <alx.manpages@gmail.com> wrote:
>
> The Linux kernel uses 'unsigned int' instead of 'int'
> for 'fd' and 'whence'.
> As glibc provides no wrapper, use the same types the kernel uses.

I see Florian already replied, but just to add a detail or two...

In general, the manual pages explicitly note the APIs that have no
glibc wrapper. (If not, that's a bug in the page, but I don't expect
there are many such bugs.)

Looking in <unistd.h>, we have:

[[
#ifndef __USE_FILE_OFFSET64
extern __off_t lseek (int __fd, __off_t __offset, int __whence) __THROW;
#else
# ifdef __REDIRECT_NTH
extern __off64_t __REDIRECT_NTH (lseek,
                                 (int __fd, __off64_t __offset, int __whence),
                                 lseek64);
# else
# define lseek lseek64
# endif
#endif
#ifdef __USE_LARGEFILE64
extern __off64_t lseek64 (int __fd, __off64_t __offset, int __whence)
     __THROW;
#endif
]]

It looks to me like there's a prototype hiding in there. (And yes, I
don't find it so funny to decode the macro logic either.)

Thanks,

Michael

PS By the way, be aware that the code of many wrapper functions is
autogenerated from "syscalls.list" files in the glibc source, for
example, sysdeps/unix/sysv/linux/syscalls.list. This isn't the case
for lseek(), though, as far as I can see; I think the wrapper
function is defined in sysdeps/unix/sysv/linux/lseek.c.

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

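As an illustration of the point made in the replies above, the sketch
below seeks once through the glibc lseek() wrapper, which has the POSIX
prototype with 'int fd' and 'int whence', and once through the raw
system call via syscall(2), where the kernel interprets 'fd' and
'whence' as 'unsigned int'. It is only a sketch under stated
assumptions: a 64-bit Linux system, where off_t fits in a long and
SYS_lseek is defined. The function names are hypothetical.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Seek through the glibc wrapper (POSIX prototype: int fd, int whence). */
off_t seek_via_wrapper(int fd, off_t offset)
{
    return lseek(fd, offset, SEEK_SET);
}

/* Seek through the raw system call. The kernel side declares fd and
 * whence as unsigned int. Assumption: 64-bit Linux, so off_t fits in
 * a long. */
off_t seek_via_raw_syscall(int fd, off_t offset)
{
    return (off_t) syscall(SYS_lseek, fd, (long) offset, SEEK_SET);
}

/* Hypothetical check: returns 1 if both paths report the requested
 * offset for the file at 'path', 0 if they do not, and -1 if the file
 * cannot be opened. */
int wrapper_and_syscall_agree(const char *path, off_t offset)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    off_t a = seek_via_wrapper(fd, offset);
    off_t b = seek_via_raw_syscall(fd, offset);
    close(fd);

    return a == offset && b == offset;
}
```

From the caller's point of view the two paths behave the same for
valid descriptors, and an invalid descriptor such as -1 fails with
-1 on both, which is why the man page SYNOPSIS can keep documenting
the POSIX types of the wrapper.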