* [Bug 60807] New: not all the pages are encoded using utf-8
From: bugzilla-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r @ 2013-08-28 13:38 UTC (permalink / raw)
  To: linux-man-u79uwXL29TY76Z2rM5mHXA

https://bugzilla.kernel.org/show_bug.cgi?id=60807

            Bug ID: 60807
           Summary: not all the pages are encoded using utf-8
           Product: Documentation
           Version: unspecified
          Hardware: All
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P1
         Component: man-pages
          Assignee: documentation_man-pages-ztI5WcYan/vQLgFONoPN62D2FQJk+8+b@public.gmane.org
          Reporter: cs.wzpan-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
        Regression: No

I found that not all of the pages are encoded in UTF-8. This may cause problems
when we try to parse them.

The affected files are:

* man3/fflush.3
* man3/toupper.3
* man3/updwtmp.3
* man3/encrypt.3
* man3/lockf.3
* man3/rand.3
* man3/fclose.3
* man3/strtok.3
* man2/close.2
* man2/getdomainname.2
* man2/madvise.2
* man2/umask.2
* man2/sysinfo.2
* man2/getrlimit.2
* man5/utmp.5
* man7/cp1251.7
* man7/iso_8859-2.7
* man7/armscii-8.7
* man7/suffixes.7
* man7/iso_8859-4.7
* man7/iso_8859-8.7
* man7/iso_8859-16.7
* man7/hier.7
* man7/iso_8859-13.7
* man7/koi8-u.7
* man7/environ.7
* man7/iso_8859-15.7
* man7/iso_8859-9.7
* man7/iso_8859-11.7
* man7/iso_8859-14.7
* man7/iso_8859-10.7
* man7/iso_8859-6.7
* man7/iso_8859-1.7
* man7/iso_8859-7.7
* man7/koi8-r.7
* man7/iso_8859-5.7
* man7/iso_8859-3.7

-- 
You are receiving this mail because:
You are watching the assignee of the bug.
--
To unsubscribe from this list: send the line "unsubscribe linux-man" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* [Bug 60807] not all the pages are encoded using utf-8
From: bugzilla-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r @ 2013-12-05 17:43 UTC (permalink / raw)
  To: linux-man-u79uwXL29TY76Z2rM5mHXA

https://bugzilla.kernel.org/show_bug.cgi?id=60807

--- Comment #1 from Peter Schiffer <pschiffe-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> ---
Created attachment 117641
  --> https://bugzilla.kernel.org/attachment.cgi?id=117641&action=edit
print_encoding.sh

Script which can find man pages not in us-ascii.
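
The attached script itself is not reproduced in this thread. As a rough
illustration only, the kind of check it describes could be sketched in Python
as below. The function name `classify_encoding` and the three labels it
returns are assumptions made for this sketch; they are not taken from
print_encoding.sh.

```python
def classify_encoding(data: bytes) -> str:
    """Classify raw page bytes as 'us-ascii', 'utf-8', or 'unknown-8bit'.

    Hypothetical helper for illustration; not the logic of the attachment.
    """
    try:
        data.decode("ascii")
        return "us-ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Some legacy 8-bit encoding; the bytes alone cannot say which one.
        return "unknown-8bit"
```

A page would be checked by reading it in binary mode, for example
`classify_encoding(open("man2/close.2", "rb").read())`.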


* [Bug 60807] not all the pages are encoded using utf-8
From: bugzilla-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r @ 2013-12-05 17:44 UTC (permalink / raw)
  To: linux-man-u79uwXL29TY76Z2rM5mHXA

https://bugzilla.kernel.org/show_bug.cgi?id=60807

--- Comment #2 from Peter Schiffer <pschiffe-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> ---
Created attachment 117651
  --> https://bugzilla.kernel.org/attachment.cgi?id=117651&action=edit
convert_to_utf_8.sh

Script which can convert non us-ascii man pages to utf-8.
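
The conversion step can be pictured with a minimal Python sketch, assuming
the source encoding of each page is already known. The function name
`convert_to_utf8` is hypothetical, and the sketch relies on Python's built-in
codecs rather than on whatever tool convert_to_utf_8.sh actually calls.

```python
def convert_to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Re-encode page bytes from source_encoding to UTF-8.

    Assumption: source_encoding is a codec name known to Python,
    such as "iso-8859-1" or "koi8-r".
    """
    # decode() raises UnicodeDecodeError if the bytes are invalid
    # in the claimed source encoding.
    text = data.decode(source_encoding)
    return text.encode("utf-8")
```

Decoding with the wrong source encoding usually succeeds silently for 8-bit
encodings and yields wrong characters, so the source encoding has to be right
before conversion.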


* [Bug 60807] not all the pages are encoded using utf-8
From: bugzilla-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r @ 2013-12-05 17:46 UTC (permalink / raw)
  To: linux-man-u79uwXL29TY76Z2rM5mHXA

https://bugzilla.kernel.org/show_bug.cgi?id=60807

--- Comment #3 from Peter Schiffer <pschiffe-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> ---
$ ./print_encoding.sh man?/*

   Man Page               Encoding by file   Encoding by first line

 * man2/close.2           iso-8859-1         
 * man2/getdomainname.2   iso-8859-1         
 * man2/getrlimit.2       iso-8859-1         
 * man2/madvise.2         iso-8859-1         
 * man2/mount.2           utf-8              
 * man2/sysinfo.2         iso-8859-1         
 * man2/umask.2           iso-8859-1         
 * man3/encrypt.3         iso-8859-1         
 * man3/fclose.3          iso-8859-1         
 * man3/fflush.3          iso-8859-1         
 * man3/lockf.3           iso-8859-1         
 * man3/rand.3            iso-8859-1         
 * man3/strtok.3          iso-8859-1         
 * man3/toupper.3         iso-8859-1         
 * man3/updwtmp.3         iso-8859-1         
 * man4/st.4              utf-8              
 * man5/utmp.5            iso-8859-1         
 * man7/armscii-8.7       iso-8859-1         ARMSCII-8
 * man7/cp1251.7          unknown-8bit       CP1251
 * man7/environ.7         iso-8859-1         
 * man7/hier.7            iso-8859-1         
 * man7/iso_8859-10.7     iso-8859-1         ISO-8859-10
 * man7/iso_8859-11.7     iso-8859-1         ISO-8859-11
 * man7/iso_8859-13.7     iso-8859-1         ISO-8859-7
 * man7/iso_8859-14.7     iso-8859-1         ISO-8859-14
 * man7/iso_8859-15.7     iso-8859-1         ISO-8859-15
 * man7/iso_8859-16.7     iso-8859-1         ISO-8859-16
 * man7/iso_8859-1.7      iso-8859-1         
 * man7/iso_8859-2.7      iso-8859-1         ISO-8859-2
 * man7/iso_8859-3.7      iso-8859-1         ISO-8859-3
 * man7/iso_8859-4.7      iso-8859-1         ISO-8859-4
 * man7/iso_8859-5.7      iso-8859-1         ISO-8859-5
 * man7/iso_8859-6.7      iso-8859-1         ISO-8859-6
 * man7/iso_8859-7.7      iso-8859-1         ISO-8859-7
 * man7/iso_8859-8.7      iso-8859-1         ISO-8859-8
 * man7/iso_8859-9.7      iso-8859-1         ISO-8859-9
 * man7/koi8-r.7          unknown-8bit       KOI8-R
 * man7/koi8-u.7          unknown-8bit       
 * man7/suffixes.7        iso-8859-1         

$ ./convert_to_utf_8.sh tmp_encoded man?/*
Converting man2/close.2            from iso-8859-1
Converting man2/getdomainname.2    from iso-8859-1
Converting man2/getrlimit.2        from iso-8859-1
Converting man2/madvise.2          from iso-8859-1
Converting man2/mount.2            from utf-8
Converting man2/sysinfo.2          from iso-8859-1
Converting man2/umask.2            from iso-8859-1
Converting man3/encrypt.3          from iso-8859-1
Converting man3/fclose.3           from iso-8859-1
Converting man3/fflush.3           from iso-8859-1
Converting man3/lockf.3            from iso-8859-1
Converting man3/rand.3             from iso-8859-1
Converting man3/strtok.3           from iso-8859-1
Converting man3/toupper.3          from iso-8859-1
Converting man3/updwtmp.3          from iso-8859-1
Converting man4/st.4               from utf-8
Converting man5/utmp.5             from iso-8859-1
Converting man7/armscii-8.7        from armscii-8
Converting man7/cp1251.7           from cp1251
Converting man7/environ.7          from iso-8859-1
Converting man7/hier.7             from iso-8859-1
Converting man7/iso_8859-10.7      from iso_8859-10
Converting man7/iso_8859-11.7      from iso-8859-1
Converting man7/iso_8859-13.7      from iso-8859-1
Converting man7/iso_8859-14.7      from iso_8859-14
Converting man7/iso_8859-15.7      from iso_8859-15
Converting man7/iso_8859-16.7      from iso_8859-16
Converting man7/iso_8859-1.7       from iso_8859-1
Converting man7/iso_8859-2.7       from iso_8859-2
Converting man7/iso_8859-3.7       from iso_8859-3
Converting man7/iso_8859-4.7       from iso_8859-4
Converting man7/iso_8859-5.7       from iso_8859-5
Converting man7/iso_8859-6.7       from iso_8859-6
Converting man7/iso_8859-7.7       from iso_8859-7
Converting man7/iso_8859-8.7       from iso_8859-8
Converting man7/iso_8859-9.7       from iso_8859-9
Converting man7/koi8-r.7           from koi8-r
Converting man7/koi8-u.7           from koi8-u
Converting man7/suffixes.7         from iso-8859-1

$ cd tmp_encoded/

$ ../print_encoding.sh man?/*

   Man Page               Encoding by file   Encoding by first line

 * man2/close.2           utf-8              UTF-8
 * man2/getdomainname.2   utf-8              UTF-8
 * man2/getrlimit.2       utf-8              UTF-8
 * man2/madvise.2         utf-8              UTF-8
 * man2/mount.2           utf-8              UTF-8
 * man2/sysinfo.2         utf-8              UTF-8
 * man2/umask.2           utf-8              UTF-8
 * man3/encrypt.3         utf-8              UTF-8
 * man3/fclose.3          utf-8              UTF-8
 * man3/fflush.3          utf-8              UTF-8
 * man3/lockf.3           utf-8              UTF-8
 * man3/rand.3            utf-8              UTF-8
 * man3/strtok.3          utf-8              UTF-8
 * man3/toupper.3         utf-8              UTF-8
 * man3/updwtmp.3         utf-8              UTF-8
 * man4/st.4              utf-8              UTF-8
 * man5/utmp.5            utf-8              UTF-8
 * man7/armscii-8.7       utf-8              UTF-8
 * man7/cp1251.7          utf-8              UTF-8
 * man7/environ.7         utf-8              UTF-8
 * man7/hier.7            utf-8              UTF-8
 * man7/iso_8859-10.7     utf-8              UTF-8
 * man7/iso_8859-11.7     utf-8              UTF-8
 * man7/iso_8859-13.7     utf-8              UTF-8
 * man7/iso_8859-14.7     utf-8              UTF-8
 * man7/iso_8859-15.7     utf-8              UTF-8
 * man7/iso_8859-16.7     utf-8              UTF-8
 * man7/iso_8859-1.7      utf-8              UTF-8
 * man7/iso_8859-2.7      utf-8              UTF-8
 * man7/iso_8859-3.7      utf-8              UTF-8
 * man7/iso_8859-4.7      utf-8              UTF-8
 * man7/iso_8859-5.7      utf-8              UTF-8
 * man7/iso_8859-6.7      utf-8              UTF-8
 * man7/iso_8859-7.7      utf-8              UTF-8
 * man7/iso_8859-8.7      utf-8              UTF-8
 * man7/iso_8859-9.7      utf-8              UTF-8
 * man7/koi8-r.7          utf-8              UTF-8
 * man7/koi8-u.7          utf-8              UTF-8
 * man7/suffixes.7        utf-8              UTF-8
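
The "Encoding by first line" column presumably reflects a coding marker in
the first line of each page, of the form `'\" t -*- coding: UTF-8 -*-`. A
minimal Python sketch of reading such a marker follows; the function name
`declared_encoding` and the regular expression are assumptions made for
illustration, not the script's actual logic.

```python
import re
from typing import Optional

# Matches an Emacs-style marker such as: -*- coding: UTF-8 -*-
_CODING_RE = re.compile(r"-\*-.*?coding:\s*([A-Za-z0-9._-]+).*?-\*-")


def declared_encoding(first_line: str) -> Optional[str]:
    """Return the encoding named in a page's first line, or None."""
    # Assumption: only roff comment lines ('\" or .\") carry the marker.
    if not first_line.startswith(("'\\\"", ".\\\"")):
        return None
    match = _CODING_RE.search(first_line)
    return match.group(1) if match else None
```

A page without a marker gives `None`, which would correspond to an empty
cell in that column.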


* [Bug 60807] not all the pages are encoded using utf-8
From: bugzilla-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r @ 2014-02-14 10:22 UTC (permalink / raw)
  To: linux-man-u79uwXL29TY76Z2rM5mHXA

https://bugzilla.kernel.org/show_bug.cgi?id=60807

Michael Kerrisk <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org

--- Comment #4 from Michael Kerrisk <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> ---
(In reply to Peter Schiffer from comment #3)

Peter,

Sorry to be slow following up on this. Thanks for the scripts.

As some background, I'll just note that the current encoding markers in the
iso_8859* pages were added in response to this 2009 bug report:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=519209

It seems a reasonable idea to convert everything to UTF-8, but I have some
concerns/questions.

1. Is the encoding line: 
'\" t -*- coding: UTF-8 -*-
really needed, or does modern groff just work this out?

2. I'm concerned about backward compatibility issues. As in: what if someone
loads the man pages onto a system with an old groff? As far as I can work
out, groff added Unicode input support in v1.20, in 2009
(http://lists.gnu.org/archive/html/groff/2009-01/msg00011.html). So perhaps
that's long enough ago that we don't need to worry too much about these issues.

Any thoughts?


* [Bug 60807] not all the pages are encoded using utf-8
From: bugzilla-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r @ 2014-02-14 12:47 UTC (permalink / raw)
  To: linux-man-u79uwXL29TY76Z2rM5mHXA

https://bugzilla.kernel.org/show_bug.cgi?id=60807

--- Comment #5 from Peter Schiffer <pschiffe-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> ---
Hi Michael,

1. It looks like it works without the encoding line, but as Colin said in the
email, it's better with it.

2. Colin has already answered this well. I'll just add that we have been
converting man pages to UTF-8 since before the first RHEL 6 release, when Fedora
was still using man 1.6 rather than man-db. In general, we convert almost
everything that is not already UTF-8 and contains special characters; otherwise
it usually causes trouble.

peter


* [Bug 60807] not all the pages are encoded using utf-8
From: bugzilla-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r @ 2014-02-16  6:34 UTC (permalink / raw)
  To: linux-man-u79uwXL29TY76Z2rM5mHXA

https://bugzilla.kernel.org/show_bug.cgi?id=60807

--- Comment #6 from Michael Kerrisk <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> ---
For reference, here is the discussion thread on linux-man@:

http://thread.gmane.org/gmane.linux.man/5069
Subject: Converting man-pages to UTF-8
Date: 2014-02-14 10:43:30 UTC


* [Bug 60807] not all the pages are encoded using utf-8
From: bugzilla-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r @ 2014-02-16  7:44 UTC (permalink / raw)
  To: linux-man-u79uwXL29TY76Z2rM5mHXA

https://bugzilla.kernel.org/show_bug.cgi?id=60807

Michael Kerrisk <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |CODE_FIX

--- Comment #7 from Michael Kerrisk <mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> ---
(In reply to Peter Schiffer from comment #5)
> Hi Michael,
> 
> 1. It looks like it works without the encoding line, but as Colin said in
> the email, it's better with it.
> 
> 2. Colin has already answered this well. I'll just add that we have been
> converting man pages to UTF-8 since before the first RHEL 6 release, when
> Fedora was still using man 1.6 rather than man-db. In general, we convert
> almost everything that is not already UTF-8 and contains special characters;
> otherwise it usually causes trouble.

Peter,

I've applied your scripts, with the slight (manual) tweak that the "coding:"
line is added only to pages with UTF-8 in the source.
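
The tweak was applied by hand. Purely as an illustration, and reading "pages
with UTF-8 in the source" as "pages that contain non-ASCII characters" (an
interpretation, not something the message states), the decision could be
sketched in Python like this. The function name `needs_coding_line` is
hypothetical.

```python
def needs_coding_line(data: bytes) -> bool:
    """True if the page is valid UTF-8 and contains non-ASCII characters."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        # Not UTF-8 at all; such a page would need converting first.
        return False
    # Pure-ASCII pages read the same with or without a coding line.
    return not text.isascii()
```

Under this reading, a page that is plain ASCII after conversion would get no
"coding:" line.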

I've also checked your scripts into the scripts/ directory in the man-pages Git
repo. Maybe they'll come in useful someday or for someone else. Thanks for
doing the legwork on this issue.

Cheers,

Michael


* [Bug 60807] not all the pages are encoded using utf-8
From: bugzilla-daemon-590EEB7GvNiWaY/ihj7yzEB+6BGkLq7r @ 2014-02-18 15:42 UTC (permalink / raw)
  To: linux-man-u79uwXL29TY76Z2rM5mHXA

https://bugzilla.kernel.org/show_bug.cgi?id=60807

--- Comment #8 from Weizhou Pan <cs.wzpan-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> ---
Looks awesome. Thanks for your work!
