From mboxrd@z Thu Jan 1 00:00:00 1970 From: Torsten =?utf-8?q?B=C3=B6gershausen?= Subject: [PATCH] Unicode: update of combining code points Date: Mon, 7 Apr 2014 21:38:33 +0200 Message-ID: <201404072138.34379.tboegi@web.de> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: tboegi@web.de To: git@vger.kernel.org X-From: git-owner@vger.kernel.org Mon Apr 07 21:40:46 2014 Return-path: Envelope-to: gcvg-git-2@plane.gmane.org Received: from vger.kernel.org ([209.132.180.67]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1WXFPR-0001vS-Tt for gcvg-git-2@plane.gmane.org; Mon, 07 Apr 2014 21:40:46 +0200 Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755316AbaDGTkm convert rfc822-to-quoted-printable (ORCPT ); Mon, 7 Apr 2014 15:40:42 -0400 Received: from mout.web.de ([212.227.17.11]:54032 "EHLO mout.web.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755201AbaDGTkl convert rfc822-to-8bit (ORCPT ); Mon, 7 Apr 2014 15:40:41 -0400 Received: from appes.localnet ([78.72.74.102]) by smtp.web.de (mrweb001) with ESMTPSA (Nemesis) id 0M6mgu-1Wu7RT3f1A-00wYnW; Mon, 07 Apr 2014 21:40:40 +0200 X-Provags-ID: V03:K0:hKqpdH+5dbr+TlVgE6pPh1TQeHUj+tbqwf/fbQMDrkQc9d14ZHC P8HYri90laH5IqYqku5t9YUhwVl49eHT4HvCseKU62pUyKSdOtbmfIWH1t3YXgJSg1NVA9L 1VCKiMDgbr+htiKO16P6rQsRueTr9f6DX+QZjkKiy9JaysYXN4QQJRAWbg1A7iqibqjGXnj UACcdrMALC87+v7Vj841w== Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Archived-At: Unicode 6.3 defines the following code as combining or accents, git_wcwidth() should return 0. Earlier unicode standards had defined these code point as "reserved": 358 COMBINING DOT ABOVE RIGHT 359 COMBINING ASTERISK BELOW 35A COMBINING DOUBLE RING BELOW 35B COMBINING ZIGZAG ABOVE 35C COMBINING DOUBLE BREVE BELOW 487 COMBINING CYRILLIC POKRYTIE 5A2 HEBREW ACCENT ATNAH HAFUKH, 5BA HEBREW POINT HOLAM HASER FOR VAV 5C5 HEBREW MARK LOWER DOT 5C7 HEBREW POINT QAMATS QATAN 604 ARABIC SIGN SAMVAT 616 ARABIC SMALL HIGH LIGATURE ALEF WITH LAM WITH YEH 617 ARABIC SMALL HIGH ZAIN 618 ARABIC SMALL FATHA 619 ARABIC SMALL DAMMA 61A ARABIC SMALL KASRA 659 ARABIC ZWARAKAY 65A ARABIC VOWEL SIGN SMALL V ABOVE 65B ARABIC VOWEL SIGN INVERTED SMALL V ABOVE 65C ARABIC VOWEL SIGN DOT BELOW 65D ARABIC REVERSED DAMMA 65E ARABIC FATHA WITH TWO DOTS 65F ARABIC WAVY HAMZA BELOW This commit touches only the range 300-6FF, there may be more to be upd= ated. Signed-off-by: Torsten B=C3=B6gershausen --- utf8.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/utf8.c b/utf8.c index a831d50..77c28d4 100644 --- a/utf8.c +++ b/utf8.c @@ -84,11 +84,10 @@ static int git_wcwidth(ucs_char_t ch) * "uniset +cat=3DMe +cat=3DMn +cat=3DCf -00AD +1160-11FF +200B c". */ static const struct interval combining[] =3D { - { 0x0300, 0x0357 }, { 0x035D, 0x036F }, { 0x0483, 0x0486 }, - { 0x0488, 0x0489 }, { 0x0591, 0x05A1 }, { 0x05A3, 0x05B9 }, - { 0x05BB, 0x05BD }, { 0x05BF, 0x05BF }, { 0x05C1, 0x05C2 }, - { 0x05C4, 0x05C4 }, { 0x0600, 0x0603 }, { 0x0610, 0x0615 }, - { 0x064B, 0x0658 }, { 0x0670, 0x0670 }, { 0x06D6, 0x06E4 }, + { 0x0300, 0x036F }, { 0x0483, 0x0489 }, { 0x0591, 0x05BD }, + { 0x05BF, 0x05BF }, { 0x05C1, 0x05C2 }, { 0x05C4, 0x05C5 }, + { 0x05C7, 0x05C7 }, { 0x0600, 0x0604 }, { 0x0610, 0x061A }, + { 0x064B, 0x065F }, { 0x0670, 0x0670 }, { 0x06D6, 0x06E4 }, { 0x06E7, 0x06E8 }, { 0x06EA, 0x06ED }, { 0x070F, 0x070F }, { 0x0711, 0x0711 }, { 0x0730, 0x074A }, { 0x07A6, 0x07B0 }, { 0x0901, 0x0902 }, { 0x093C, 0x093C }, { 0x0941, 0x0948 }, --=20 1.9.0