Specifications | Code page information

C-H 3-3220-020: IBM Simplified Chinese Graphic Character Set, GBK or GB 18030 Code

Planes defined
CPGID 837, CPGID 1385, CPGID 1391
Repertoires defined
GCSGID 1084, GCSGID 1085, GCSGID 2081, GCSGID 2084, GCSGID 2085, GCSGID 2086, GCSGID 2087, GCSGID 2102, GCSGID 2103

Note that the core part of CPGID 837 was, and still is, charted in C-H 3-3220-130, not here; this document is essentially a supplement to C-H 3-3220-130. Significant portions of CPGID 1385 are likewise referenced to specific tables in that document, even though CPGID 1385 is not entirely a superset of CPGID 1380. Even among the additions, most are not charted, because they are defined as range-maps from Unicode, possibly with the exclusion of otherwise-included characters from the range. CPGID 1391 is not charted at all, perhaps understandably.
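
To make the idea of a range-map with exclusions concrete, here is a minimal sketch of how such an entry could be expanded into the list of Unicode code points it covers. The function name and the sample range are hypothetical and chosen only for illustration; they are not taken from the specification, and the sketch makes no claim about which code positions the covered characters are assigned to.

```python
def expand_range_map(first, last, excluded=()):
    """Return the Unicode code points covered by one range-map entry.

    first, last: inclusive bounds of the Unicode range.
    excluded: code points inside the range that the entry leaves out.
    """
    skip = set(excluded)
    return [cp for cp in range(first, last + 1) if cp not in skip]


# Hypothetical entry (not from the specification):
# the range U+3400..U+3405 with U+3402 excluded.
covered = expand_range_map(0x3400, 0x3405, excluded=[0x3402])
print([f"U+{cp:04X}" for cp in covered])
```

A definition of this shape needs only the range bounds and the exclusion list, which would explain why such additions can be described without a full chart.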

The 1997-02 version defined only CPGIDs 837 and 1385 (and GCSGIDs 1084 and 1085), and supported the GBK repertoire. The 2001-03 version defined GCSGIDs 2084 and 2085 for the 2-byte and 4-byte components of GB 18030, as well as GCSGID 2086 for an expanded CPGID 837 repertoire supporting Uyghur (Perso-Arabic), Tibetan, Mongolian and Yi characters, along with the entirety of CJK Extension A. The 2001-12 version defined GCSGIDs 2087 and 2102, adding positional presentation forms for the Perso-Arabic letters and completing coverage of the GB 18030 vertical presentation forms for punctuation.

While not defined by any available version of the specification itself, GCSGID 2103 is attested in AFP-related contexts as referring to the GCSGID 2102 set minus user-defined characters, and GCSGID 2081 as referring to the maximal DBCS-Host set from the 1997-02 version of C-H 3-3220-020.

C-H 3-3220-020 2001-12

C-H 3-3220-020 2001-03

C-H 3-3220-020 1997-02

Reconstructed machine-readable definitions and/or other relevant files

C-H 3-3220-024: IBM Japanese Graphic Character Set, Kanji

Planes defined
CPGID 300, CPGID 301, CPGID 65280, CPGID 65281, CPGID 65282
Repertoires defined
GCSGID 370, GCSGID 540, GCSGID 571, GCSGID 572, GCSGID 574, GCSGID 575, GCSGID 576, GCSGID 577, GCSGID 1000, GCSGID 1001, GCSGID 1002, GCSGID 1003, GCSGID 2053, GCSGID 2059, GCSGID 2092, GCSGID 2093

The 2002-11 version expanded the set, adding GCSGIDs 2092, 2093 and 2059 and expanding CPGID 300 to encode them. Note that while the 2002-11 revision includes Unicode mappings, these do not always agree with C-H 3-3220-126 2016-04 in terms of Unicode mappings for GCGIDs. They are better understood as a convention for mapping Japanese double-byte EBCDIC (IBM, not JEF or KEIS) to Unicode and also to GCGIDs, not for mapping GCGIDs to Unicode per se.

In AFP contexts, the private-use CPGIDs 65280, 65281 and 65282 are assigned to duplicates of CPGID 300, seemingly to signal the use of different reference glyph standards (although the code page data is, besides the CPGID number itself, identical for all of them).
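
As a rough illustration of the aliasing described above, a tool that reads AFP resources might fold the private-use CPGIDs back onto CPGID 300 before looking up code page data. The table and function names below are hypothetical; only the CPGID numbers come from the text.

```python
# Private-use CPGIDs that, per the text above, duplicate CPGID 300 in AFP contexts.
AFP_PRIVATE_USE_ALIASES = {
    65280: 300,
    65281: 300,
    65282: 300,
}


def canonical_cpgid(cpgid):
    """Return the CPGID whose code page data should be used for `cpgid`.

    CPGIDs without an entry in the alias table are returned unchanged.
    """
    return AFP_PRIVATE_USE_ALIASES.get(cpgid, cpgid)


print(canonical_cpgid(65281))
```

Since the private-use number seems to signal a reference glyph standard, a real tool would probably keep the original CPGID alongside the canonical one rather than discard it.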

C-H 3-3220-024 2002-11

C-H 3-3220-024 1999-03

Reconstructed machine-readable definitions and/or other relevant files

C-H 3-3220-030: IBM Korean Graphic Character Set [11172 syllables]

Planes defined
CPGID 834, CPGID 1362
Repertoires defined
GCSGID 1093, GCSGID 1094, GCSGID 1097, GCSGID 1098, GCSGID 2063

While not defined by the specification itself, GCSGID 2063 is attested in AFP-related contexts as referring to the maximal set of CPGID 834 as defined in C-H 3-3220-030.

C-H 3-3220-030 1999-04

Reconstructed machine-readable definitions and/or other relevant files

C-H 3-3220-050: REGISTRY, Graphic Character Sets and Code Pages

See planes (CPGID), repertoires (GCSGID).

C-H 3-3220-055: Graphic Character Identification System, Graphic Character Global Identifier (GCGID) Structure

Information on this infrastructure has its own page: Information relevant to C-H 3-3220-055

C-H 3-3220-125: IBM Korean Graphic Character Set [2350 or 2620 syllables]

Planes defined
CPGID 834, CPGID 926, CPGID 951, CPGID 951 (ICU), CPGID 65283
Repertoires defined
GCSGID 934, GCSGID 1010, GCSGID 1011, GCSGID 1012, GCSGID 1013, GCSGID 1014, GCSGID 1015, GCSGID 1050, GCSGID 1051, GCSGID 1052, GCSGID 1053, GCSGID 1054, GCSGID 1055, GCSGID 1056, GCSGID 1057

The full CPGID 834 is defined in C-H 3-3220-030; this specification defines only certain subsets of CPGID 834, used in e.g. CCSID 933. In AFP contexts, the subset of CPGID 834 defined by this specification is assigned the private-use CPGID 65283.

CPGID 926 is referenced to the (unavailable) 1989-10 version. The only publicly available source for CPGID 926 I know of is an ICU mapping for CCSID 944, which was removed from ICU two decades ago.

C-H 3-3220-125 1992-09

Reconstructed machine-readable definitions and/or other relevant files

C-H 3-3220-126: IBM Traditional Chinese Graphic Character Set

Planes defined
CPGID 835, CPGID 927
Repertoires defined
GCSGID 935, GCSGID 1030, GCSGID 1031, GCSGID 1032, GCSGID 1033, GCSGID 1034, GCSGID 1035, GCSGID 2074, GCSGID 2079, GCSGID 2131

The 2016-04 version documented GCSGID 2079 (the maximal set of the 1999-04 version), added the larger GCSGID 2131, expanded CPGID 835 to encode it, and dropped CPGID 927. Note that while the 2016-04 revision includes Unicode mappings, these do not always agree with C-H 3-3220-024 2002-11 in terms of Unicode mappings for GCGIDs. They are better understood as a convention for mapping Traditional Chinese double-byte EBCDIC to Unicode and also to GCGIDs, not for mapping GCGIDs to Unicode per se.
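
The disagreement between the two documents' Unicode mappings can be checked mechanically once both are available as GCGID-to-Unicode tables. The following is a minimal sketch under that assumption; the function name is hypothetical, and the sample identifiers and code points are invented placeholders, not real GCGIDs or mappings from either specification.

```python
def mapping_disagreements(table_a, table_b):
    """Return {gcgid: (ucs_a, ucs_b)} for GCGIDs the two tables map differently.

    Only GCGIDs present in both tables are compared.
    """
    shared = table_a.keys() & table_b.keys()
    return {
        gcgid: (table_a[gcgid], table_b[gcgid])
        for gcgid in sorted(shared)
        if table_a[gcgid] != table_b[gcgid]
    }


# Placeholder data only: identifiers and code points are invented for illustration.
from_126 = {"EXAMPLE1": 0x4E00, "EXAMPLE2": 0x5000, "EXAMPLE3": 0x6000}
from_024 = {"EXAMPLE1": 0x4E00, "EXAMPLE2": 0x5001, "EXAMPLE4": 0x7000}

conflicts = mapping_disagreements(from_126, from_024)
for gcgid, (a, b) in conflicts.items():
    print(f"{gcgid}: U+{a:04X} vs U+{b:04X}")
```

Any GCGID reported here is one where neither document's mapping can be taken as the mapping for that GCGID in general, which is consistent with reading each table as a convention for its own encoding.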

C-H 3-3220-126 2016-04

C-H 3-3220-126 1999-04

Reconstructed machine-readable definitions and/or other relevant files

C-H 3-3220-127: IBM Japanese Graphic Character Set for Extended UNIX Code (EUC)

Planes defined
CPGID 952, CPGID 953
Repertoires defined
GCSGID 1058, GCSGID 1059, GCSGID 1060, GCSGID 1061, GCSGID 1062, GCSGID 2096

The 2009-12 revision extends CPGID 952 to add NEC row 13, thus creating the new GCSGID 2096. Note that in either case, only the portions of the respective CPGIDs used in CCSID 5050, not CCSID 954, are charted; i.e. much of JIS X 0212 is not explicitly charted.

C-H 3-3220-127 2009-12

C-H 3-3220-127 1993-03

Reconstructed machine-readable definitions and/or other relevant files

C-H 3-3220-128: IBM Korean Graphic Character Set for Extended UNIX Code (EUC)

Planes defined
CPGID 971
Repertoires defined
GCSGID 1051, GCSGID 1056

C-H 3-3220-128 1993-11

Reconstructed machine-readable definitions and/or other relevant files

C-H 3-3220-129: IBM Traditional Chinese Graphic Character Set for Extended UNIX Code (EUC)

Planes defined
CPGID 960, CPGID 961
Repertoires defined
GCSGID 1070, GCSGID 1071, GCSGID 1073

C-H 3-3220-129 1993-11

Reconstructed machine-readable definitions and/or other relevant files

C-H 3-3220-130: IBM Simplified Chinese Graphic Character Set

Planes defined
CPGID 837, CPGID 928, CPGID 1380, CPGID 65284
Repertoires defined
GCSGID 937, GCSGID 1020, GCSGID 1021, GCSGID 1022, GCSGID 1023, GCSGID 1024, GCSGID 1025, GCSGID 1080

CPGID 928 is referenced to the (unavailable) 1992-11 version, although at least one ICU mapping containing it does exist, albeit outside of the main source tree. It seems to have been a superset of the encoding described by Mark Leisher's SHIFTGB.TXT, which was "Generated from an algorithm provided with some older Chinese packages." in March 1998.

Furthermore, the charts for CPGID 837 are split between C-H 3-3220-130 and C-H 3-3220-020, with C-H 3-3220-130 charting CPGID 837 as of 1993, and any additions being charted (or in some cases merely described) in C-H 3-3220-020. In AFP contexts, the subset of CPGID 837 defined by this specification alone (possibly with the addition of the Euro sign at 0x45EB) is assigned the private-use CPGID 65284.

C-H 3-3220-130 1993-11

Reconstructed machine-readable definitions and/or other relevant files

C-H 3-3220-131: IBM Traditional Chinese Graphic Character Set for IBM BIG-5 Code

Planes defined
CPGID 947
Repertoires defined
GCSGID 935, GCSGID 1030, GCSGID 1075, GCSGID 1076, GCSGID 1077, GCSGID 1077 (common subset)

C-H 3-3220-131 1999-04

Reconstructed machine-readable definitions and/or other relevant files

C-H 3-3220-132: IBM Simplified Chinese Graphic Character Set for Extended UNIX Code (EUC)

Planes defined
CPGID 1382
Repertoires defined
GCSGID 1080, GCSGID 1081

C-H 3-3220-132 1994-06

Reconstructed machine-readable definitions and/or other relevant files

C-H 3-3220-133: IBM Japanese Graphic Character Set, Kanji, for Open Environment

Planes defined
CPGID 941, CPGID 1351
Repertoires defined
GCSGID 370, GCSGID 1000, GCSGID 1061, GCSGID 1066

C-H 3-3220-133 1996-08

Reconstructed machine-readable definitions and/or other relevant files

Other DBCS code pages

Planes defined
CPGID 955, CPGID 963, CPGID 1372, CPGID 1374, CPGID 1376, CPGID 1393
Repertoires defined
GCSGID 1020 (Euro), GCSGID 1021 (AFP), GCSGID 1063, GCSGID 1064, GCSGID 2110, GCSGID 2111, GCSGID 2112, GCSGID 2113, GCSGID 2114, GCSGID 2115

CPGIDs 1372, 1374 and 1376 are Traditional Chinese (Big5 Microsoft, Big5 HKSCS and EBCDIC HKSCS respectively); specification documents are not publicly available for these, which isn't an issue for the well-documented Big5 planes. The layout of the EBCDIC HKSCS plane is known because the ICU project has, outside of the main source tree, mappings for CCSID 9569 ("IBM-1377 2006 version") and CCSID 13665 ("IBM-1377 2008 version"). GCSGIDs 2110 through 2115 are associated with HKSCS.

CPGID 955 (with GCSGID 1063), GCSGID 1064, and CPGID 1393 are Japanese. CPGID 963 is supplemental for Traditional Chinese. GCSGIDs 1020-ALT and 1021-ALT are used in AFP Simplified Chinese contexts.