Han Xin code explained

Han Xin code (汉信码 in Chinese, also known as Chinese-sensible code) is a two-dimensional (2D) matrix barcode symbology invented in 2007[1] by the Chinese company The Article Numbering Center of China[2] (中国物品编码中心 in Chinese) to break the monopoly of QR code. Like QR code, Han Xin code consists of black squares and white square spaces arranged in a square grid on a white background. It has four finder patterns and other markers that allow camera-based readers to recognize it. Han Xin code uses Reed–Solomon error correction, which makes it possible to read damaged images. It is currently published as ISO/IEC 20830:2021.[3]

The main advantage (and design requirement) compared to QR code is the built-in ability to natively encode Chinese characters, where QR code natively encodes Japanese characters. In its largest version, 84 (189×189 modules),[4] Han Xin code can encode 7827 numeric characters, 4350 English text characters, 3261 bytes, or 1044–2174 Chinese characters (depending on the Unicode region). Han Xin code encodes the full ISO/IEC 646 Latin character set rather than the restricted set of Latin characters supported by QR code. This makes Han Xin code more suitable for encoding English text or GS1 Application Identifiers[5] data.

Additionally, Han Xin code can encode Unicode characters from other languages with a special Unicode mode,[3] which has built-in lossless compression for the UTF-8 character set, and it supports Extended Channel Interpretation. Han Xin code also has a special compaction mode for URI encoding, which reduces the size of barcodes that encode links to web pages.

History and standards

During China's 10th Five-Year Plan, The Article Numbering Center of China (中国物品编码中心 in Chinese) started research[6] on its own replacement for QR code to remove the Japanese monopoly in 2D barcodes. In 2007, the new barcode standard, now known as Han Xin code, was published as GB/T 21049-2007[1] under the name Chinese-sensible code.

In 2011,[7] the US-based Association for Automatic Identification and Mobility (AIM) released the ISS Han Xin Code symbology specification as an official encoding standard and published it in its own store.[8]

In 2015, ISO/IEC JTC 1/SC 31 started work[9] on Han Xin code as an international standard, which was published as ISO/IEC 20830:2021[3] in 2021.

In 2022, the Chinese-sensible code standard was revised as GB/T 21049-2022[10] and renamed Han Xin code to be consistent with the ISO standard.

A set of patents related to Han Xin code encoding and decoding is registered with the United States Patent and Trademark Office:

Application

Han Xin code can be used in the same way as QR code. Currently it is used mostly in China,[14] because it has a built-in ability to encode Chinese characters. However, most barcode printers[15] and barcode scanners[16] support Han Xin code. It can be scanned on iOS[17] and Android[18] mobile devices, and many barcode libraries[19][20] support reading and writing it.

Main advantages of Han Xin code are:

Barcode design

Han Xin code represents data with black and white square modules, where a dark module is a binary one and a light module is a binary zero. Han Xin code can also be encoded in inverse colors,[3] but this option is disabled by default in many barcode readers. The modules are arranged in a square region ranging from 23 × 23 modules (Version 1) to 189 × 189 modules (Version 84). Like QR code, Han Xin code has no rectangular versions such as those of DataMatrix, which restricts its use in some cases. The size of a Han Xin code version can be calculated with the following formula:

Size=23+(Version-1)*2
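
The formula can be checked against the sizes quoted in the text with a short sketch. The function name `symbol_size` is chosen here for illustration and is not taken from the standard.

```python
def symbol_size(version: int) -> int:
    """Side length, in modules, of a Han Xin code symbol of the given version."""
    if not 1 <= version <= 84:
        raise ValueError("Han Xin code versions range from 1 to 84")
    return 23 + (version - 1) * 2


# Versions 1, 22 and 84 give the 23, 65 and 189 module sizes quoted in the text.
for version in (1, 22, 84):
    side = symbol_size(version)
    print(f"Version {version}: {side} x {side} modules")
```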

Han Xin code symbol is constructed from the following elements:[3]

Finder pattern

The Finder Pattern[3] consists of four Position Detection Patterns located at the four corners of the barcode. Each Position Detection Pattern is 7 × 7 modules in size and is built from five elements: a dark 7 × 7 square, a light 6 × 6 square, a dark 5 × 5 square, a light 4 × 4 square, and a dark 3 × 3 square.

The scanning ratio of each Position Detection Pattern is 1:1:1:1:3 or 3:1:1:1:1, depending on the scanning direction. The orientation of the four patterns allows the location and orientation of the barcode to be detected unambiguously.
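
The relation between the five squares and the scanning ratio can be illustrated with a small sketch. It assumes that the squares are nested and share one corner, since that arrangement produces the 1:1:1:1:3 ratio. The choice of the bottom-right corner and the helper names are illustrative only, and the rotation used in each corner of a real symbol is not modelled.

```python
def position_detection_pattern() -> list[list[int]]:
    """Build a 7x7 pattern (1 = dark, 0 = light) from five nested squares.

    Assumption: all squares share the bottom-right corner of the 7x7 area.
    """
    size = 7
    grid = [[0] * size for _ in range(size)]
    # Paint the squares from largest to smallest, alternating dark and light.
    for side, colour in zip((7, 6, 5, 4, 3), (1, 0, 1, 0, 1)):
        for row in range(size - side, size):
            for col in range(size - side, size):
                grid[row][col] = colour
    return grid


def run_lengths(line: list[int]) -> list[int]:
    """Lengths of consecutive runs of equal modules along a scan line."""
    runs: list[int] = []
    previous = None
    for module in line:
        if module == previous:
            runs[-1] += 1
        else:
            runs.append(1)
            previous = module
    return runs


pattern = position_detection_pattern()
for row in pattern:
    print("".join("#" if module else "." for module in row))
print(run_lengths(pattern[6]))        # scan through the 3x3 square
print(run_lengths(pattern[6][::-1]))  # same line, opposite direction
```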

Every Position Detection Pattern has a separator[3] with the Structural Information Region aligned to it.

Alignment pattern

Alignment Patterns[3] are added to Han Xin code starting from Version 4 (Versions 1–3 do not have them) and are used to locate module positions more precisely in distorted barcodes. Alignment Patterns in Han Xin code are split into:

The Alignment Pattern is made up of a dark line and an adjacent light line on its lower side, each one module wide. The Assistant Alignment Pattern consists of 5 light modules and 1 dark module, and its dark module marks the edge of a region block.

Structural information

The Han Xin code Structural Information Region[3] is a one-module-wide region surrounding the four Position Detection Patterns. Han Xin code has two identical Structural Information arrays, each made of 34 data modules. Each array is split into groups of 17 modules, which are placed around the Position Detection Patterns.

Structural Information Region encodes the following data:[3]

Metadata bits 0–11 are split into three 4-bit tetrads (m2, m1, m0) and supplemented with four error correction tetrads (r3, r2, r1, r0).

Data masking

To bring the ratio of dark to light modules in the symbol close to 1:1, a masking algorithm[3] is used. The masking sequence is applied to the Data Region with the XOR operation. The Finder Pattern, Alignment Patterns and Structural Information Regions are excluded from masking. The following table shows the mask pattern algorithms; the reference of the chosen pattern is stored in the Structural Information Region.

Han Xin Code masking pattern algorithm
Condition of masking solution | Data mask pattern reference
Non-masking | 00
(i + j) mod 2 = 0 | 01
((i + j) mod 3 + (j mod 3)) mod 2 = 0 | 10
(i mod j + j mod i + i mod 3 + j mod 3) mod 2 = 0 | 11
i - row index in the symbol.
j - column index in the symbol.
Both i and j start from 1, so (1,1) is the top left corner module of the symbol. When the masking solution condition is true, the resulting mask bit is 1.
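
The four mask conditions can be written directly as a function of the 1-based row and column indices. The sketch below is illustrative: the `is_data` predicate, which tells data modules apart from function patterns, is a hypothetical placeholder and not part of the standard.

```python
from typing import Callable


def mask_bit(reference: int, i: int, j: int) -> int:
    """Mask bit for row i and column j (both 1-based) under a mask reference 0b00..0b11."""
    if reference == 0b00:
        return 0  # non-masking
    if reference == 0b01:
        return int((i + j) % 2 == 0)
    if reference == 0b10:
        return int(((i + j) % 3 + (j % 3)) % 2 == 0)
    if reference == 0b11:
        return int((i % j + j % i + i % 3 + j % 3) % 2 == 0)
    raise ValueError("mask reference must be in the range 0..3")


def apply_mask(modules: list[list[int]], reference: int,
               is_data: Callable[[int, int], bool]) -> list[list[int]]:
    """XOR the mask into data modules only; other modules are left unchanged."""
    return [
        [bit ^ mask_bit(reference, r + 1, c + 1) if is_data(r + 1, c + 1) else bit
         for c, bit in enumerate(row)]
        for r, row in enumerate(modules)
    ]


# Applying the same mask twice restores the original modules, because XOR is its own inverse.
grid = [[1, 0, 1], [0, 0, 1], [1, 1, 0]]
everything_is_data = lambda i, j: True  # hypothetical: treat every module as data
masked = apply_mask(grid, 0b11, everything_is_data)
restored = apply_mask(masked, 0b11, everything_is_data)
print(restored == grid)
```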

Error correction

Han Xin code uses Reed–Solomon error correction. Encoded data is represented as an array of bytes (8 bits each). The data array is divided into blocks,[3] and a sequence of error correction codewords is generated for each block and appended to the end of that block. After this, all blocks are merged sequentially into a byte stream.

The polynomial arithmetic for Han Xin Code uses the finite field generator polynomial x^8 + x^6 + x^5 + x + 1 (355 or 101100011b)[3] with initial root = 1.
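
The field arithmetic can be sketched as follows. This is a generic Reed–Solomon encoder over GF(2^8) built on the value 355 (101100011b). It assumes that "initial root = 1" means the generator polynomial has roots α^1, α^2, ..., α^n, where α = 2. The function names are illustrative, and the block sizes used in the demonstration are arbitrary rather than taken from the standard's tables.

```python
FIELD_POLY = 0b101100011  # 355: x^8 + x^6 + x^5 + x + 1

# Exponent and logarithm tables for GF(2^8), using the element 2 as generator.
EXP = [0] * 510
LOG = [0] * 256
value = 1
for power in range(255):
    EXP[power] = value
    LOG[value] = power
    value <<= 1
    if value & 0x100:
        value ^= FIELD_POLY
for power in range(255, 510):
    EXP[power] = EXP[power - 255]  # duplicated so LOG[a] + LOG[b] never needs a modulo


def gf_mul(a: int, b: int) -> int:
    """Multiply two field elements."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def generator_poly(n_ec: int) -> list[int]:
    """Coefficients, highest degree first, of (x - a^1)(x - a^2)...(x - a^n_ec)."""
    gen = [1]
    for k in range(1, n_ec + 1):
        nxt = gen + [0]  # gen * x
        for idx, coeff in enumerate(gen):
            nxt[idx + 1] ^= gf_mul(coeff, EXP[k])  # plus gen * a^k
        gen = nxt
    return gen


def rs_encode(data: list[int], n_ec: int) -> list[int]:
    """Error correction codewords: remainder of data * x^n_ec divided by the generator."""
    gen = generator_poly(n_ec)
    remainder = [0] * n_ec
    for byte in data:
        factor = byte ^ remainder[0]
        remainder = remainder[1:] + [0]
        for idx in range(n_ec):
            remainder[idx] ^= gf_mul(gen[idx + 1], factor)
    return remainder


def poly_eval(poly: list[int], x: int) -> int:
    """Evaluate a polynomial (highest degree first) at x using Horner's rule."""
    acc = 0
    for coeff in poly:
        acc = gf_mul(acc, x) ^ coeff
    return acc


data = [0x48, 0x61, 0x6E]          # arbitrary sample bytes
codeword = data + rs_encode(data, 4)
# A valid codeword evaluates to zero at every root of the generator polynomial.
print([poly_eval(codeword, EXP[k]) for k in range(1, 5)])
```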

The number of error correction codewords depends on the symbol version and the error correction level and ranges from 16% to 60% of the codewords, which allows from 8% to 30% damage to be corrected.[3]

Han Xin Code error correction levels features
Error correction level | Recovery capacity % (approximation) | Encoding of error correction level
L1 | 8% | 00
L2 | 15% | 01
L3 | 23% | 10
L4 | 30% | 11

Data region

Han Xin code data is encoded as a byte array. The data byte array is split into error correction blocks, to which error correction codewords (bytes) are added. The error correction blocks are then joined into one codeword array:[3]

(Encoded byte array) => (Error correction block 1) + ... + (Error correction block N) => (Codewords array)

As an example, consider Han Xin code version 5 with error correction level L4. It has 27 data codewords and 2 error correction blocks, whose sizes in data codewords and error correction codewords are (14, 20) and (13, 22):

(D1...D14, D15...D27) => (D1...D14, E1.1...1.20) + (D15...D27, E2.1...2.22) => (D1...D14, E1.1...1.20, D15...D27, E2.1...2.22) => (C1...C69)
D(x) - data codewords.
E(b.x) - error correction codeword, where b is the block number and x is the position in the block.
C(x) - resulting codewords.

In the next step, the resulting codeword array C(x) is split into blocks of 13 bytes, and the codewords at the same position in each block are gathered together to form a new codeword array. The result is a byte array of the same size, interleaved with a stride of 13.

(C1...C13, C14...C26, Cn...Cn+12) => (C1, C14, Cn...C13, C26, Cn+12) => (CM1...CMn+12)
CM(x) - codeword (byte) array interleaved with a stride of 13.
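
The two steps above can be sketched with labels in place of real bytes, using the version 5, level L4 example. The error correction codewords are placeholder strings rather than computed values, and the function names are illustrative.

```python
def build_stream(blocks: list[tuple[list[str], list[str]]]) -> list[str]:
    """Append each block's error correction codewords to its data and join the blocks."""
    stream: list[str] = []
    for data, error_correction in blocks:
        stream += data + error_correction
    return stream


def interleave_by_13(codewords: list[str]) -> list[str]:
    """Take positions 1, 14, 27, ..., then 2, 15, 28, ..., and so on."""
    return [
        codewords[index]
        for start in range(13)
        for index in range(start, len(codewords), 13)
    ]


data = [f"D{n}" for n in range(1, 28)]  # 27 data codewords
blocks = [
    (data[:14], [f"E1.{n}" for n in range(1, 21)]),  # block 1: 14 data + 20 EC
    (data[14:], [f"E2.{n}" for n in range(1, 23)]),  # block 2: 13 data + 22 EC
]
stream = build_stream(blocks)       # C1...C69
mixed = interleave_by_13(stream)    # CM1...CM69
print(len(stream), mixed[:7])
```

Because the stride is applied to the whole stream, the output has the same length as the input and contains every codeword exactly once.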

After the operations above, the resulting codewords are placed into the data region row by row, from left to right and from top to bottom. As a result, damage along a horizontal line affects fewer codewords, while damage along a vertical line affects more codewords.

Encoding

Han Xin code can encode 7827 numeric characters, 4350 English text characters, 3261 bytes, or 1044–2174 Chinese characters in its largest version, 84.[3] It also supports special Unicode and industrial modes. All modes can be mixed to obtain the most compact encoding of the data. The following table shows how much data can be encoded with different barcode versions and error correction levels.

Han Xin Code versions and information capacity
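Version | Size | Error correction level | Data codewords | Error correction codewords | Numeric | Text | Bytes | Chinese characters
1 | 23×23 | L1 | 21 | 4 | 45 | 26 | 18 | 6–12
1 | 23×23 | L4 | 9 | 16 | 15 | 10 | 6 | 2–4
...
22 | 65×65 | L1 | 354 | 68 | 843 | 470 | 351 | 113–234
22 | 65×65 | L4 | 168 | 254 | 399 | 222 | 165 | 53–110
...
84 | 189×189 | L1 | 3264 | 622 | 7827 | 4350 | 3261 | 1044–2174
84 | 189×189 | L4 | 1554 | 2332 | 3723 | 2070 | 1551 | 497–1034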
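
The numeric, text and byte columns appear to follow from the number of data codewords and the per-mode overhead given in the mode descriptions: a 4-bit mode indicator, a 10-bit numeric terminator, a 6-bit text terminator and a 13-bit byte counter. The sketch below is an observation checked only against the six rows shown here, not a formula quoted from the standard, and it does not cover the Chinese character column.

```python
def capacities(data_codewords: int) -> tuple[int, int, int]:
    """Estimated (numeric, text, byte) capacity for a number of 8-bit data codewords."""
    bits = data_codewords * 8
    numeric = 3 * ((bits - 4 - 10) // 10)  # indicator + terminator, 10 bits per 3 digits
    text = (bits - 4 - 6) // 6             # indicator + terminator, 6 bits per character
    binary = (bits - 4 - 13) // 8          # indicator + 13-bit counter, 8 bits per byte
    return numeric, text, binary


for codewords in (21, 9, 354, 168, 3264, 1554):
    print(codewords, capacities(codewords))
```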

Encoding modes

All encoding modes can be split into the following groups:[3]

Han Xin Code mode characteristics
Mode | Mode indicator | Bits per character
Numeric | 0001b | 3.3 (10 bits for three digits)
Text | 0010b | 6
Binary Byte | 0011b | 8
Common Chinese Characters in Region One | 0100b | 12
Common Chinese Characters in Region Two | 0101b | 12
GB18030 2-byte Region | 0110b | 15
GB18030 4-byte Region | 0111b | 21
ECI | 1000b | Variable (multi-bytes mode)
Unicode | 1001b | Adaptive (lossless compression)
GS1 | 11100001b | Variable (Numeric + Text modes)
URI | 11100010b | Variable (2–7 bits per character)

Numeric mode

In Numeric mode,[3] the input data string is divided into groups of three digits (the last group may have fewer than three), and each group is encoded in 10 bits (0000000000b to 1111100111b). The mode data is prefixed with the mode indicator 0001b and ends with a mode terminator, which also indicates the number of digits in the last group.

Han Xin Code numeric mode terminators
Numeric characters in last group | Mode terminator
1 | 1111111101b
2 | 1111111110b
3 | 1111111111b

As an example, the digit sequence 12700402 is encoded as follows:
Prefix => 0001b
127 => 0001111111
004 => 0000000100
02 => 0000000010
Terminator => 1111111110b
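
The worked example can be reproduced with a minimal sketch of the Numeric mode rules described above. The function returns the bit fields as separate strings for readability; the name `encode_numeric` is illustrative.

```python
NUMERIC_INDICATOR = "0001"
NUMERIC_TERMINATORS = {1: "1111111101", 2: "1111111110", 3: "1111111111"}


def encode_numeric(digits: str) -> list[str]:
    """Return the bit fields for a digit string: indicator, 10-bit groups, terminator."""
    if not digits or not (digits.isascii() and digits.isdigit()):
        raise ValueError("Numeric mode accepts a non-empty string of digits 0-9")
    groups = [digits[i:i + 3] for i in range(0, len(digits), 3)]
    fields = [NUMERIC_INDICATOR]
    fields += [format(int(group), "010b") for group in groups]
    # The terminator also tells the decoder how many digits the last group holds.
    fields.append(NUMERIC_TERMINATORS[len(groups[-1])])
    return fields


print(encode_numeric("12700402"))
# ['0001', '0001111111', '0000000100', '0000000010', '1111111110']
```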

Text mode

Text mode encodes characters from the ISO/IEC 646 character set. Each character is represented by 6 bits.[3] The characters are divided into two subsets: the Text1 sub-mode and the Text2 sub-mode. The value 111110b is used to switch between the text sub-modes, and 111111b is the mode terminator. Text mode starts in the Text1 sub-mode.

Han Xin Code Text1 sub-mode
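Character | ASCII value | Encoding value | Character | ASCII value | Encoding value | Character | ASCII value | Encoding value
0 | 48 | 000000b | L | 76 | 010101b | g | 103 | 101010b
1 | 49 | 000001b | M | 77 | 010110b | h | 104 | 101011b
2 | 50 | 000010b | N | 78 | 010111b | i | 105 | 101100b
3 | 51 | 000011b | O | 79 | 011000b | j | 106 | 101101b
4 | 52 | 000100b | P | 80 | 011001b | k | 107 | 101110b
5 | 53 | 000101b | Q | 81 | 011010b | l | 108 | 101111b
6 | 54 | 000110b | R | 82 | 011011b | m | 109 | 110000b
7 | 55 | 000111b | S | 83 | 011100b | n | 110 | 110001b
8 | 56 | 001000b | T | 84 | 011101b | o | 111 | 110010b
9 | 57 | 001001b | U | 85 | 011110b | p | 112 | 110011b
A | 65 | 001010b | V | 86 | 011111b | q | 113 | 110100b
B | 66 | 001011b | W | 87 | 100000b | r | 114 | 110101b
C | 67 | 001100b | X | 88 | 100001b | s | 115 | 110110b
D | 68 | 001101b | Y | 89 | 100010b | t | 116 | 110111b
E | 69 | 001110b | Z | 90 | 100011b | u | 117 | 111000b
F | 70 | 001111b | a | 97 | 100100b | v | 118 | 111001b
G | 71 | 010000b | b | 98 | 100101b | w | 119 | 111010b
H | 72 | 010001b | c | 99 | 100110b | x | 120 | 111011b
I | 73 | 010010b | d | 100 | 100111b | y | 121 | 111100b
J | 74 | 010011b | e | 101 | 101000b | z | 122 | 111101b
K | 75 | 010100b | f | 102 | 101001b | | |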
Han Xin Code Text2 sub-mode
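Character | ASCII value | Encoding value | Character | ASCII value | Encoding value | Character | ASCII value | Encoding value
NUL | 0 | 000000b | NAK | 21 | 010101b | . | 46 | 101010b
SOH | 1 | 000001b | SYN | 22 | 010110b | / | 47 | 101011b
STX | 2 | 000010b | ETB | 23 | 010111b | : | 58 | 101100b
ETX | 3 | 000011b | CAN | 24 | 011000b | ; | 59 | 101101b
EOT | 4 | 000100b | EM | 25 | 011001b | < | 60 | 101110b
ENQ | 5 | 000101b | SUB | 26 | 011010b | = | 61 | 101111b
ACK | 6 | 000110b | ESC | 27 | 011011b | > | 62 | 110000b
BEL | 7 | 000111b | SP | 32 | 011100b | ? | 63 | 110001b
BS | 8 | 001000b | ! | 33 | 011101b | @ | 64 | 110010b
HT | 9 | 001001b | " | 34 | 011110b | [ | 91 | 110011b
LF | 10 | 001010b | # | 35 | 011111b | \ | 92 | 110100b
VT | 11 | 001011b | $ | 36 | 100000b | ] | 93 | 110101b
FF | 12 | 001100b | % | 37 | 100001b | ^ | 94 | 110110b
CR | 13 | 001101b | & | 38 | 100010b | _ | 95 | 110111b
SO | 14 | 001110b | ' | 39 | 100011b | ` | 96 | 111000b
SI | 15 | 001111b | ( | 40 | 100100b | { | 123 | 111001b
DLE | 16 | 010000b | ) | 41 | 100101b | (vertical bar) | 124 | 111010b
DC1 | 17 | 010001b | * | 42 | 100110b | } | 125 | 111011b
DC2 | 18 | 010010b | + | 43 | 100111b | ~ | 126 | 111100b
DC3 | 19 | 010011b | , | 44 | 101000b | DEL | 127 | 111101b
DC4 | 20 | 010100b | - | 45 | 101001b | | |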
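
The two sub-mode tables can be generated from ASCII ranges, which makes a small Text mode encoder easy to sketch. The example assumes that the sub-mode switch is the 6-bit value 111110b (62), the one value left unused by both tables besides the terminator 111111b (63). The function name and the list-of-fields output are illustrative.

```python
import string

TEXT_INDICATOR = "0010"
SWITCH, TERMINATOR = 62, 63

# Text1: digits, upper-case letters, lower-case letters (values 0..61).
TEXT1 = list(string.digits + string.ascii_uppercase + string.ascii_lowercase)
# Text2: control characters and punctuation, in the order of the table (values 0..61).
TEXT2 = (
    [chr(c) for c in range(0, 28)]        # NUL .. ESC
    + [chr(c) for c in range(32, 48)]     # SP .. /
    + [chr(c) for c in range(58, 65)]     # : .. @
    + [chr(c) for c in range(91, 97)]     # [ .. `
    + [chr(c) for c in range(123, 128)]   # { .. DEL
)


def encode_text(data: str) -> list[str]:
    """Return the bit fields for Text mode: indicator, 6-bit values, terminator."""
    current, other = TEXT1, TEXT2  # Text mode starts in the Text1 sub-mode
    values: list[int] = []
    for char in data:
        if char not in current:
            if char not in other:
                raise ValueError(f"{char!r} cannot be encoded in Text mode")
            values.append(SWITCH)
            current, other = other, current
        values.append(current.index(char))
    values.append(TERMINATOR)
    return [TEXT_INDICATOR] + [format(value, "06b") for value in values]


print(encode_text("A1"))
print(encode_text("a.b"))
```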

Binary byte mode

Binary mode encodes an array of arbitrary bytes (values 0–255). Binary mode[3] consists of the binary mode indicator 0011b, a 13-bit byte counter, and the data bytes, each written as an 8-bit sequence. No mode terminator is required.
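
A minimal sketch of this layout is shown below. It returns the fields as separate bit strings, and the function name is illustrative.

```python
def encode_binary(data: bytes) -> list[str]:
    """Return the bit fields for Binary mode: indicator, 13-bit counter, 8-bit bytes."""
    if len(data) >= 1 << 13:
        raise ValueError("the byte count must fit in the 13-bit counter")
    fields = ["0011", format(len(data), "013b")]
    fields += [format(byte, "08b") for byte in data]
    return fields


print(encode_binary(b"Hi"))
```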

Chinese Characters modes

The Chinese Characters modes are a set of 4 modes that encode Chinese characters from the GB 18030 code page.

Han Xin Code Chinese Characters modes
Mode | Mode indicator | Bits | Encoding characters count | Description
Common Chinese Characters in Region One mode | 0100b | 12 | 4074 | Encodes characters from the GB 18030 regions where the first byte is in the range B0 to D7 and the second byte is in the range A1 to FE (3760 characters), where the first byte is in the range A1 to A3 and the second byte is in the range A1 to FE (282 characters), and in the range A8A1 to A8C0 (32 characters).
Common Chinese Characters in Region Two mode | 0101b | 12 | 3008 | Encodes characters from the GB 18030 region where the first byte is in the range D8 to F7 and the second byte is in the range A1 to FE (3008 characters).
GB18030 2-byte Region mode | 0110b | 15 | 23940 | Encodes characters from the GB 18030 region where the first byte is in the range 81 to FE and the second byte is in the range 40 to 7E or 80 to FE (23940 characters).
GB18030 4-byte Region mode | 0111b | 21 | 1587600 | Encodes characters from the GB 18030 region where the first byte is in the range 81 to FE, the second byte is in the range 30 to 39, the third byte is in the range 81 to FE, and the fourth byte is in the range 30 to 39 (1587600 characters).
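
The byte ranges in the table can be turned into a small classifier, and the character counts can be recomputed from the same ranges. The sketch assumes that the most compact applicable mode is preferred when ranges overlap, which is a design choice made here rather than a rule quoted from the standard. It does not compute the encoded value of a character.

```python
def chinese_mode(gb: bytes) -> str:
    """Name the Chinese Characters mode whose byte ranges contain a GB 18030 character."""
    if len(gb) == 2:
        b1, b2 = gb
        if 0xA1 <= b2 <= 0xFE:
            if 0xB0 <= b1 <= 0xD7 or 0xA1 <= b1 <= 0xA3:
                return "Region One"
            if b1 == 0xA8 and b2 <= 0xC0:
                return "Region One"
            if 0xD8 <= b1 <= 0xF7:
                return "Region Two"
        if 0x81 <= b1 <= 0xFE and (0x40 <= b2 <= 0x7E or 0x80 <= b2 <= 0xFE):
            return "GB18030 2-byte Region"
    elif len(gb) == 4:
        b1, b2, b3, b4 = gb
        if (0x81 <= b1 <= 0xFE and 0x30 <= b2 <= 0x39
                and 0x81 <= b3 <= 0xFE and 0x30 <= b4 <= 0x39):
            return "GB18030 4-byte Region"
    raise ValueError("not a GB 18030 byte sequence covered by the table")


def span(low: int, high: int) -> int:
    """Number of byte values in an inclusive range."""
    return high - low + 1


SECOND_BYTE = span(0xA1, 0xFE)  # 94 values
COUNTS = {
    "Region One": span(0xB0, 0xD7) * SECOND_BYTE + span(0xA1, 0xA3) * SECOND_BYTE + span(0xA1, 0xC0),
    "Region Two": span(0xD8, 0xF7) * SECOND_BYTE,
    "GB18030 2-byte Region": span(0x81, 0xFE) * (span(0x40, 0x7E) + span(0x80, 0xFE)),
    "GB18030 4-byte Region": (span(0x81, 0xFE) * span(0x30, 0x39)) ** 2,
}

print(COUNTS)
print(chinese_mode("中".encode("gb18030")))
```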

Unicode mode

Unicode mode[3] encodes the UTF-8 character set with built-in lossless compression. In Unicode mode, the input data is analysed with a self-adaptive algorithm. First, the input data is divided and combined into pre-encoding sub-sequences of 1-, 2-, 3-, or 4-byte patterns; second, a run-length data compression algorithm is applied to encode each sub-sequence of the input data.

In short, Unicode mode looks for character sub-pages in which all characters of the same language (Cyrillic, Greek, French, German and so on) share the same prefix byte sequence, and encodes only the differences from that prefix.

GS1 mode

Han Xin code GS1 mode[3] indicates that the represented data is defined by the GS1 General Specifications. GS1 mode encodes data in Numeric and Text modes. Other modes may be used, but GS1 mode must be the first mode in the symbol, and the decoded data must be returned with a GS1 flag. <FNC1> (if required) must be encoded as 1111101000b in Numeric mode (a Numeric mode group encodes only three digits, so 1111101000b, the value 1000, is treated as a special character). If an <FNC1> must be inserted while the encoder is in any mode other than Numeric, that mode must be terminated and Numeric mode must be started. The GS1 mode indicator is 11100001b and the GS1 mode terminator is 11111111b.

The data in GS1 mode is split into GS1 Application Identifier chunks and then compacted with the most suitable modes. As an example, the following data can be encoded:
(10)123456ABC<FNC1>(240)DATA

The data is encoded in the following way:
<11100001b> <Numeric 10123456> <Text ABC> <Numeric mode selector> <1111101000b> <Numeric 240> <Text DATA> <11111111b>

URI mode

Han Xin code URI mode[3] encodes URI links compactly. The URI mode indicator is 11100010b and the URI mode terminator is 111b. URI mode can encode data in three charsets, URI-A, URI-B and URI-C,[3] each with its own sub-mode terminator. URI mode can also encode %XX data in a special Percent-Encoding sub-mode, where three characters are encoded in 8 bits.

Han Xin Code URI submodes
Charset | Charset indicator
URI-A | 001b
URI-B | 010b
URI-C | 011b
Percent-Encoding | 100b
URI Mode Terminator | 111b

The Percent-Encoding sub-mode encodes %XX data as 8-bit sequences and does not require a terminator. To encode %XX data in this sub-mode, the sub-mode indicator (100b) is written first, followed by an 8-bit counter (counter = length of the %XX data / 3), and then the data itself, where each %XX triplet such as %FF, %ff or %00 is written as the corresponding byte (xFF or x00).
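
A minimal sketch of this sub-mode, based only on the description above, is shown below. The function name is illustrative, and the limit of 255 triplets is an assumption derived from the 8-bit counter.

```python
import re


def encode_percent_run(text: str) -> list[str]:
    """Return the bit fields for a run of %XX triplets: indicator, counter, bytes."""
    if not re.fullmatch(r"(?:%[0-9A-Fa-f]{2})+", text):
        raise ValueError("expected one or more %XX triplets")
    octets = [int(text[i + 1:i + 3], 16) for i in range(0, len(text), 3)]
    if len(octets) > 255:
        raise ValueError("the triplet count must fit in the 8-bit counter")
    fields = ["100", format(len(octets), "08b")]  # counter = len(text) / 3
    fields += [format(octet, "08b") for octet in octets]
    return fields


print(encode_percent_run("%FF%00"))
```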

Han Xin Code URI-A and URI-B charsets
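URI-A: Character / URI fragment | URI-A: Encoding value | URI-A: Encoding bits | URI-B: Character / URI fragment | URI-B: Encoding value | URI-B: Encoding bits
a | 0 | 000000 | A | 0 | 000000
b | 1 | 000001 | B | 1 | 000001
c | 2 | 000010 | C | 2 | 000010
d | 3 | 000011 | D | 3 | 000011
e | 4 | 000100 | E | 4 | 000100
f | 5 | 000101 | F | 5 | 000101
g | 6 | 000110 | G | 6 | 000110
h | 7 | 000111 | H | 7 | 000111
i | 8 | 001000 | I | 8 | 001000
j | 9 | 001001 | J | 9 | 001001
k | 10 | 001010 | K | 10 | 001010
l | 11 | 001011 | L | 11 | 001011
m | 12 | 001100 | M | 12 | 001100
n | 13 | 001101 | N | 13 | 001101
o | 14 | 001110 | O | 14 | 001110
p | 15 | 001111 | P | 15 | 001111
q | 16 | 010000 | Q | 16 | 010000
r | 17 | 010001 | R | 17 | 010001
s | 18 | 010010 | S | 18 | 010010
t | 19 | 010011 | T | 19 | 010011
u | 20 | 010100 | U | 20 | 010100
v | 21 | 010101 | V | 21 | 010101
w | 22 | 010110 | W | 22 | 010110
x | 23 | 010111 | X | 23 | 010111
y | 24 | 011000 | Y | 24 | 011000
z | 25 | 011001 | Z | 25 | 011001
0 | 26 | 011010 | ! | 26 | 011010
1 | 27 | 011011 | | 27 | 011011
2 | 28 | 011100 | ( | 28 | 011100
3 | 29 | 011101 | ) | 29 | 011101
4 | 30 | 011110 | , | 30 | 011110
5 | 31 | 011111 | | 31 | 011111
6 | 32 | 100000 | | 32 | 100000
7 | 33 | 100001 | | 33 | 100001
8 | 34 | 100010 | \ | 34 | 100010
9 | 35 | 100011 | ^ | 35 | 100011
. | 36 | 100100 | [ | 36 | 100100
/ | 37 | 100101 | ] | 37 | 100101
- | 38 | 100110 | ' | 38 | 100110
_ | 39 | 100111 | < | 39 | 100111
~ | 40 | 101000 | > | 40 | 101000
 | 41 | 101001 | % | 41 | 101001
@ | 42 | 101010 | " | 42 | 101010
? | 43 | 101011 | | 43 | 101011
 | 44 | 101100 | .htm | 44 | 101100
= | 45 | 101101 | .html | 45 | 101101
+ | 46 | 101110 | .asp | 46 | 101110
$ | 47 | 101111 | .aspx | 47 | 101111
& | 48 | 110000 | .php | 48 | 110000
http:// | 49 | 110001 | .jsp | 49 | 110001
https:// | 50 | 110010 | gtin | 50 | 110010
ftp:// | 51 | 110011 | ser | 51 | 110011
mailto: | 52 | 110100 | bat | 52 | 110100
ldap:// | 53 | 110101 | exp | 53 | 110101
tel: | 54 | 110110 | search | 54 | 110110
urn: | 55 | 110111 | id | 55 | 110111
www. | 56 | 111000 | .jp | 56 | 111000
.com | 57 | 111001 | .it | 57 | 111001
.net | 58 | 111010 | .de | 58 | 111010
.gov | 59 | 111011 | .br | 59 | 111011
.org | 60 | 111100 | .fr | 60 | 111100
.cn | 61 | 111101 | gs1 | 61 | 111101
Jump to URI-B | 62 | 111110 | Jump to URI-A | 62 | 111110
Terminator of URI-A | 63 | 111111 | Terminator of URI-B | 63 | 111111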
Han Xin Code URI-C charset
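Character / URI fragment | Encoding value | Encoding bits | Character / URI fragment | Encoding value | Encoding bits | Character / URI fragment | Encoding value | Encoding bits
A | 0 | 0000000 | R | 43 | 0101011 | | 86 | 1010110
B | 1 | 0000001 | S | 44 | 0101100 | / | 87 | 1010111
C | 2 | 0000010 | T | 45 | 0101101 | ? | 88 | 1011000
D | 3 | 0000011 | U | 46 | 0101110 | | 89 | 1011001
E | 4 | 0000100 | V | 47 | 0101111 | @ | 90 | 1011010
F | 5 | 0000101 | W | 48 | 0110000 | & | 91 | 1011011
G | 6 | 0000110 | X | 49 | 0110001 | = | 92 | 1011100
H | 7 | 0000111 | Y | 50 | 0110010 | http:// | 93 | 1011101
I | 8 | 0001000 | Z | 51 | 0110011 | https:// | 94 | 1011110
J | 9 | 0001001 | 0 | 52 | 0110100 | ftp:// | 95 | 1011111
K | 10 | 0001010 | 1 | 53 | 0110101 | mailto: | 96 | 1100000
L | 11 | 0001011 | 2 | 54 | 0110110 | ldap:// | 97 | 1100001
m | 12 | 0001100 | 3 | 55 | 0110111 | tel: | 98 | 1100010
N | 13 | 0001101 | 4 | 56 | 0111000 | urn: | 99 | 1100011
O | 14 | 0001110 | 5 | 57 | 0111001 | www. | 100 | 1100100
P | 15 | 0001111 | 6 | 58 | 0111010 | .com | 101 | 1100101
Q | 16 | 0010000 | 7 | 59 | 0111011 | .net | 102 | 1100110
R | 17 | 0010001 | 8 | 60 | 0111100 | .gov | 103 | 1100111
S | 18 | 0010010 | 9 | 61 | 0111101 | .org | 104 | 1101000
T | 19 | 0010011 | $ | 62 | 0111110 | .cn | 105 | 1101001
U | 20 | 0010100 | - | 63 | 0111111 | .htm | 106 | 1101010
V | 21 | 0010101 | _ | 64 | 1000000 | .html | 107 | 1101011
w | 22 | 0010110 | . | 65 | 1000001 | .asp | 108 | 1101100
X | 23 | 0010111 | + | 66 | 1000010 | .aspx | 109 | 1101101
Y | 24 | 0011000 | ! | 67 | 1000011 | .php | 110 | 1101110
Z | 25 | 0011001 | | 68 | 1000100 | .jsp | 111 | 1101111
A | 26 | 0011010 | ( | 69 | 1000101 | gtin | 112 | 1110000
B | 27 | 0011011 | ) | 70 | 1000110 | ser | 113 | 1110001
C | 28 | 0011100 | , | 71 | 1000111 | bat | 114 | 1110010
D | 29 | 0011101 | | 72 | 1001000 | | 115 | 1110011
E | 30 | 0011110 | | 73 | 1001001 | search | 116 | 1110100
F | 31 | 0011111 | | 74 | 1001010 | id | 117 | 1110101
G | 32 | 0100000 | \ | 75 | 1001011 | .jp | 118 | 1110110
H | 33 | 0100001 | ^ | 76 | 1001100 | .it | 119 | 1110111
I | 34 | 0100010 | ~ | 77 | 1001101 | .de | 120 | 1111000
J | 35 | 0100011 | [ | 78 | 1001110 | .br | 121 | 1111001
K | 36 | 0100100 | ] | 79 | 1001111 | .fr | 122 | 1111010
L | 37 | 0100101 | ' | 80 | 1010000 | gs1 | 123 | 1111011
M | 38 | 0100110 | < | 81 | 1010001 | search | 124 | 1111100
N | 39 | 0100111 | > | 82 | 1010010 | Jump to URI-A | 125 | 1111101
O | 40 | 0101000 | | 83 | 1010011 | Jump to URI-B | 126 | 1111110
P | 41 | 0101001 | % | 84 | 1010100 | Terminator of URI-C | 127 | 1111111
Q | 42 | 0101010 | " | 85 | 1010101 | | |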

Notes and References

  1. GB/T. GB/T 21049-2007 "Chinese-sensible code". www.chinesestandardslibrary.com. 2007. Chinese.
  2. 中国物品编码中心 (The Article Numbering Center of China). www.ancc.org.cn. Chinese.
  3. ISO/IEC. ISO/IEC 20830:2021 "Information technology - Automatic identification and data capture techniques - Han Xin Code bar code symbology specification". iso.org. 2021.
  4. Stefania Zocco. QR codes in contemporary China: digital money and people's perception. Ca'Foscari University of Venice. dspace.unive.it.
  5. GS1 Application Identifiers. www.gs1.org.
  6. Dong Xiaowen, Deng Huipeng, Wang Li. 中国主导的首个二维码码制国际标准正式发布 (The first international standard for QR code coding led by China is officially released). 中国物品编码中心 (The Article Numbering Center of China). www.ancc.org.cn. 31 August 2021. Chinese.
  7. RFID and AIDC News: New Bar Code Symbology for Double Byte Characters. Supply Chain Digest. www.scdigest.com.
  8. ISS Han Xin Code symbology specification - Rev. 3.0. AIM Global. aimglobal.org.
  9. Liu Jia. 汉信码正式成为国际ISO标准工作项目 (Hanxin code officially becomes an international ISO standard work item). 中国物品编码中心 (The Article Numbering Center of China). www.ancc.org.cn. 16 September 2015. Chinese.
  10. GB/T. GB/T 21049-2022 "Han Xin code". www.chinesestandard.net. 2022. Chinese.
  11. Shengzhang Jiang, Weidong Wu. European Patent Office EP3330887B1 by Fujian Landi Commercial Equipment Co Ltd "Chinese-sensitive code feature pattern detection method and system". European Patent Office. patents.google.com. 2 August 2016.
  12. Shengzhang Jiang, Weidong Wu. United States Patent US10095903B2 by Ingenico Fujian Technology Co Ltd "Block decoding method and system for two-dimensional code". United States Patent and Trademark Office. patents.google.com. 15 January 2018.
  13. Shengzhang Jiang, Weidong Wu. United States Patent US10528781B2 by Ingenico Fujian Technology Co Ltd "Detection method and system for characteristic patterns of Han Xin codes". United States Patent and Trademark Office. patents.google.com. 13 February 2018.
  14. Han Xin Code. GS1 China. www.ancc.org.cn.
  15. PC42D Desktop Direct Thermal Barcode Printer. www.honeywell.com.
  16. Unitech MS852B. dcs.aero.
  17. Shi Yu. Han Xin Code. han-xin-code.appstor.io. Chinese.
  18. Zheng Yu. 中国的二维码,您用了吗? (Have you used China's QR code?). 中国物品编码中心 (The Article Numbering Center of China). www.ancc.org.cn. 2 September 2013. Chinese.
  19. Generate Han Xin Code Barcodes in C#. www.aspose.com.
  20. AIM International Technical Specification - Han Xin Code Encoding Library for .Net. github.com.
  21. Xiaolei Yu, Donghua Wang, Zhimin Zhao. Semi-physical Verification Technology for Dynamic Performance of Internet of Things System. Springer. 2018. p. 181. ISBN 978-9811317590.