Optical Character Recognition (Unicode block) explained

Block name: Optical Character Recognition
Range start: U+2440
Range end: U+245F
Script: Common
Symbols: OCR controls
Sources: ISO 2033
Version 1.0.0: 11 characters
Note: [1] [2]

Optical Character Recognition is a Unicode block containing signal characters for OCR and MICR standards.
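As a rough illustration, the block's range (U+2440 to U+245F) and its assigned characters can be listed with Python's standard unicodedata module. This is only a minimal sketch: the names BLOCK_START, BLOCK_END and assigned_characters are illustrative, and the result depends on the Unicode version bundled with the interpreter.

```python
import unicodedata

# Block range: U+2440..U+245F (32 code points in total).
BLOCK_START = 0x2440
BLOCK_END = 0x245F


def assigned_characters(start=BLOCK_START, end=BLOCK_END):
    """Return (code point, name) pairs for code points that have a name.

    Unassigned code points have no name in the character database,
    so unicodedata.name() returns the supplied default (None) for them.
    """
    pairs = []
    for cp in range(start, end + 1):
        name = unicodedata.name(chr(cp), None)
        if name is not None:
            pairs.append((cp, name))
    return pairs


if __name__ == "__main__":
    for cp, name in assigned_characters():
        print(f"U+{cp:04X} {chr(cp)} {name}")
```

Run as a script, this prints one line per assigned character; the remaining code points in the range are skipped because they have no name.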

Block

Subheadings

The Optical Character Recognition block has three informal subheadings (groupings) within its character collection: OCR-A, MICR, and OCR.[3]

OCR-A

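The OCR-A subheading contains six characters taken from the OCR-A font described in the ISO 1073-1:1976 standard: U+2440 ⑀ OCR HOOK, U+2441 ⑁ OCR CHAIR, U+2442 ⑂ OCR FORK, U+2443 ⑃ OCR INVERTED FORK, U+2444 ⑄ OCR BELT BUCKLE, and U+2445 ⑅ OCR BOW TIE. The OCR bow tie is given the informative alias "unique asterisk".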

The hook, chair and fork, in addition to a long vertical bar, are included in the most basic "numeric" implementation level of OCR-A, which includes digits but excludes letters and conventional punctuation.[4] By contrast, the most basic implementation level of OCR-B instead includes the digits, plus sign, less-than sign, greater-than sign, long vertical bar and seven of the capital letters;[5] as such, there are no characters specific to OCR-B in the Optical Character Recognition block.

MICR

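The MICR subheading contains four punctuation characters for bank cheque identifiers, taken from the magnetic ink character recognition E-13B font (codified in the ISO 1004:1995 standard): U+2446 ⑆ OCR BRANCH BANK IDENTIFICATION, U+2447 ⑇ OCR AMOUNT OF CHECK, U+2448 ⑈ OCR DASH, and U+2449 ⑉ OCR CUSTOMER ACCOUNT NUMBER.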

The latter two characters are misnamed: their names were inadvertently switched in the 1993 (first) edition of ISO/IEC 10646, a mistake which had been present since Unicode 1.0.0.[6] Although their formal names remain unchanged due to the Unicode stability policy, both have corrected normative aliases: U+2448 ⑈ is MICR ON US SYMBOL, and U+2449 ⑉ is MICR DASH SYMBOL (the standard notes that "the Unicode character names include several misnomers").
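The split between the frozen formal names and the corrected aliases can be observed from Python, whose unicodedata.lookup() has accepted name aliases since version 3.3. The following is a minimal sketch, assuming an interpreter whose bundled Unicode data includes the MICR ON US SYMBOL and MICR DASH SYMBOL aliases; the variable names are illustrative.

```python
import unicodedata

# Formal names are frozen by the stability policy, so name() still
# reports the switched names for these two code points.
formal_2448 = unicodedata.name("\u2448")
formal_2449 = unicodedata.name("\u2449")

# lookup() also resolves the corrected aliases to the same code points.
on_us = unicodedata.lookup("MICR ON US SYMBOL")
dash = unicodedata.lookup("MICR DASH SYMBOL")

print(formal_2448, "->", f"U+{ord(on_us):04X}")
print(formal_2449, "->", f"U+{ord(dash):04X}")
```

Note that name() never returns an alias, so code that needs the corrected wording has to map it explicitly.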

These symbols had previously been encoded by the ISO-IR-98 encoding defined by ISO 2033:1983, in which they were simply named SYMBOL ONE through SYMBOL FOUR.[7] All four characters have informative aliases in the Unicode charts: "transit", "amount", "on us", and "dash" respectively.

OCR

The OCR subheading consists of a single character: U+244A ⑊ OCR DOUBLE BACKSLASH.

History

The following Unicode-related documents record the purpose and process of defining specific characters in the Optical Character Recognition block:

Notes and References

  1. Unicode character database. The Unicode Standard. 2023-07-26.
  2. Enumerated Versions of The Unicode Standard. The Unicode Standard. 2023-07-26.
  3. Unicode Code Charts: Optical Character Recognition. The Unicode Standard, Version 6.3. 2014-02-27.
  4. Nominal Character Dimensions of the Numeric OCR-A Font. 2nd ed. ECMA-8. 1977. European Computer Manufacturers Association. Ecma International.
  5. 9.1: Subset 1: Minimal alphanumeric subset. 8. Proposal for Type 3 Technical Report, TR 15907, Information technology—Revision of OCR-B standard (ISO 1073-2:1976). ISO/IEC JTC1/SC2/WG3 N470. 1998-09-28. ISO/IEC JTC1/SC2/WG3. ISO/IEC JTC 1/SC 2.
  6. 3.8: Block-by-Block Charts. The Unicode Standard, version 1.0. Unicode Consortium.
  7. ISO-IR-98: E13B Graphic Character Set. ISO/TC97/SC2. 1985-08-01.