Purpose
Opcode format | Opcode [ASCII] (Hex) | Operand Format | Comments |
Extended ASCII | (Text | <ws><IX>,<IY><ws><TStr>[<ws>(Overscore<ws>(<IOS-Count><ws><IOS-Pos0>[,<IOS-Posi>]*))][<ws>(Underscore<ws>(<IUS-Count><ws><IUS-Pos0>[,<IUS-Posi>]*))][<ws>(CharPos<ws>(<IRES-Count><ws><IRES-Pos0>[,<IRES-Posi>]*))][<ws>(Bounds<ws><IP0x>,<IP0y><ws><IP1x>,<IP1y><ws><IP2x>,<IP2y><ws><IP3x>,<IP3y>)][<ws>]) | Absolute coordinates. |
Single-byte, binary operand | [Ctrl-X] (18) | <Lx><Ly><TStr><BOS-Count>[<USOS-Ecount>][<BOS-Posi>[<USOS-Eposi>]]*<BUS-Count>[<USUS-Ecount>][<BUS-Posi>[<USUS-Eposi>]]*<BRES-Count>[<USRES-Ecount>][<BRES-Posi>[<USRES-Eposi>]]*<LP0x><LP0y><LP1x><LP1y><LP2x><LP2y><LP3x><LP3y> | Advanced text, relative coordinates. |
- | [x] (78) | <Lx><Ly><TStr> | Basic text, relative coordinates. |
Str: The text string to be drawn, encoded either as ASCII or as Unicode (as documented by the <T> mnemonic).
OS-Count: One plus the number of overscore position indices for the text. (A value of one indicates no overscores.) A value of zero indicates that an extended count follows.
OS-Ecount: When OS-Count is zero, a two-byte extended count follows. This allows for an overscore index count of 256 through 65,791, encoded as an integer in the range 0 to 65,535.
OS-Pos-i: The ith overscore position index. A value of zero indicates that an extended position follows.
OS-Epos-i: When OS-Pos-i is zero, a two-byte extended position follows. This allows for positions of 256 through 65,791, encoded as an integer in the range 0 to 65,535.
US-Count: One plus the number of underscore position indices for the text. (A value of one indicates no underscores.) A value of zero indicates that an extended count follows.
US-Ecount: When US-Count is zero, a two-byte extended count follows. This allows for an underscore index count of 256 through 65,791, encoded as an integer in the range 0 to 65,535.
US-Pos-i: The ith underscore position index. A value of zero indicates that an extended position follows.
US-Epos-i: When US-Pos-i is zero, a two-byte extended position follows. This allows for positions of 256 through 65,791, encoded as an integer in the range 0 to 65,535.
RES-Count: RESERVED. Should always equal one. This value is one plus the number of reserved values. (A value of one indicates no reserved values.) A value of zero indicates that an extended count follows.
RES-Ecount: When RES-Count is zero, a two-byte extended count follows. This allows for an index count of 256 through 65,791, encoded as an integer in the range 0 to 65,535.
RES-Pos-i: The ith reserved value index. A value of zero indicates that an extended value follows.
RES-Epos-i: When RES-Pos-i is zero, a two-byte extended value follows. This allows for values of 256 through 65,791, encoded as an integer in the range 0 to 65,535.
P0x,P0y...P3x,P3y: Specifies the text bounding box in logical coordinates relative to the insertion point. This is shown in Figure 1, below.
For more information on working with ASCII and Unicode characters, see Notes, below.
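The count and position fields of the binary form share one escape scheme: a non-zero byte carries the value directly, and a zero byte signals that a two-byte extended value follows. The sketch below illustrates one reading of that scheme. It assumes that field values 1 through 255 are stored in the single byte, that values 256 through 65,791 are stored as (value - 256) in the extension, and that the two-byte extension is little-endian. The byte order and the helper names (encode_field, decode_field, encode_index_count) are assumptions made for illustration, not part of the specification.

```python
import struct
from typing import Tuple

EXT_BASE = 256                 # smallest value the extended form can carry
EXT_MAX = EXT_BASE + 0xFFFF    # 65,791, the largest value the extended form can carry


def encode_field(value: int) -> bytes:
    """Encode one count or position field (hypothetical helper)."""
    if 1 <= value <= 255:
        # Fits in the single byte; zero is reserved as the escape marker.
        return bytes([value])
    if EXT_BASE <= value <= EXT_MAX:
        # Zero byte, then the two-byte extended value.
        # Little-endian byte order is an assumption.
        return b"\x00" + struct.pack("<H", value - EXT_BASE)
    raise ValueError("field value out of range: %d" % value)


def decode_field(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode one field; return (value, offset of the next field)."""
    first = data[offset]
    if first != 0:
        return first, offset + 1
    (extended,) = struct.unpack_from("<H", data, offset + 1)
    return extended + EXT_BASE, offset + 3


def encode_index_count(num_indices: int) -> bytes:
    """A count field stores one plus the number of indices."""
    return encode_field(num_indices + 1)
```

Under these assumptions, a string with no overscores is written with the single count byte 0x01, and decode_field returns the stored field value, so a reader subtracts one from a count field to get the number of indices that follow.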
Figure 1. Text string bounding box
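Because the bounding box corners are given relative to the insertion point, a reader has to add the insertion point to each corner before drawing or hit-testing in logical coordinates. The following minimal sketch shows that step. The function name absolute_bounds and the corner order used in the example are assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def absolute_bounds(insertion: Point, corners: List[Point]) -> List[Point]:
    """Translate the four relative corners P0..P3 by the insertion point."""
    if len(corners) != 4:
        raise ValueError("a text bounding box has exactly four corners")
    ix, iy = insertion
    return [(ix + dx, iy + dy) for dx, dy in corners]


# Example: text inserted at (100, 200) with a 50 x 12 box.
# The corner order here is illustrative only.
box = absolute_bounds((100, 200), [(0, 0), (50, 0), (50, 12), (0, 12)])
```

The four corners are kept as separate points rather than reduced to a width and height, since four independent corners can also describe a rotated or sheared box.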
Working with international text is not as straightforward as it may seem. It may help you to know a little about character encoding and display before you begin using international text in WHIP! data. These notes are provided as a brief overview of supporting international text.
Using International Character Sets
Text characters are written to a file as single-byte characters, double-byte characters, or a mix of both. The method used to encode characters usually depends on the local written language of the operating system.
English has a relatively small character set. The ASCII character set defines 128 characters, and its common 8-bit extensions define up to 256, so each character can be represented in one byte (8 bits) of computer memory. Many written languages have far more than 256 characters, so their character sets must be represented using two bytes (16 bits) per character.
Note: Although computer applications could simply represent all languages using two bytes of memory per character, this is not done for the English character set since ASCII characters would consume twice as much memory as is needed, and resulting file sizes would be unnecessarily large.
Using international character sets is complicated by the fact that there are two different ways to encode multi-byte text, Unicode and Multi-Byte Character Sets (MBCS):
Unicode is a map of characters in which each character corresponds to a unique two-byte value. The Unicode character map contains characters from most of the world's languages. Unicode character values are always two bytes, and there is only one mapping, so a given number always maps to the same character on every computer using Unicode.
Multi-Byte Character Sets (MBCS) are code pages, or maps, between written characters and either one-byte or two-byte numbers. As a result, a string with several MBCS characters can have both single-byte and double-byte characters. Unlike Unicode, one number is not unique to one character. When using MBCS, a given number might correspond to a Chinese character when using a Chinese character set, or it might correspond to a Japanese character when using a Japanese character set. Another difference between MBCS and Unicode is that different MBCS platforms (such as Unix, Microsoft Windows, or Macintosh) may have different character sets for the same written language. For example, when using MBCS a given number might map to a Japanese character on Japanese Windows, but the same number may map to another character on English Windows.
Operating Systems and International Characters
Another consideration in supporting international characters is that different operating systems support different character encoding methods. For example, Windows NT fully supports both Unicode and MBCS, but Windows 95 fully supports MBCS and only partially supports Unicode.
To maximize the efficiency with which WHIP! data is used, and to ensure the smallest possible file size, the strategy for WHIP! data is:
Text strings containing only ASCII characters are stored in WHIP! data as ASCII. By using one byte per character instead of two, file size remains small.
Text strings containing multi-byte characters are stored in WHIP! data as Unicode. This is the most flexible and universal approach for ensuring that WHIP! data works now and in the future. This also enables you to use WHIP! data on non-Microsoft operating systems.
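The notes above can be illustrated with a short Python sketch. The first part decodes the same two bytes under a Simplified Chinese code page and a Japanese code page to show why MBCS data is ambiguous without knowing the code page. The second part applies the storage strategy: pure ASCII strings stay one byte per character, and anything else is written as two-byte Unicode. UTF-16 little-endian is used here as a stand-in for the two-byte Unicode form, which is an assumption, since the actual byte layout is defined by the <T> mnemonic. The function names is_ascii and choose_storage are hypothetical.

```python
from typing import Tuple

# Part 1: the same byte values mean different things under different code pages.
raw = b"\xb0\xa1"
as_chinese = raw.decode("gbk")          # one two-byte Chinese character
as_japanese = raw.decode("shift_jis")   # decoded by the Japanese code page instead


# Part 2: choose how to store a string, following the strategy above.
def is_ascii(text: str) -> bool:
    """True when every character is in the 7-bit ASCII range."""
    return all(ord(ch) < 128 for ch in text)


def choose_storage(text: str) -> Tuple[str, bytes]:
    """Return (kind, payload); kind is 'ascii' or 'unicode'."""
    if is_ascii(text):
        # One byte per character keeps the file small.
        return "ascii", text.encode("ascii")
    # Two bytes per character; UTF-16-LE is an assumed stand-in
    # for the format's two-byte Unicode encoding.
    return "unicode", text.encode("utf-16-le")


kind_plain, payload_plain = choose_storage("Hello")
kind_intl, payload_intl = choose_storage("\u65e5\u672c")  # two Japanese characters
```

Storing "Hello" this way takes 5 bytes, where the two-byte form would take 10. The two-character Japanese string takes 4 bytes and decodes to the same characters on any system, with no code page needed.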