As per Relevance of the word encoding, we have this RFC below:

Network Working Group           Vietnamese Standardization Working Group
Request for Comments: 1456                                      May 1993


Conventions for Encoding the Vietnamese Language
VISCII: VIetnamese Standard Code for Information Interchange
VIQR: VIetnamese Quoted-Readable Specification
Revision 1.1

Status of this Memo

This memo provides information for the Internet community. It does
not specify an Internet standard. Distribution of this memo is
unlimited.

Abstract

This document provides information to the Internet community on the
currently used conventions for encoding Vietnamese characters into
7-bit US ASCII and in an 8-bit form. These conventions are widely
used by the overseas Vietnamese who are on the Internet and are
active in USENET. This document only provides information and
specifies no level of standard.

1. INTRODUCTION

In this paper we describe two conventions for representing Vietnamese
characters. VISCII (pronounced "visky") is an 8-bit character
encoding that is similar to that used with ISO-8859. VIQR
(pronounced "vicker") is a mnemonic encoding of Vietnamese characters
into US ASCII for use on 7-bit systems. There is considerable
existing online freely distributable software that implements these
conventions for UNIX and personal computers. These encodings enable
Vietnamese-language users to take full advantage of powerful tools
already developed for the English-speaking world, avoiding
unnecessary reinvention. This paper describes these conventions in
part so that MIME-compliant software might also support the
Vietnamese language.

NOTE: The accented Vietnamese letters are herein represented by their
VIQR equivalents, offset by enclosing angle brackets. For example,
the single letter "a acute" is written as <a'>, where the apostrophe
is the mnemonic symbol for the acute accent.

2. LINGUISTIC BACKGROUND

As a romanized language, Vietnamese appears to lend itself readily to
integration into existing English-based systems. To cite a simple



Vietnamese Standardization Working Group [Page 1]

RFC 1456 Conventions for Encoding Vietnamese May 1993


example, consider implementing support for French in such systems.
One can allocate code positions in the 8-bit space necessary for
accented letters such as <e'> or <a`>, then provide a means for users
to access these codes through the keyboard. The required number of
"extra" code positions is small (see, e.g., ISO-8859/Latin-1 [1]),
and the relatively low frequency of occurrence of accented letters
does not place heavy demand on efficient keyboard input schemes. The
same cannot be said for Vietnamese, where both the number and
occurrence frequency of accented letters are large. Apart from the
alphabetics already available in ASCII, Vietnamese requires an
additional 134 combinations of a letter and diacritical symbols.
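
The figure of 134 can be checked by simple enumeration. The following
is a minimal sketch in Python, not part of the conventions described
in this memo. It assumes the usual inventory of 12 Vietnamese vowel
forms, each carrying one of 5 tone marks or none, plus the letter
d bar. The variable names are illustrative only.

```python
# Illustrative count of the accented letters Vietnamese needs beyond ASCII.
# The inventory below (12 vowel forms, 5 tone marks) is an assumption of
# this sketch, written with VIQR-style mnemonics for readability.
VOWEL_FORMS = ["a", "a(", "a^", "e", "e^", "i", "o", "o^", "o+", "u", "u+", "y"]
TONE_MARKS = ["", "'", "`", "?", "~", "."]  # "" means no tone mark

# Every vowel form combined with every tone choice.
combinations = [v + t for v in VOWEL_FORMS for t in TONE_MARKS]

# Single-character results are the plain ASCII vowels a, e, i, o, u, y.
plain_ascii = [c for c in combinations if len(c) == 1]

# Accented vowels plus d bar, for one letter case.
per_case = len(combinations) - len(plain_ascii) + 1

# Upper and lower case together.
total = 2 * per_case
print(total)  # 134
```

That is, 72 vowel-and-tone combinations less the 6 plain ASCII vowels,
plus d bar, gives 67 letters per case and 134 in all.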

Note that one can resort to a composite encoding scheme to reduce
this requirement, but that would mean giving up on integration into
today's computing platforms, which for the most part do not support
such schemes. In addition, the heavy use of diacritical marks in
Vietnamese text calls for a keyboard input scheme that does not
require extra keystrokes such as a special "compose" key to generate
accented letters. Because of the large number of required
combinations, the scheme should also be easily learned and memorized.

Finally, to integrate Vietnamese into current electronic mail systems,
which are still limited to 7 bits, there should be a representation
for Vietnamese text that is readily readable in its 7-bit form.

The Viet-Std group, an electronic standardization roundtable, has
worked over the past few years to draft proposals addressing these
issues. This has culminated in the conventions to be described
briefly in the next two sections. The detailed technical
considerations have been reported elsewhere [2]. In this memo we
give a brief outline of the working standards and describe current
software availability.

3. SPECIFICATION OF VISCII

VISCII stands for VIetnamese Standard Code for Information
Interchange, an 8-bit encoding specification. Its salient features
are:

1. Encoding of all Vietnamese letters as single glyphs,
rather than separating base vowels and diacritical
marks.

2. Retention of the complete ASCII graphics characters
in order to facilitate integration.

3. Encoding the 6 least-often-used upper-case letters in
6 least problematic C0 (control) characters.

4. Character placement has been designed with
consideration for Unix/X integration, ISO-8859/Latin-1
compatibility, coexistence with a wide array of
existing software, including provisions for single-
and double-line drawing characters in the IBM PC
character set.

The 8-bit VISCII encoding is shown below. Because of the limitation
of the 7-bit US ASCII character set, here we use the mnemonic form to
represent Vietnamese glyphs. See the VIQR specification below for
clarification of how diacritical marks are applied. The online
PostScript version of reference [2] may also be useful as it will
display each character correctly.

Table 1. VISCII 8-bit Encoding Table (v1.1)
*=======================================================================*
|    | 0x  1x  2x  3x  4x  5x  6x  7x | 8x  9x  Ax  Bx  Cx  Dx  Ex  Fx  |
|====|==================================================================|
| x0 | nul dle sp  0   @   P   `   p  | A.  O^` O~  o^` A`  DD  a`  dd  |
| x1 | soh dc1 !   1   A   Q   a   q  | A(' O^? a(' o^? A'  u+' a'  u+. |
| x2 | A(? dc2 "   2   B   R   b   r  | A(` O^~ a(` o^~ A^  O`  a^  o`  |
| x3 | etx dc3 #   3   C   S   c   s  | A(. O^. a(. O+~ A~  O'  a~  o'  |
| x4 | eot Y?  $   4   D   T   d   t  | A^' O+. a^' O+  A?  O^  a?  o^  |
| x5 | A(~ nak %   5   E   U   e   u  | A^` O+' a^` o^. A(  a.  a(  o~  |
| x6 | A^~ syn &   6   F   V   f   v  | A^? O+` a^? o+` a(? y?  u+~ o?  |
| x7 | bel etb '   7   G   W   g   w  | A^. O+? a^. o+? a(~ u+` a^~ o.  |
| x8 | bs  can (   8   H   X   h   x  | E~  I.  e~  i.  E`  u+? e`  u.  |
| x9 | ht  Y~  )   9   I   Y   i   y  | E.  O?  e.  U+. E'  U`  e'  u`  |
| xA | lf  sub *   :   J   Z   j   z  | E^' O.  e^' U+' E^  U'  e^  u'  |
| xB | vt  esc +   ;   K   [   k   {  | E^` I?  e^` U+` E?  y~  e?  u~  |
| xC | ff  fs  ,   <   L   \   l   |  | E^? U?  e^? U+? I`  y.  i`  u?  |
| xD | cr  gs  -   =   M   ]   m   }  | E^~ U~  e^~ o+  I'  Y'  i'  y'  |
| xE | so  Y.  .   >   N   ^   n   ~  | E^. U.  e^. o+' I~  o+~ i~  o+. |
| xF | si  us  /   ?   O   _   o   DEL| O^' Y`  o^' U+  y`  u+  i?  U+~ |
*=======================================================================*
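
As an illustration only, the table above can be turned into a
byte-to-character lookup. The sketch below is written in Python and is
not one of the software packages described in this memo. It
transcribes the rows of Table 1, maps each mnemonic onto Unicode
combining marks, and relies on NFC normalization to obtain a single
precomposed character. The names HIGH_ROWS, LOW_OVERRIDES,
mnemonic_to_char, and decode_viscii are hypothetical.

```python
import unicodedata

# Combining marks assumed for each mnemonic character (see Table 2).
MARKS = {"(": "\u0306", "^": "\u0302", "+": "\u031B", "'": "\u0301",
         "`": "\u0300", "?": "\u0309", "~": "\u0303", ".": "\u0323"}

# High half of Table 1: one string per row x0..xF, columns 8x..Fx.
HIGH_ROWS = [
    "A. O^` O~ o^` A` DD a` dd",
    "A(' O^? a(' o^? A' u+' a' u+.",
    "A(` O^~ a(` o^~ A^ O` a^ o`",
    "A(. O^. a(. O+~ A~ O' a~ o'",
    "A^' O+. a^' O+ A? O^ a? o^",
    "A^` O+' a^` o^. A( a. a( o~",
    "A^? O+` a^? o+` a(? y? u+~ o?",
    "A^. O+? a^. o+? a(~ u+` a^~ o.",
    "E~ I. e~ i. E` u+? e` u.",
    "E. O? e. U+. E' U` e' u`",
    "E^' O. e^' U+' E^ U' e^ u'",
    "E^` I? e^` U+` E? y~ e? u~",
    "E^? U? e^? U+? I` y. i` u?",
    "E^~ U~ e^~ o+ I' Y' i' y'",
    "E^. U. e^. o+' I~ o+~ i~ o+.",
    "O^' Y` o^' U+ y` u+ i? U+~",
]

# The six upper-case letters placed in C0 control positions.
LOW_OVERRIDES = {0x02: "A(?", 0x05: "A(~", 0x06: "A^~",
                 0x14: "Y?", 0x19: "Y~", 0x1E: "Y."}


def mnemonic_to_char(mnemonic):
    """Turn a table mnemonic such as "o+'" into one Unicode character."""
    if mnemonic == "dd":
        return "\u0111"
    if mnemonic == "DD":
        return "\u0110"
    marks = "".join(MARKS[c] for c in mnemonic[1:])
    return unicodedata.normalize("NFC", mnemonic[0] + marks)


def build_table():
    """Build a dict from VISCII byte value to Unicode character."""
    table = {b: chr(b) for b in range(0x80)}  # ASCII stays in place
    for byte, mnemonic in LOW_OVERRIDES.items():
        table[byte] = mnemonic_to_char(mnemonic)
    for row, line in enumerate(HIGH_ROWS):
        for col, mnemonic in enumerate(line.split()):
            table[((8 + col) << 4) | row] = mnemonic_to_char(mnemonic)
    return table


VISCII = build_table()


def decode_viscii(data):
    """Decode a bytes object holding VISCII text."""
    return "".join(VISCII[b] for b in data)


sample = bytes([0x4E, 0xDF, 0xBE, 0x63, 0x20, 0x56, 0x69, 0xAE,
                0x74, 0x20, 0x4E, 0x61, 0x6D])
print(decode_viscii(sample))
```

The sample bytes spell the phrase used as an example in the VIQR
section. Keeping the table in the same row order as Table 1 makes the
transcription easy to check by eye against the memo.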

4. SPECIFICATION OF VIQR

VIQR, VIetnamese Quoted-Readable specification, is not an encoding
convention but is rather a convention for typing, reading, and
transferring Vietnamese data using only the 7-bit ASCII character
set. With VIQR, accented Vietnamese letters are represented by the
vowel followed by ASCII characters whose appearances resemble those
of the corresponding Vietnamese diacritical marks. For example, the
phrase "N<u+><o+'>c Vi<e^.>t Nam" is represented in 7 bits as
"Nu+o+'c Vie^.t Nam". The complete list of diacritical mark
equivalents is given in Table 2. There is also provision in the
specification to prevent undesirable composition, for example, to
avoid getting "How are you?" composed into "How are yo<u?>". For
details, please see [2]. VIQR therefore serves the following
purposes:

1. It provides for a mnemonic, readable representation of
Vietnamese in 7-bit form, which makes it easy to
transfer Vietnamese electronic mail without further
conversion. The originator and recipient can
communicate in Vietnamese without the need for an
8-bit environment at any point in the data chain.

2. It provides a bridge for translation between 7- and 8-bit
environments. In this context, typing in both 7-bit
and 8-bit systems requires exactly the same keystrokes;
the only difference is that the 8-bit user gets to see
actual Vietnamese on-screen, whereas the 7-bit user
sees a mnemonic representation thereof. The same
options are available for the 7-bit and 8-bit viewing
of Vietnamese text.

Because of its mnemonic nature, the VIQR typing method is easy to
learn and remember. In pure 8-bit environments, special-purpose
software developers may wish to devise more efficient input schemes,
but the intent is for all Vietnamese keyboard software to support the
basic VIQR method to minimize learning time for Vietnamese who may
already be familiar with the mnemonic method described here.

Table 2. VIQR Mnemonics for Vietnamese Diacritics
*=====================================================*
| Diacritic   | Char | ASCII Code         | D<a^'>u   |
|=====================================================|
| breve       |  (   | 0x28, left paren   | tr<a(>ng  |
| circumflex  |  ^   | 0x5E, caret        | m<u~>     |
| horn        |  +   | 0x2B, plus sign    | m<o'>c    |
|-------------+------+--------------------+-----------|
| acute       |  '   | 0x27, apostrophe   | s<a('>c   |
| grave       |  `   | 0x60, backquote    | huy<e^`>n |
| hook above  |  ?   | 0x3F, question     | h<o?>i    |
| tilde       |  ~   | 0x7E, tilde        | ng<a~>    |
| dot below   |  .   | 0x2E, period       | n<a(.>ng  |
|-------------+------+--------------------+-----------|
| d bar       |  dd  | (repeated d)       | <dd>      |
| D bar       |  DD  | (repeated D)       | <DD>      |
*=====================================================*
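
The mnemonics in Table 2 lend themselves to a simple left-to-right
conversion. What follows is a minimal Python sketch, not the Viet-Std
reference behavior. It assumes that a vowel may take at most one of
breve, circumflex, or horn, followed by at most one tone mark, and it
uses Unicode combining marks with NFC normalization to build each
letter. It deliberately omits the provision for preventing undesirable
composition, whose details are given in [2]. The function name
viqr_to_unicode is hypothetical.

```python
import unicodedata

# Table 2 mnemonics mapped to Unicode combining marks.
MODIFIERS = {"(": "\u0306", "^": "\u0302", "+": "\u031B"}
TONES = {"'": "\u0301", "`": "\u0300", "?": "\u0309",
         "~": "\u0303", ".": "\u0323"}

# Assumption: which base vowels accept which modifier.
ALLOWED = {"(": "aA", "^": "aAeEoO", "+": "oOuU"}
VOWELS = "aAeEiIoOuUyY"


def viqr_to_unicode(text):
    """Convert VIQR text to Unicode; no escape handling (see lead-in)."""
    out = []
    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        # "dd" and "DD" stand for d bar and D bar.
        if ch in "dD" and i + 1 < n and text[i + 1] == ch:
            out.append("\u0111" if ch == "d" else "\u0110")
            i += 2
            continue
        if ch in VOWELS:
            composed = ch
            i += 1
            # Optional breve, circumflex, or horn.
            if i < n and text[i] in MODIFIERS and ch in ALLOWED[text[i]]:
                composed += MODIFIERS[text[i]]
                i += 1
            # Optional tone mark.
            if i < n and text[i] in TONES:
                composed += TONES[text[i]]
                i += 1
            out.append(unicodedata.normalize("NFC", composed))
            continue
        out.append(ch)
        i += 1
    return "".join(out)


print(viqr_to_unicode("Nu+o+'c Vie^.t Nam"))
print(viqr_to_unicode("How are you?"))
```

The second call shows why the specification needs a way to suppress
composition: without one, the question mark in "How are you?" is taken
as a hook above, which is exactly the case the text warns about.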

5. SUPPORTING SOFTWARE

VISCII & VIQR have been successfully implemented on several
platforms. The work has been carried out primarily by the TriChlor
software group, a non-profit spin-off from Viet-Std. Software from
other individuals and groups has also been developed. In addition,
commercial software entities have indicated that they would support
the standards in the form of VISCII-compliant keyboards and fonts.

The current software selection from the TriChlor group enables users
to use Vietnamese on existing Unix, MS-DOS, and Windows systems,
including such operations as Vietnamese file naming, Vietnamese
keyboarding within any application, electronic mail and news reading
for Unix, printing to various printer languages, typesetting
Vietnamese in such document preparation systems as TeX, Word for
Windows, WordPerfect, using Vietnamese in databases (e.g., Paradox)
and spreadsheets (e.g., SC on Unix or Excel in Windows). Some
Vietnamese-specific applications are also available and include a
large song lyric database, several poetry collections in electronic
format, a Windows-based fortune teller, a text-based multiple-choice
test program in Vietnamese, etc. In short, software exists that
supports thorough integration of Vietnamese into existing platforms,
allowing Vietnamese users to take advantage of all the powerful tools
already available in English-only environments.

Translation between 8-bit VISCII 1.1 and other character sets,
particularly ISO-10646/Unicode 1.1, has been included in the Plan 9
operating system's tcs utility that has been made available by Andrew
Hume of AT&T Bell Laboratories.

6. MIME

For use with MIME-compliant software, the value "VISCII" has been
registered as a charset with the Internet Assigned Numbers Authority
for the VISCII encoding convention described above, and the value
"VIQR" has been registered with the Internet Assigned Numbers
Authority as a charset for the VIQR mnemonic encoding convention
described above. Implementation of support for these two MIME
character set types is not mandatory to comply with RFC-1341. If the
encoding conventions described above are used in MIME email or news,
the appropriate MIME character set type value should be used to label
the body-part containing such text.
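
To make the labeling concrete, here is a hedged sketch using Python's
standard email package. It is not taken from any software named in
this memo. Only the charset value "VIQR" comes from the text above;
the other header values and the sample body are assumptions chosen for
illustration.

```python
from email.message import Message

# Build a single-part message whose body is VIQR text in 7-bit ASCII.
msg = Message()
msg["MIME-Version"] = "1.0"
# The charset parameter labels the body part as VIQR.
msg["Content-Type"] = 'text/plain; charset="VIQR"'
# Assumed transfer encoding: VIQR text is plain 7-bit ASCII.
msg["Content-Transfer-Encoding"] = "7bit"
msg.set_payload("Nu+o+'c Vie^.t Nam\n")

print(msg.as_string())
```

An 8-bit VISCII body would presumably carry charset="VISCII" in the
same position, together with a transfer encoding suited to 8-bit data.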

7. SECURITY CONSIDERATIONS

Security issues are not discussed in this memo.

REFERENCES

[1] International Organization for Standardization. ISO 8859/x: 8-bit
International Code Sets. ISO, 1977.

[2] Viet-Std, "A Unified Framework for Vietnamese Information
Processing-v1.1," published on the Internet, available for FTP
from Sonygate.Sony.COM:tin/viet-std, September 1992.

AUTHORS' ADDRESSES

Cuong T.
Center for Integrated
CIS 062--MC 4070
Stanford, CA 94305-4070

Phone: (415) 725-3721
Email: cuong@haydn.Stanford.


Hoc D.
Vista Research, Inc
100 View St, Suite 200
P.O. Box 998
Mountain View, CA 94042

Phone: (415) 966-1171
Email: uunet!vri280!


Cuong M.
National Semiconductor Corp
3388 Burgundy Dr
San Jose, CA 95132

Phone: (408) 721-6873
Email: bui@berlioz.nsc.


Thanh van
Roche Image Analysis
95 First Str Suite 110
Los Altos, CA 94022

Phone: 415-917-2022
Fax: 415-917-2025
Email: thanh@rias.

For more information, please contact the authors at
viet-std@haydn.stanford.
