Saturday, February 15, 2014

Windows-1251 (also known as Win1251 or CP1251) is the standard 8-bit encoding for all U


This article attempts to justify switching to UTF-8 when implementing electronic projects. Questions such as "character encoding" have moved from the practical plane to the ideological one, because some programmers flatly refuse to use the "bourgeois" UTF character table and prefer Windows-1251 or KOI8-R, justifying the choice with the claim that these encodings are "ours", the people's own. Let me remind you of the key competitors:
Windows-1251 (also known as Win1251 or CP1251) is the standard 8-bit encoding for all Ukrainian and Russian localized versions of Microsoft Windows. It is quite popular. It was created on the basis of the character sets used in the early "homemade" Russian localizations of Windows in 1990-1991, jointly by representatives of ParaGraph, Dialogue and the Russian branch of Microsoft. The initial version of the encoding differed markedly from today's version, shown in the table below (in particular, it contained a significant number of "blank spots").
Windows-1251 compares favorably with other Cyrillic encodings in that it includes practically all the characters used in Slavic Cyrillic typography for normal text (only the accent mark is missing). It contains all the characters needed for the Russian, Ukrainian, Belarusian, Serbian and Bulgarian languages.
It has three disadvantages. First, the lowercase letter "я" has the code 0xFF (255 in decimal), which is the "culprit" behind a number of unexpected problems in programs that are not 8-bit clean. Second, it has no pseudographic (box-drawing) characters. Third, when sorting alphabetically by code the letters are not consecutive: special characters sit between the main block and the letters ўЎіІєЄїЇґҐёЁ.
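The first and third disadvantages can be checked directly with Python's built-in `cp1251` codec. This is only an illustrative sketch: the helper `sort_by_cp1251_bytes` and the sample word list are made up for the example and simply sort strings by their raw byte values, as a naive program would.

```python
# Sketch: inspect the Windows-1251 byte values discussed above.

# The lowercase letter "я" is encoded as the single byte 0xFF.
ya_byte = "я".encode("cp1251")

# Letters that live outside the contiguous main block 0xC0-0xFF.
extra_letters = "ЎўІіЄєЇїҐґЁё"
extra_codes = {ch: ch.encode("cp1251")[0] for ch in extra_letters}


def sort_by_cp1251_bytes(words):
    """Sort words by their raw Windows-1251 bytes (hypothetical naive sort)."""
    return sorted(words, key=lambda w: w.encode("cp1251"))


# Sample data invented for the illustration.
words = ["яблоко", "ёлка", "арбуз", "Ёж", "Арбуз"]
byte_sorted = sort_by_cp1251_bytes(words)

print(ya_byte.hex())                                  # ff
print({ch: hex(c) for ch, c in extra_codes.items()})  # all codes below 0xc0
print(byte_sorted)  # "Ё" and "ё" land before "А", not after "Е"
```

A byte-wise sort therefore places words starting with "Ё" or "ё" ahead of everything in the main block, which is why correct alphabetical sorting needs locale-aware collation rather than a comparison of raw codes.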
The lower half of the code table (Latin) matches ASCII. The numbers under the letters denote the hexadecimal Unicode code of each letter.
KOI-8 (Code for Information Interchange, 8 bits; Russian: КОИ-8) is an eight-bit, ASCII-compatible code page designed to encode the letters of Cyrillic alphabets.
There are several variants of the KOI-8 encoding for different Cyrillic alphabets. The Russian alphabet is covered by KOI8-R and the Ukrainian alphabet by KOI8-U. KOI8-R has become the de facto standard for Russian Cyrillic in Unix-like operating systems and in email.
In some Comecon countries, modifications of KOI-8 were created for national variants of the Latin alphabet. The basic idea was the same: even when the eighth bit is "cut off", the text should remain more or less readable. For example, in the Czech version of KOI-8 the characters Čč should turn into cC, Žž into zZ, and so on. These encodings are no longer in use.
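The "cut off the eighth bit" idea can be tried with KOI8-R, which Python ships as the `koi8_r` codec (the Czech variant mentioned above is not in the standard library, so it is not shown). The function name `strip_eighth_bit` is an assumption made for this sketch.

```python
def strip_eighth_bit(text: str, encoding: str = "koi8_r") -> str:
    """Encode text, clear the high bit of every byte, and read it as ASCII."""
    raw = text.encode(encoding)
    stripped = bytes(b & 0x7F for b in raw)  # simulate a 7-bit channel
    return stripped.decode("ascii")


result = strip_eighth_bit("Привет, мир")
print(result)  # pRIWET, MIR
```

The output is a rough Latin transliteration with the letter case inverted, which is the same effect the Czech example describes (Čč becoming cC).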
Unicode is an industry standard designed to allow the texts and symbols of all the world's writing systems to be represented and processed by computers in a consistent way. It is developed in step with the Universal Character Set (UCS) standard and published in book form as The Unicode Standard. Unicode consists of a repertoire of characters, an encoding methodology and a set of standard character encodings, a set of code charts for visual reference, a list of character properties such as upper and lower case, a set of reference data computer files, and rules for normalization, decomposition, collation and rendering.
The standard was proposed in 1991 by the Unicode Consortium, a non-profit organization that brings together the largest IT corporations and coordinates the development of Unicode. The Consortium has the ambitious goal of eventually replacing existing character encodings with Unicode and its standard Unicode Transformation Format (UTF) schemes, because many existing encodings are limited in size and capability and are incompatible with multilingual environments. Unicode's success in unifying character sets has led to its spread and dominant use in the internationalization and localization of computer software. The standard has been adopted in many new technologies, including XML, the Java programming language and modern operating systems.
Unicode goes beyond the one-byte limit of the old character encodings. Instead, it uses 17 planes, each of which defines 65,536 codes, which makes it possible to describe a maximum of 1,114,112 (17 × 2^16) different characters.
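The arithmetic behind the 17 planes is easy to verify in Python. The names `PLANE_COUNT`, `CODES_PER_PLANE` and `plane_of` are chosen for this sketch only; `sys.maxunicode` is the largest code point Python supports (0x10FFFF).

```python
import sys

PLANE_COUNT = 17
CODES_PER_PLANE = 2 ** 16          # 65,536 codes per plane
total_codes = PLANE_COUNT * CODES_PER_PLANE


def plane_of(ch: str) -> int:
    """Return the plane number (0-16) of a single character."""
    return ord(ch) >> 16


print(total_codes)                 # 1114112
print(sys.maxunicode + 1)          # 1114112
print(plane_of("я"))               # 0, the Basic Multilingual Plane
print(plane_of("\U0001F600"))      # 1, a supplementary plane
```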
The Basic Multilingual Plane (BMP) contains almost all of the characters you will ever use. Unicode has several encoding forms, of which two families are the most common: UTF (Unicode Transformation Format) and UCS (Universal Character Set). The number after UTF specifies the number of bits in each code unit, while the number after UCS specifies the number of bytes.
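As a rough illustration of what "bits per unit" means, the sketch below counts how many code units a character occupies in UTF-8, UTF-16 and UTF-32. The helper `code_units` is a name invented for this example; it relies only on Python's standard `utf-8`, `utf-16-le` and `utf-32-le` codecs (the little-endian variants are used so that no byte order mark is added).

```python
ENCODINGS = {8: "utf-8", 16: "utf-16-le", 32: "utf-32-le"}


def code_units(ch: str, bits: int) -> int:
    """Count the code units of the given width needed to encode ch."""
    encoded = ch.encode(ENCODINGS[bits])
    return len(encoded) // (bits // 8)


for ch in ("A", "я", "\U0001F600"):
    print(repr(ch), [code_units(ch, bits) for bits in (8, 16, 32)])
# 'A'  [1, 1, 1]
# 'я'  [2, 1, 1]
# '😀' [4, 2, 1]
```

A Cyrillic letter thus takes two bytes in UTF-8 instead of the single byte it takes in Windows-1251 or KOI8-R, while characters outside the BMP need two 16-bit units in UTF-16.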
