Oct 8, 2012

Oracle GLOBALIZATION SUPPORT

N'...' is the escape syntax for string literals in the national character set.
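For example (a minimal illustration; the literal value is arbitrary):

-- The N prefix marks the literal as national-character-set (NVARCHAR2) data
SELECT N'草' AS nchar_literal FROM dual;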


In 9.2, Oracle has to convert your entire SQL statement to the database character set before executing it. In 10.2, you can set the client environment variable ORA_NCHAR_LITERAL_REPLACE to TRUE, which prevents N'...' escaped literals from being converted through the database character set.
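A minimal sketch of how this is used; note that ORA_NCHAR_LITERAL_REPLACE is set in the client environment (for example before starting SQL*Plus), not inside the database:

-- In the client shell: export ORA_NCHAR_LITERAL_REPLACE=TRUE
-- With it set, the N'...' literal below reaches the server without first
-- being squeezed through the database character set:
SELECT N'草' FROM dual;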

If you are using 10g and you care about Chinese, you should seriously consider using AL32UTF8.

AL32UTF8 contains a large number of additional Chinese characters. The most important are a slew of Hong Kong-specific characters that appear most frequently in personal names; all systems built for the Hong Kong government are required to support these characters.

But if you do this, be aware that Dev 6i does not support AL32UTF8.

Chinese characters are multibyte, and only a multibyte character set such as UTF8 can store them.

When the database is created with the UTF8 character set, Chinese characters can be stored in and retrieved from VARCHAR2 columns. (The national character set for NCHAR is AL16UTF16.)
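A quick sanity check, assuming a Unicode database character set (the table name and data are illustrative):

CREATE TABLE charset_test (name VARCHAR2(30));
INSERT INTO charset_test VALUES ('草');
-- DUMP with format 1016 shows the stored bytes in hex plus the character set name
SELECT name, DUMP(name, 1016) FROM charset_test;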





SELECT * FROM NLS_DATABASE_PARAMETERS WHERE PARAMETER LIKE '%CHARACTERSET';

PARAMETER                      VALUE
------------------------------ ----------------------------------------
NLS_CHARACTERSET               AL32UTF8
NLS_NCHAR_CHARACTERSET         AL16UTF16


UTF8 and AL32UTF8 are capable of storing Japanese characters; WE8ISO8859P1 is not.

Any new system being built for anything more important than a school project should be UTF-8.

When you need the byte size of a (potentially multibyte) string, use LENGTHB rather than LENGTH, as below.
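For instance, 草 is one character but occupies three bytes in AL32UTF8:

SELECT LENGTH('草')  AS char_count,   -- returns 1
       LENGTHB('草') AS byte_count    -- returns 3 under AL32UTF8
FROM dual;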

The 'N' Variant

So, of what use are the NVARCHAR2 and NCHAR types? They are used in systems where the need to manage and store multiple character sets arises. This typically happens in a database where the predominant character set is a single-byte, fixed-width one (such as WE8ISO8859P1) but the need arises to maintain and store some multibyte data. Many systems have legacy data but need to support multibyte data for new applications; others want the efficiency of a single-byte character set for most operations (string operations on a fixed-width string are more efficient than on a string where each character may occupy a different number of bytes) but need the flexibility of multibyte data at some points.

The NVARCHAR2 and NCHAR datatypes support this need. They are generally the same as their VARCHAR2 and CHAR counterparts, with the following exceptions:

* Their text is stored and managed in the database's national character set, not the default character set.

* Their lengths are always provided in characters, whereas a CHAR/VARCHAR2 may specify either bytes or characters.

In Oracle9i and above, the database's national character set may take one of two values: UTF8 or AL16UTF16 (UTF-16 in 9i; AL16UTF16 in 10g). This makes the NCHAR and NVARCHAR types suitable for storing only multibyte data, which is a change from earlier releases of the database (Oracle8i and earlier allowed you to choose any character set for the national character set).
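A small illustration, using hypothetical table and column names:

CREATE TABLE nchar_demo (
  id   NUMBER,
  name NVARCHAR2(40)   -- length is always in characters; stored in the national character set
);
-- The N'...' prefix keeps the literal in the national character set
INSERT INTO nchar_demo VALUES (1, N'草');
SELECT id, name, DUMP(name, 1016) FROM nchar_demo;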

-- DUMP shows the internal datatype code, length, and bytes; ASCII returns the character's code
select dump('xo'), ascii('x'), ascii('o') from dual;

-- UNISTR converts Unicode code-point escapes to national-character-set text; \8349 is 草
select unistr('\8349') from dual;

NLS_LENGTH_SEMANTICS = BYTE

All the tables/PLSQL packages are defined with the default byte semantics, e.g. Customer_Name VARCHAR2(80). The issue is that the new incoming Chinese characters trigger "value too large for column" errors (ORA-12899), even though the values are fewer than 80 characters in the case of Customer_Name.

To resolve this, change the column definition from VARCHAR2(80) to VARCHAR2(80 CHAR), as sketched below.
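A minimal sketch, assuming the column lives in a hypothetical CUSTOMERS table:

ALTER TABLE customers MODIFY (Customer_Name VARCHAR2(80 CHAR));

-- Alternatively, make character semantics the default for new DDL in this session:
ALTER SESSION SET NLS_LENGTH_SEMANTICS = CHAR;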

Some More Details -----------------