Technical Blog

Feb 12, 2013

NOSQL

What is necessary for top-tier Web sites, according to proponents of NoSQL, is massive scalability, low latency, the ability to grow the capacity of your database on demand and an easier programming model. These, and others, are things which, according to them, SQL RDBMSes just don't provide in a cost-effective manner.

NoSQL database systems are developed to manage large volumes of data that do not necessarily follow a fixed schema. Data is partitioned among different machines (for performance reasons and size limitations) so JOIN operations are not usable.Documents are addressed in the database via a unique key that represents that document.No SQl Suuports CAP Property (Consistency, Availability, Partition tolerance)

C - Consistency
----------------
A - Availability
----------------
P - Partition tolerance
--------------------
Partial partition tolerance for these databases is obtained by mirroring database clusters between multiple data centers. The advantage these databases have over a traditional RDBMS is that with the work spread over all of those machines, you can achieve ultra-low latency even when there are extremely high numbers of reads and writes, and with all those machines, you can analyze massive amounts of data quickly.This is meaningless when the database is on a single server

NoSQL implementations can be categorized by their manner of implementation:

1) Document store

Compared to relational databases, for example, collections could be considered as tables as well as documents could be considered as records.But they are different: every record in a table has the same sequence of fields, while documents in a collection could have fields that are completely different.Documents are addressed in the database via a unique key that represents that document. One of the other defining characteristics of a document-oriented database is that, beyond the simple key-document (or key-value) lookup that you can use to retrieve a document, the database will offer an API or query language that will allow you to retrieve documents based on their contents.
Having keys and values are the so-called document databases. A document, in this case, is a collection of various fields of information. Each individual document can have a different number of fields of varying lengths. These databases are useful if you have a lot of semi-structured data, and they are a good fit for object-oriented programming models

Eg :CouchDB, MongoDB

2) Graph

This kind of database is designed for data whose relations are well represented as a graph (elements interconnected with an undetermined number of relations between them). The kind of data could be social relations, public transport links, road maps or network topologies

Eg: Neo4j, InfoGrid and HyperGraphDB

3) Key-value store

Key-value stores allow the application to store its data in a schema-less way. The data could be stored in a datatype of a programming language or an object.Each piece of data that goes into the database is given a key, and when you want the data back, you use the key to get

4) XML databases

It does not use SQL as its query language.
It may not give full ACID guarantees.
It has a distributed, fault-tolerant architecture

5) Wide-column databases

It tend to draw inspiration from Google's BigTable model.

Eg: Cassandra, HBase

Advantages of NoSQL
-------------------------
Elastic scaling

Storage Type: Column Based NoSQL Databases, Document Based NoSQL Databases, Graph Based NoSQL Databases, Key-Value Based NoSQL Databases

License Type: AGPL NoSQL Databases, Apache NoSQL Databases, BSD NoSQL Databases, GPL NoSQL Databases, Open Source NoSQL Databases, Proprietary NoSQL Databases

Implementation Language: C NoSQL Databases, C++ NoSQL Databases, Erlang NoSQL Databases, Java NoSQL Databases, Python NoSQL Databases

Data Storage: BDB NoSQL Databases, Disk NoSQL Databases, GFS NoSQL Databases, Hadoop NoSQL Databases, Plug-in NoSQL Databases, RAM NoSQL Databases, S3 NoSQL Databases

Big Data

Big Data is all about finding a needle of value in a haystack of unstructured information.Big data refers to large datasets that are challenging to store, search, share, visualize, and analyze.

BigTable is a compressed, high performance, and proprietary data storage system built on Google File System .Big Table is Googles Database used for Google Reader,Google Maps,Google Book Search,My Search History,Google Earth, Blogger.com,Google Code hosting, Orkut,YouTube and Gmail.

BigTable maps two arbitrary string values (row key and column key) and timestamp (hence three dimensional mapping) into an associated arbitrary byte array. It is not a relational database

Relational databases (such as Oracle, MySQL, and SQL Server) versus newer non-relational databases (such as MongoDB, CouchDB, BigTable, and others).

Big Data is not just about volume, the approach to analysis contends with data content and structure that cannot be anticipated or predicted.

Feb 6, 2013

Connection Pool

A connection pool is a cache of database connections maintained so that the connections can be reused when future requests to the database are required. Connection pools are used to enhance the performance of executing commands on a database .

connection pooling is generally the practice of a middle tier (application server)

getting N connections to a database (say 20 connections).

These connections are stored in a pool in the middle tier, an "array" .Each connection is set to "not in use"
When a user submits a web page to the application server, it runs a piece of your code,
your code says "i need to get to the database", instead of connecting right there and
then (that takes time), it just goes to this pool and says "give me a connection please".
the connect pool software marks the connection as "in use" and gives it to you.

Jan 17, 2013

Modify Datatype

Column having number(5) and Char(5) datatype for a particular table cant reduce its size even if that column contains data with length 1 .

But can reduce the size of varchar/varchar2 even if the table is not empty .

Column having number(5) ,varchar2(5) and Char(5) datatype for a particular table can increase size.

Oct 8, 2012

Oracle GLOBALIZATION SUPPORT 2

Oracle GLOBALIZATION SUPPORT 1

N is used as escape character for national character set

In 9.2, Oracle has to convert your entire SQL statement to the database character set before executing it. In 10.2, you can set ORA_NCHAR_LITERAL_REPLACE to TRUE which would avoid converting N' escaped literals to the database character set,

If you are using 10g and you care about Chinese you should seriously consider using AL32UTF8.

AL32UTF8 contains a large number of additional Chinese characters. The most important ones are a slew of Hong Kong specific characters that are used most frequently in names. All HK systems for HK government are required to support these characters.

But if you do this, be aware that Dev 6i does not support AL32UTF8.

The chinese is a MULTIBYTE character and only UTF8 can handle this.

When creating the database using the UTF8 character set, the Chinese character can be stored in and extracted from the varchar2 data type column. (NCHAR is AL16UTF16).

SELECT * FROM NLS_DATABASE_PARAMETERS WHERE PARAMETER LIKE '%CHARACTERSET';

PARAMETER VALUE

------------------------------ ----------------------------------------

NLS_CHARACTERSET AL32UTF8

NLS_NCHAR_CHARACTERSET AL16UTF16

UTF8, ALT32UTF8 would be capable of storing Japanese symbols, WE8ISO8859P1 wouldn't

Any new system being built for anything more important than a school project should beUTF-8.

Use lengthb in these cases.

The 'N' Variant

So, of what use are the NVARCHAR2 and NCHAR (for completeness)? They are used in systems where the need to manage and store multiple character sets arises. This typically happens in a database where the predominant character set is a single-byte fixed-width one (such as WE8ISO8859P1), but the need arises to maintain and store some multibyte data. There are many systems that have legacy data but need to support multibyte data for some new applications, or systems that want the efficiency of a single-byte character set for most operations (string operations on a fixed-width string are more efficient than on a string where each character may store a different number of bytes), but need the flexibility of multibyte data at some points.

The NVARCHAR2 and NCHAR datatypes support this need. They are generally the same as their VARCHAR2 and CHAR counterparts, with the following exceptions:

* Their text is stored and managed in the databases national character set, not the default character set.

* Their lengths are always provided in characters, whereas a CHAR/VARCHAR2 may specify either bytes or characters.

In Oracle9i and above, the database¿s national character set may take one of two values: UTF8 or AL16UTF16 (UTF-16 in 9i; AL16UTF16 in 10g). This makes the NCHAR and NVARCHAR types suitable for storing only multibyte data, which is a change from earlier releases of the database (Oracle8i and earlier allowed you to choose any character set for the national character set).

select dump('xo'),ascii('x'),ascii('o') from dual;

select unistr('\8349') from dual;

NLS_LENGTH_SEMANTICS = BYTES

All the tables/plsql package are defined with the default byte semantics for e.g. Customer_Name VARCHAR2(80). Now the issue is because of the new incoming chinese characters it throws error "too large value" errors (though they are less then 80 charactes in case of Customer_Name

To resolve this change the COLUMN definition from VARCHAR2(80) to VARCHAR2(80 CHAR)

Some more Details -----------------

Technical Blog

Pages

Feb 12, 2013

NOSQL

Big Data

Feb 6, 2013

Connection Pool

Jan 17, 2013

Modify Datatype

Oct 8, 2012

Oracle GLOBALIZATION SUPPORT 2

Explore ORACLE4U

Labels

Translate

Wikipedia