Dec 31, 2013

Buffer Aging , LRU Lists,Dirty List & DBWR

All Oracle data is obtained by users from the buffer cache.The basic purpose of the cache is to minimize physical disk I/O by holding (buffering) copies of requested data blocks in memory.

Buffer Aging and LRU Lists

    An Oracle database typically has many more data blocks on disk than memory buffers in the buffer cache. Since not all blocks can be buffered in the cache at once, new block requests (cache misses) must find room in the cache to be read in from disk. When this happens, another block in the cache is usually discarded since the cache is normally full (and fixed in size).

 The buffer cache is carefully designed to favor keeping frequently requested blocks in memory and allow less popular blocks to be replaced by new block requests.
These decisions about which blocks to replace are made using a least recently used (or LRU) algorithm. This algorithm uses a data structure called the LRU list. This list basically orders the buffers in the cache according to when they were last accessed by a user.

 When a block is accessed, it is moved to the MRU (most recently used) end of the list. Blocks in the cache that are not accessed for awhile will find more and more blocks ahead of them in the list, and they will be closer to the LRU end of the list. This is also known as buffer aging in the LRU list.

Buffers are replaced in the cache from the least recently used end of the LRU list. This helps insure that frequently accessed buffers are not discarded, as they are regularly moved to the MRU end of the list with each access.

This mechanism of keeping the most requested blocks in the buffer cache is normally very effective at minimizing disk I/O. Managing a single LRU list can sometimes be a bottleneck in a heavily loaded database

DBWR and the Dirty List

In addition to the LRU list Oracle keeps a list of buffers that contain data that has been modified by users. This list is called the dirty list. Changed data committed by users must eventually be written to disk, as this is the permanent storage of the database.

 The DBWR background process is responsible for moving blocks from the dirty list to their permanent locations in disk files.

Dirty blocks cannot be replaced in the cache until they have been written to disk, otherwise the changes would be lost. An overabundance of dirty buffers can negatively impact cache efficiency by reducing available slots for new blocks.

This can happen when DBWR is unable to keep up with the volume of update activity.

 Multiple DBWR processes can be configured in this case to increase the capacity to write out dirty blocks.

Buffer Cache Problems 

The LRU algorithm of the Oracle buffer cache is normally very good at providing efficient minimization of physical disk I/O. However, there are some situations where normal buffer cache aging may not be the best option for overall performance, for instance:

•Blocks that should not go to the MRU end of the list
•Blocks that should be excluded from aging and stay in the cache .

The first situation can occur when very large tables are accessed randomly by users with very little block use overlap between users. In this case, the MRU end of the list is flooded by blocks that will not result in subsequent cache hits yet age other blocks down to the LRU end of the list.

These other blocks may be replaced when they could have resulted in cache hits had they been kept. The second situation occurs when there are data blocks that will definitely be requested regularly and we want to insure that physical disk I/O is not necessary to obtain their data.
An example might be small lookup tables, or perhaps specific indexes.

The multiple buffer pool feature allows greater control over buffer cache usage to help address these problems and obtain even better cache performance.

PGA & SGA

PGA stands for Process Global Area which is also known as Program Global Area. It is called global area because is keeps information which is required by all modules of Oracle Code. PGA keeps information specific to the sever process upon which Oracle code acts.

PGA also keeps information about oracle shared resources so that it can free those resources .
Each session contains specific information like bind variables and runtime structures in a private SQL area.

Whenever a session executes a statement, a private SQL area is assigned to that session.
Even if multiple users are issuing the same statement using the same shared SQL area, each session will have its own dedicated private SQL area.

A private SQL area contains data such as bind information and runtime memory structures.
Each user that submits the same SQL statement has his or her own private SQL area that uses a single shared SQL area.

Thus, many private SQL areas can be associated with the same shared SQL area. A private SQL area itself is divided into run-time area and persistent area.

Persistent area contains information like bind variable and will be freed once the cursor is closed.
Run-time area is allocated with the first step of the execute request and will be freed when execution is completed.

SGA
Each instance will have its own SGA .The SGA contains data buffer areas, redo log buffers and the shared pool .
Each area is important to the database overall performance.

The Oracle kernel process uses an LRU (Least Recently Used) algorithm to write data back to the disks.
Data is never altered on the disks directly, but is altered in memory first.

The shared SQL pool is used to store the Dictionary Cache as well as information about SQL statements that are being run against the database.

Dec 20, 2013

Row chaining & Migration

ANALYZE TABLE {BAD_TABLE} LIST CHAINED ROWS INTO {MY_CHAINING_TABLE};



PCTFREE , PCTUSED ,INITRANS & MAXTRANS

       The datafiles are the physical storage space on the server. Storage in the datafiles is allocated by blocks to each extent that is used by an object. The block size is operating system-dependent and is determined when the database is created and cannot be changed. The database block size is set in the database parameter file DB_BLOCK_SIZE. Typically, a database block is either 2K or 4K.

•The PCTFREE parameter specifies the percentage of space in each segment's data block reserved for future record expansion into the block.

If records within a block are rarely updated after they are inserted, you should set PCTFREE lower to allow for full space usage. If records within a block are subject to many updates, you should set PCTFREE higher to allow for more growth.

Once the PCTFREE threshold is reached, no additional rows will be inserted until the PCTUSED threshold is reached due to deletion of data. If PCTFREE is set too low, row chaining and row migration will result because updates to the record will not fit into the block.
After that point, no new rows can be inserted into that block until the free space percentage falls below the PCTUSED threshold. The default for PCTFREE is 10%.

The PCTFREE and PCTUSED parameters tell Oracle when to link and unlink a block from the freelist chain. This is only relevant if you are not using the new Automatic Segment Space Management .

•The PCTUSED parameter defines the minimum percentage of data block used space that is necessary before the block is eligible for row insertion. A segment block is added to the free space list once its used space falls below this threshold. The default for PCTUSED is 40%.

If data is static, you will be able to set the PCTUSED lower and more fully use the space. If large amounts of data are inserted and deleted, you should set the PCTUSED higher to prevent block fragmentation.

You might set these parameters as follows: CREATE table emp (......) PCTFREE 5 PCTUSED 80.

The combined sum of PCTFREE and PCTUSED must be less than 100. Correct choice of these parameters can be used to improve the efficiency of table and index segments.

For example, tables that are insert-only (auditing tables, history tables, etc.) should have a PCTFREE setting of 1.

This simple change to a table's storage definition can reduce its total disk space requirements by 10%.
This means less object extension and less physical disk I/O when reading and writing: 10% more data can be read or written with a single data block read; 10% more data can be held in the database's SGA.
This is a very simple way of increasing your block buffer cache without having to buy any more memory.

INITRANS is the initial number of concurrent transactions allocated in each block header when the block is allocated. The default is 1, which is the minimum. The maximum is 255. The size of each transaction entry is operating system-dependent.

MAXTRANS is the maximum number of concurrent transactions for the blocks within the tablespace. The default is 255, which is the maximum. The minimum setting is 1.

Oracle Managed Datafiles (OMFs)

Oracle introduced Oracle Managed Files (OMF) concept in Oracle 9i. OMFs simplify the administration task of an Oracle Database. Oracle manages these files for the administrators.
Prior to Oracle9i, when you dropped a tablespace you would also have to manually remove the physical datafile associated with that tablespace from the operating system.

If a wrong datafile is removed manually from the Operating System, then the option remaining was to do the recovery of the file from the recent backup, which means either a bigger downtime or the tablespace containing the removed datafile was unavailable for the users.
This manual task of removing the datafile is taken over by Oracle. Hence, with Oracle-Managed Datafiles physical file management is left to the database itself.

Oracle-Managed Datafiles (OMFs) give Oracle the ability to manage database files for you.The database internally uses standard file system interfaces to create and delete files as needed.

OMFs can be used when creating database datafiles, tempfiles, online redo logfiles, and database control files.
Before using OMFs, the database must be configured for OMF use. After the database is configured

If the database creates an Oracle Managed Control File, and the database uses SPFILE, then the control_file parameter is automatically added to the SPFILE with the location of the control file.

If PFILE is used instead of SPFILE, then the control_file parameter needs to be added manually else the instance startup will fail.

May 29, 2013

Cursors

         Whenever a SQL statement is executed, Oracle automatically allocates a memory area (known as context area) in Oracle database PGA i.e. Process Global Area. This allocated memory space is the query work area which holds the query related information.This is know as cursors

OPEN stage

•PGA memory allocation for cursor processing (OPEN Cursor)
•Parsing of SELECT statement (Parse SQL)
•Variable binding (Bind SQL)
•SELECT Query execution (Execute Query)
•Move the record pointer to the first record

FETCH stage

The record, to which the record pointer points, is pulled from the result set. The record pointer moves only in the forward direction. The FETCH phase lives until the last record is reached.

CLOSE stage

After the last record of the result set is reached, cursor is closed, and allocated memory is flushed off and released back to SGA. Even if an open cursor is not closed, oracle automatically closes it after the execution of its parent block


Cursor FOR loops

Cursor FOR loops improvise upon the performance and code interactivity by their implicit actions

Parameterized Cursors

Parameterized cursors enables programmer to pass parameter to the cursors

Dynamic Cursor FOR Loops

FOR I IN (SELECT EMPLOYEE_NAME, JOB_ID,SALARY FROM EMPLOYEES WHERE DEPARTMENT_ID=10)

FOR UPDATE OF clause is used to lock a set of rows in a session. This concept can be used in explicit cursors to impose exclusive row level lock on all the rows contained by the cursor query result set. These rows will remain locked until the session issues ROLLBACK or COMMIT.

Oracle provides WHERE CURRENT OF clause to update or delete the rows which are locked by the FOR UPDATE OF cursor in the session

May 7, 2013

Oracle Collection - Varrays


Varrays hold a fixed number of elements, although the number of elements can be changed at runtime. Like nested tables, varrays use sequential numbers as the index or key to the elements. You can define equivalent SQL types, allowing varrays to be stored in database tables. Varrays are a good choice when the number of elements is known in advance, and when the elements are likely to be accessed in sequence.


      A VARRAY is an array of varying size. It has an ordered set of data elements, and all the elements are of the same data type. The number of elements in a VARRAY is the "size" of the VARRAY. You must specify a maximum size (but not a minimum size) when you declare the VARRAY type.

         In general, the VARRAY type should be used when the number of items to be stored is small; it is not suitable for large numbers of items or elements. Note that you cannot index or constrain VARRAY values. Varray is available in PL/SQL as well as in SQL

Arrays must be dense (have consecutive subscripts). So, you cannot delete individual elements from an array


  • Use to preserve ordered list
  • Use when working with a fixed set, with a known number of entries
  • Use when you need to store in the database and operate on the Collection as a whole

  •   Varrays in SQL

    CREATE OR REPLACE TYPE varray_type AS VARRAY(SIZE) OF element_type;

     Varrays in PLSQL

    TYPE varray_type IS VARRAY(SIZE) OF element_type (datatype) ;

    varray_name varray_type; (variable type in PL/SQL)

    varray_name := varray_type();(create it using the constructor)


    CREATE TABLE table_name(field1 [VARCHAR2 NUMBER DATE],field2 varray_type);

    Example :
    CREATE TYPE Project AS OBJECT ( project_no NUMBER(2), title  VARCHAR2(35),cost  NUMBER(7,2));

    CREATE TYPE ProjectList AS VARRAY(50) OF Project;

    CREATE TABLE department ( dept_id  NUMBER(2), name  VARCHAR2(15), budget NUMBER(11,2), projects ProjectList);


    BEGIN
       INSERT INTO department
          VALUES(30, 'Accounting', 1205700,
             ProjectList(Project(1, 'Design New Expense Report', 3250),
                         Project(2, 'Outsource Payroll', 12350),
                         Project(3, 'Evaluate Merger Proposal', 2750),
                         Project(4, 'Audit Accounts Payable', 1425)));
    END;

    DECLARE
       new_projects ProjectList :=
          ProjectList(Project(1, 'Issue New Employee Badges', 13500),
                      Project(2, 'Develop New Patrol Plan', 1250),
                      Project(3, 'Inspect Emergency Exits', 1900),
                      Project(4, 'Upgrade Alarm System', 3350),
                      Project(5, 'Analyze Local Crime Stats', 825));
    BEGIN
       UPDATE department
          SET projects = new_projects WHERE dept_id = 60;
    END;

    ---

    May 3, 2013

    Oracle Collection - Nested tables

    Nested tables


             Nested tables hold an arbitrary number of elements and use sequential numbers as the index or key to the elements. Nested tables can be stored in database tables and manipulated through SQL. They are appropriate for data relationships that must be stored persistently. Nested tables are flexible in that arbitrary elements can be deleted, rather than just removing an element from the end. Note that the order and subscripts (keys) of nested tables are not preserved as the table is stored and retrieved in the database

       Nested tables can be stored in a database column.Nested tables are initially dense, but they can become sparse (Data does not have to be stored in consecutive rows) when elements are deleted. Nested Table is available in PL/SQL as well as in SQL.

    Nested tables are dense, but they can become sparse (have nonconsecutive subscripts). So, you can delete elements from a nested table using the built-in procedure DELETE. That might leave gaps in the index, but the built-in function NEXT lets you iterate over any series of subscripts

    CREATE OR REPLACE TYPE nestedtable_type AS TABLE OF element_type; (PLSQL Type in SQL)

    CREATE TABLE nested_table (id NUMBER, col1 nestedtable_type) NESTED TABLE col1 STORE AS col1_tab;   (Nested Table)

    TYPE nestedtable_type IS TABLE OF element_type; (PLSQL Type in PLSQL)


    Oracle Collection - PLSQL Table

    Collections


        A collection is an ordered group of elements ,of the same type.Collection have only one dimension,we can nodel multi-dimensional arrays by creating collections whose elements are also collections.

    The three type of collections are following

    1) Associative Array
    2) Nested tables
    3) Varrays

    Associative Array
    TYPE array_type IS TABLE OF element_type INDEX BY key_type;
    array_name array_type;
    array_name(index) := value;


    The “Associative Arrays” are also known as “Index-By” tables in PL/SQL.Associative arrays are single-dimensional, unbounded, sparse collections of homogeneous elements. Associative arrays can be sparse, which means not all elements between two elements need to be defined.Similar to hash table in other language
    For instance you can have an element at index -29 and one at index 12 with nothing in between. Homogeneous elements mean every element must be of the same type.
    Associative arrays are useful for small to medium sized lookup tables where the array can be constructed in memory ( only , which is destroyed after the session ends) each time a procedure is called or a package is initialized. There is no fixed limit on their size and their index values are more flexible- associative array keys can be negative and/or nonsequential, and associative arrays can use string values instead of numbers .The Associative Array is only available in PL/SQL


    type my_tab_t is table of number index by pls_integer/ binary_integer;
    type my_tab_t is table of number index by varchar2(4000);
    type my_tab_t is table of tab.value%TYPE index by tab.id%TYPE;
    Example :
    Declare
      cursor c_customer is select cust_name,cust_no,rownum from customers;
      type type_cname is table of customers% rowtype index by binary_integer;
      tab_cname type_cname;
      v_counter number:=0;


    begin

    for r_customer in c_customer loop
       tab_cname(v_counter) := r_customer.cust_name ;
       v_counter:=v_counter+1;
    end loop;


    forall i in tab_cname.first .. tab_cname.last
      dbms_output.put_line(tab_cname(i));
    end;


    PLSQL Table attributes

    DELETE - Delete rows in a table.
    EXISTS - Return TRUE if the specified entry exists in the table.
    COUNT - Returns the number of rows in the table .
    FIRST - Returns the index of the first in the table.
    LAST - Returns the index of last row in the table.
    NEXT - Returns the index of the last row in the table .
    PRIOR - Returns the index of previous row in the table before the specified row.

    Use NOCOPY hint to reduce overhead of passing collections in and out of program units






    -----------

    Feb 12, 2013

    NOSQL


              What is necessary for top-tier Web sites, according to proponents of NoSQL, is massive scalability, low latency, the ability to grow the capacity of your database on demand and an easier programming model. These, and others, are things which, according to them, SQL RDBMSes just don't provide in a cost-effective manner.

    NoSQL database systems are developed to manage large volumes of data that do not necessarily follow a fixed schema. Data is partitioned among different machines (for performance reasons and size limitations) so JOIN operations are not usable.Documents are addressed in the database via a unique key that represents that document.No SQl Suuports CAP Property (Consistency, Availability, Partition tolerance)

    C - Consistency
    ----------------
    A - Availability
    ----------------
    P - Partition tolerance
    --------------------
    Partial partition tolerance for these databases is obtained by mirroring database clusters between multiple data centers. The advantage these databases have over a traditional RDBMS is that with the work spread over all of those machines, you can achieve ultra-low latency even when there are extremely high numbers of reads and writes, and with all those machines, you can analyze massive amounts of data quickly.This is meaningless when the database is on a single server

    NoSQL implementations can be categorized by their manner of implementation:

    1) Document store

    Compared to relational databases, for example, collections could be considered as tables as well as documents could be considered as records.But they are different: every record in a table has the same sequence of fields, while documents in a collection could have fields that are completely different.Documents are addressed in the database via a unique key that represents that document. One of the other defining characteristics of a document-oriented database is that, beyond the simple key-document (or key-value) lookup that you can use to retrieve a document, the database will offer an API or query language that will allow you to retrieve documents based on their contents.
    Having keys and values are the so-called document databases. A document, in this case, is a collection of various fields of information. Each individual document can have a different number of fields of varying lengths. These databases are useful if you have a lot of semi-structured data, and they are a good fit for object-oriented programming models

    Eg :CouchDB, MongoDB

    2) Graph

    This kind of database is designed for data whose relations are well represented as a graph (elements interconnected with an undetermined number of relations between them). The kind of data could be social relations, public transport links, road maps or network topologies

    Eg: Neo4j, InfoGrid and HyperGraphDB

    3) Key-value store

    Key-value stores allow the application to store its data in a schema-less way. The data could be stored in a datatype of a programming language or an object.Each piece of data that goes into the database is given a key, and when you want the data back, you use the key to get

    4) XML databases

    It does not use SQL as its query language.
    It may not give full ACID guarantees.
    It has a distributed, fault-tolerant architecture

    5) Wide-column databases

    It tend to draw inspiration from Google's BigTable model.

    Eg: Cassandra, HBase

    Advantages of NoSQL
    -------------------------
    Elastic scaling

    Storage Type: Column Based NoSQL Databases, Document Based NoSQL Databases, Graph Based NoSQL Databases, Key-Value Based NoSQL Databases

    License Type: AGPL NoSQL Databases, Apache NoSQL Databases, BSD NoSQL Databases, GPL NoSQL Databases, Open Source NoSQL Databases, Proprietary NoSQL Databases

    Implementation Language: C NoSQL Databases, C++ NoSQL Databases, Erlang NoSQL Databases, Java NoSQL Databases, Python NoSQL Databases

    Data Storage: BDB NoSQL Databases, Disk NoSQL Databases, GFS NoSQL Databases, Hadoop NoSQL Databases, Plug-in NoSQL Databases, RAM NoSQL Databases, S3 NoSQL Databases

    Big Data


    Big Data is all about finding a needle of value in a haystack of unstructured information.Big data refers to large datasets that are challenging to store, search, share, visualize, and analyze.

    BigTable is a compressed, high performance, and proprietary data storage system built on Google File System .Big Table is Googles Database used for Google Reader,Google Maps,Google Book Search,My Search History,Google Earth, Blogger.com,Google Code hosting, Orkut,YouTube and Gmail.

    BigTable maps two arbitrary string values (row key and column key) and timestamp (hence three dimensional mapping) into an associated arbitrary byte array. It is not a relational database

    Relational databases (such as Oracle, MySQL, and SQL Server) versus newer non-relational databases (such as MongoDB, CouchDB, BigTable, and others).

    Big Data is not just about volume, the approach to analysis contends with data content and structure that cannot be anticipated or predicted.

    Feb 6, 2013

    Connection Pool


    A connection pool is a cache of database connections maintained so that the connections can be reused when future requests to the database are required. Connection pools are used to enhance the performance of executing commands on a database .


    connection pooling is generally the practice of a middle tier (application server)

    getting N connections to a database (say 20 connections).

    These connections are stored in a pool in the middle tier, an "array" .Each connection is set to "not in use"
    When a user submits a web page to the application server, it runs a piece of your code,
    your code says "i need to get to the database", instead of connecting right there and
    then (that takes time), it just goes to this pool and says "give me a connection please".
    the connect pool software marks the connection as "in use" and gives it to you.

    Jan 17, 2013

    Modify Datatype


    Column having number(5) and Char(5) datatype for a particular table cant reduce its size even if that column contains data with length 1 .

    But can reduce the size of varchar/varchar2 even if the table is not empty .


    Column having number(5) ,varchar2(5) and Char(5) datatype for a particular table can increase size.