February 28, 2012

An Outline of Exadata New Features

A note outlining Exadata's new features:

A. Hybrid Columnar Compression
B. Exadata Smart Flash Cache
C. Offloading (*c1):
     1. Decompression (*c2)
     2. Decryption
     3. RMAN: 1. Incremental backups
              2. Restores
     4. Datafile initialization
     5. Simple joins
     6. Functions
     7. Virtual columns
     8. Smart Scan (*c3): 1. Predicate filtering (=> storage indexes)
                          2. Column projection
                             


(*c1) Offloading has three design goals:
• Reduce the volume of data transferred from disk systems to the database servers.
• Reduce CPU usage on database servers.
• Reduce disk access times at the storage layer.

(*c2)
Compression is always done at the database tier, while decompression can be done at the storage tier. The same applies to encryption/decryption: encryption happens at the database tier, but decryption can be offloaded to the storage cells.

(*c3) Four necessary conditions for Smart Scan (a quick verification query follows the list):
• Must be a full table scan or an index fast full scan
• Must use direct path reads: a parallel query, or a serial query with serial direct reads enabled
• Must have a predicate filter or column projection
• Data must be located on Exadata storage
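
A quick way to verify that a statement actually qualified: in 11.2, V$SQL exposes offload-related columns, and IO_CELL_OFFLOAD_ELIGIBLE_BYTES stays at 0 when no Smart Scan took place. A minimal check (the &sql_id substitution is just a placeholder):

select sql_id,
       io_cell_offload_eligible_bytes,
       io_interconnect_bytes
  from v$sql
 where sql_id = '&sql_id';   -- substitute the SQL_ID under test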


Appendix A: The four HCC compression types

Name          Level         Type   Ratio
QUERY LOW     HCC level 1   LZO
QUERY HIGH    HCC level 2   ZLIB
ARCHIVE LOW   HCC level 3   ZLIB
ARCHIVE HIGH  HCC level 4   BZIP2  14×
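
HCC is chosen per table (or partition) at create/move time. A minimal sketch, with made-up table names, showing the syntax for the levels above:

create table sales_hcc
  compress for query high    -- or: query low / archive low / archive high
  as select * from sales;

Note that only direct-path loads (CTAS, insert /*+ append */, Data Pump) produce HCC-compressed rows; conventionally inserted rows are not stored in HCC format.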

Appendix B: Commands related to the Flash Cache
alter table kso.skew3 storage (cell_flash_cache keep);
CellCLI> create flashcache all;
CellCLI> list flashcache detail
CellCLI> list celldisk attributes name, diskType, size where name like 'FD.*'
CellCLI> LIST METRICCURRENT WHERE objectType = 'FLASHCACHE';
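
If memory serves, 11.2 also exposes the per-object setting on the database side through a CELL_FLASH_CACHE column in DBA_TABLES, so the KEEP attribute set above can be double-checked with a query like:

select owner, table_name, cell_flash_cache
  from dba_tables
 where owner = 'KSO'
   and table_name = 'SKEW3';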

Appendix C: Parameters controlling offload-related features
alter session set cell_offload_processing=true;
alter session set "_serial_direct_read"=true;
alter system set "_kcfis_storageidx_disabled"=false;
alter session set "_bloom_predicate_pushdown_to_storage"=true;
alter session set "_cell_offload_virtual_columns"=true;
alter system set "_disable_cell_optimized_backups"=true;
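
To see whether storage indexes are actually paying off (for example, before and after toggling "_kcfis_storageidx_disabled" above), the cumulative session statistic can be checked:

select n.name, s.value
  from v$statname n, v$mystat s
 where n.statistic# = s.statistic#
   and n.name = 'cell physical IO bytes saved by storage index';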


February 5, 2012

A serious performance issue caused by an 11g new feature

Not long ago, one of our new production databases hit a very serious performance issue after migrating from 10g to 11g.

It appeared as if none of the SQL execution plans had changed; only the execution times had grown tenfold or more.

Going through the AWR report made it clear that the I/O system for the DB was overloaded.

The I/O load reached 400 MB/s, and each read took nearly 400 ms, where the normal value should be around 5-10 ms.
Going through the AWR report, the following top event accounted for more than 85% of DB time:
direct path read.

Serial direct path read is a new feature in 11g. In 11g, when a table is read with a full table scan, Oracle makes a runtime decision and will quite possibly choose direct path read instead of the traditional db file scattered read.

In our environment there is a simple SQL statement with no WHERE filter, so Oracle necessarily chooses a full table scan for it, and it chose direct path read.

This simple SQL is executed frequently; at the busiest times, more than 100 sessions run it at the same time.

Direct path read by itself is not a problem, but 100+ sessions doing direct path reads at the same time is a big problem: direct path reads bypass the buffer cache, so every execution hits disk, driving the I/O load ever higher and the reads ever slower.
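
For reference, two session-level knobs are commonly used to push such a statement back to buffered reads; both are undocumented/unsupported, so treat this as a sketch to test on a non-production system rather than a recommendation:

alter session set "_serial_direct_read"=never;  -- 11.2 syntax; earlier releases use true/false
alter session set events '10949 trace name context forever, level 1';  -- disables adaptive serial direct path reads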

All other unrelated SQL statements that are not doing full table scans also become very slow whenever they need to read blocks from disk, because the I/O system has already nearly collapsed.

Below is part of the AWR report, for reference:


                                                         --------------- Wait Time ---------------   - Summary Avg Wait Time (ms) -
I#  Class       Event                          Waits     %Timeouts    Total(s)  Avg(ms)  %DB time      Avg     Min     Max  Std Dev  Cnt
*   User I/O    direct path read           1,373,931         0.00  481,162.28   350.21     81.58   350.26  348.14  353.65     2.97    3
    Other       reliable message              32,847        21.54    9,074.56   276.27      1.54   369.12  152.35  495.44   188.58    3
    User I/O    db file sequential read       21,558         0.00    8,367.24   388.13      1.42   387.98  379.14  394.27     7.88    3
    System I/O  control file sequential read  26,328         0.00    5,801.76   220.36      0.98   218.09  211.52  228.97     9.49    3
    Commit      log file sync                 13,094         0.00    5,346.94   408.35      0.91   447.49  251.90  658.57   203.78    3
                DB CPU                                                4,250.92               0.72                                      3
    User I/O    direct path write temp        12,742         0.00    3,505.86   275.14      0.59   399.88  124.09  539.61   238.85    3
    System I/O  log file parallel write       13,040         0.00    2,580.21   197.87      0.44   209.98  142.42  304.40    84.26    3
    System I/O  db file parallel write         3,896         0.00      800.70   205.52      0.14   208.68  172.50  256.24    43.01    3
    Other       gcs log flush sync            70,782        86.16      691.43     9.77      0.12     9.81    9.51    9.99     0.26    3


From the above we can see that every I/O-related operation turned into a tragedy. Even unrelated events such as log file sync and control file sequential read were dragged down by the poor I/O caused by direct path read.