Kane's Blog: serious performance issue caused by 11g new feature

February 5, 2012

serious performance issue caused by 11g new feature

Not long time ago, one of our new production database hits very serious performance issue after migration from 10g to 11g.

It appreas like: All SQLs execution plan not changed, only the execution time become ten times or more.

After go through AWR report it looks clear that: The IO system for the DB is overload.

The IO load reach to 400mb/s , and cost nearly 400ms per read. The normal value should around 5ms-10ms.
Go through the AWR report, and find below TOP event cost 85%+ DB time:
direct path read.

Direct path read is an new feature in 11G. In 11G, when a table is doing a full table scan, oracle will make a judge and very possiblely choose direct path read instead of traditional db file scattered read.

In our enviroment, there is an simple SQL which don't have where filter, so definitly oracle has to choose full table scan for it. And oracle choosed direct path read.

This simple SQL is executed frequently, at the busiest time, more then 100+ sesssions run that SQL at same time.

Direct path read is not a problem, but when 100+ session doing direct path read at same time, then it is a big problem.
Because direct path read will skip the memory, hence make the IO load higher and higher and slower and slower.

And all other no-related SQLs which not doing a full table scan, when they need to read block from disk, will be also very slow because the IO system already nearly crashed.

Below is part of AWR report, for refer:

I#	Class	Event	Waits	%Timeouts	Total(s)	Avg(ms)	%DB time	Avg	Min	Max	Std Dev	Cnt
	Wait		Event		Wait Time			Summary Avg Wait Time (ms)
*	User I/O	direct path read	1,373,931	0.00	481,162.28	350.21	81.58	350.26	348.14	353.65	2.97	3
	Other	reliable message	32,847	21.54	9,074.56	276.27	1.54	369.12	152.35	495.44	188.58	3
	User I/O	db file sequential read	21,558	0.00	8,367.24	388.13	1.42	387.98	379.14	394.27	7.88	3
	System I/O	control file sequential read	26,328	0.00	5,801.76	220.36	0.98	218.09	211.52	228.97	9.49	3
	Commit	log file sync	13,094	0.00	5,346.94	408.35	0.91	447.49	251.90	658.57	203.78	3
		DB CPU			4,250.92		0.72					3
	User I/O	direct path write temp	12,742	0.00	3,505.86	275.14	0.59	399.88	124.09	539.61	238.85	3
	System I/O	log file parallel write	13,040	0.00	2,580.21	197.87	0.44	209.98	142.42	304.40	84.26	3
	System I/O	db file parallel write	3,896	0.00	800.70	205.52	0.14	208.68	172.50	256.24	43.01	3
	Other	gcs log flush sync	70,782	86.16	691.43	9.77	0.12	9.81	9.51	9.99	0.26	3

From above we can find that all IO related operation turns into tragedy. Even log file sync, controlfile sequential read these no-related event, was dragged by the poor IO caused by direct path read.

4 Comments:

Anonymous said...: and what you suggest?
some where filters? indexes? disable direct read path? ;)

regards; February 13, 2012 at 3:50 PM
Kane Zhang said...: If the SQL can be added with a where filter to avoid full table scan, that is the best way.

But sometimes the application itself's logic require a Full Table Scan, no where filter optioanal.

Then the only way is to disable direct path read.
Two way to disable direct path read:
1. set event 10949 trace
2. cache the table.

In our enviroment, we choose the second. After cached the table, oracle will be very smart to know that it should read the table from memory instead of read from disk.; February 14, 2012 at 8:53 PM
Anonymous said...: Not a strong argument Mr. Zhang. How many tables can you cache in memory? If you have 100 tables with millions of rows that performance have been affected by your scenario, will you cache all the tables?; January 23, 2013 at 1:54 PM
Kane Zhang said...: Hi,

It depends.
You can cache the table if the issue only occur on only a few small tables.
Or you can disable on global level with 10949 event.

Kevin; February 4, 2013 at 12:13 AM

Kane's Blog

Pages

February 5, 2012

serious performance issue caused by 11g new feature

4 Comments:

Post a Comment