Monday, October 19, 2009

Latest Storage Adventure

Datamart and data warehouse performance issues can be scary, given the massive amounts of data involved; the tuning task is even more daunting when the system is used for operational, day-to-day decision making.

Over the last week or so, I've been involved with a 15 TB datamart that had performance issues on data loading, not querying. Tablespaces for indexes and data share filesystems. Nowadays this is not a concern in itself; however, I still encourage customers to allocate independent storage for each type, so they can tell which one is the source of the I/O.

They run their data loading processes on concurrent schedules, with a parallel degree set on the tables and indexes involved. A very good measure to speed up data loading is the /*+ APPEND */ hint on INSERT ... SELECT, which they also use together with PARALLEL. Parallelism has a multiplying effect, and if you've experienced its downsides you may already suspect where the wall is going to be hit: at the storage level.
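To put a number on that multiplying effect, here is a back-of-the-envelope sketch in Python. The job count and parallel degree are made-up values, not the customer's, and the function name is mine; it only assumes that each concurrent load can drive up to its parallel degree in writer processes.

```python
def peak_write_streams(concurrent_jobs: int, parallel_degree: int) -> int:
    """Rough upper bound on simultaneous writers hitting storage.

    Assumption: every scheduled load runs at its full parallel degree
    and all of its parallel processes write at the same time.
    """
    if concurrent_jobs < 0 or parallel_degree < 1:
        raise ValueError("need concurrent_jobs >= 0 and parallel_degree >= 1")
    return concurrent_jobs * parallel_degree


# Hypothetical schedule: 6 loads starting together, each at degree 8.
streams = peak_write_streams(concurrent_jobs=6, parallel_degree=8)
print(f"up to {streams} concurrent write streams")
```

The real number depends on how many parallel processes the database actually grants, so treat the result as a ceiling rather than a measurement.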

I generated a heat map for the filesystems and, focusing on write statistics, found the expected correlation: very high average write times on some filesystems, starting at the time the data loading processes were scheduled... huge average times, on the order of seconds!
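The numbers behind such a heat map are simple to compute. The following is a minimal sketch, not the tool I used: the sample tuples, the filesystem names and the 100 ms threshold are all hypothetical, and it assumes you already have write counts and total write time per filesystem and per hour.

```python
from collections import defaultdict

# Hypothetical samples: (filesystem, hour of day, write count, total write time in ms).
samples = [
    ("/u01", 1, 1000, 8000),
    ("/u01", 2, 5000, 7500000),  # load window: writes pile up
    ("/u02", 1, 800, 6400),
    ("/u02", 2, 900, 9000),
]


def avg_write_ms(rows):
    """Average write time in ms for each (filesystem, hour) cell."""
    totals = defaultdict(lambda: [0, 0])  # cell -> [writes, write_ms]
    for fs, hour, writes, write_ms in rows:
        totals[(fs, hour)][0] += writes
        totals[(fs, hour)][1] += write_ms
    return {cell: ms / n for cell, (n, ms) in totals.items() if n > 0}


def hot_cells(averages, threshold_ms=100.0):
    """Cells whose average write time exceeds the (assumed) threshold."""
    return sorted(cell for cell, avg in averages.items() if avg > threshold_ms)


averages = avg_write_ms(samples)
for cell in hot_cells(averages):
    print(cell, f"{averages[cell]:.0f} ms per write")
```

Each (filesystem, hour) cell becomes one square of the heat map; the cells that light up at the hour the loads start are the ones worth discussing with the storage team.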

After a quick chat with the sysadmin and storage guys, something very important came up: the striping factor for all those overheated filesystems was 2. In other words: "Storage you have, performance... what's that?" This is a common mistake, and it results from teams working in near isolation. That's when I say: "Guys, you need to talk to each other, learn the jargon and the workings of the "machines" the other teams deal with... and even the inner workings, why not?!"
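Why the striping factor matters can be seen with a toy model of round-robin striping. This is only a sketch: the 1 MiB stripe unit and the 64 MiB write are assumed values, the function names are mine, and real volume managers add caching and other layers on top.

```python
from collections import Counter

MIB = 1024 * 1024


def disk_for_offset(offset: int, stripe_unit: int, width: int) -> int:
    """Index of the disk holding the stripe unit that contains `offset`."""
    return (offset // stripe_unit) % width


def chunks_per_disk(total_bytes: int, stripe_unit: int, width: int) -> dict:
    """How many stripe units of a sequential write land on each disk."""
    counts = Counter()
    for offset in range(0, total_bytes, stripe_unit):
        counts[disk_for_offset(offset, stripe_unit, width)] += 1
    return dict(counts)


# Same 64 MiB write, striped over 2 disks and then over 8 disks.
narrow = chunks_per_disk(64 * MIB, MIB, width=2)
wide = chunks_per_disk(64 * MIB, MIB, width=8)
print("busiest disk, factor 2:", max(narrow.values()), "chunks")
print("busiest disk, factor 8:", max(wide.values()), "chunks")
```

With a factor of 2, every disk absorbs half of the write; with a factor of 8, each one takes an eighth. Under many concurrent parallel loads, that difference is what separates a warm filesystem from an overheated one.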

After identifying a set of objects that contributed heavily to the filesystem overheating, brand new filesystems with a striping factor of 8 were allocated and new tablespaces created to receive these objects. As expected, even the maintenance task of moving all the tables/partitions/subpartitions was a breeze. Timings for the loading processes improved dramatically, even on high-load days... and all this from a subtle "configuration number" in the OS storage options.
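Moving that many segments is easier with generated statements. Here is a small sketch of how they could be produced; the owner, table, partition and tablespace names are hypothetical, and this is not the script used on site. Keep in mind that indexes on moved segments normally need a rebuild afterwards, which this sketch does not cover.

```python
def move_statement(owner, table, tablespace, segment_type="TABLE", segment_name=None):
    """Build an ALTER TABLE ... MOVE statement for one segment."""
    target = f"{owner}.{table}"
    if segment_type == "TABLE":
        return f"ALTER TABLE {target} MOVE TABLESPACE {tablespace};"
    if segment_type in ("PARTITION", "SUBPARTITION"):
        if not segment_name:
            raise ValueError(f"{segment_type} needs a segment_name")
        return (
            f"ALTER TABLE {target} MOVE {segment_type} {segment_name} "
            f"TABLESPACE {tablespace};"
        )
    raise ValueError(f"unsupported segment type: {segment_type}")


# Hypothetical list of hot segments taken from the heat map analysis.
hot_segments = [
    ("DM", "SALES_FACT", "PARTITION", "P_2009_10"),
    ("DM", "ORDERS_FACT", "SUBPARTITION", "P_2009_10_SP1"),
    ("DM", "LOAD_STAGE", "TABLE", None),
]

for owner, table, seg_type, seg_name in hot_segments:
    print(move_statement(owner, table, "DATA_STRIPE8", seg_type, seg_name))
```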
