Thursday, April 24, 2008

Honey, I shrunk the indexes - Part 2


How to pick the perfect candidates


Warning: index compression may change execution plans and affect performance. Try this on a test database first and check whether your application's SQL and PL/SQL code execution improves.

Not every index must be compressed; sometimes compression gives minimal space reductions that don't compensate for the overhead incurred. But how do we know which ones are worth it?

I may suggest two approaches:
1) Size oriented
2) Access frequency oriented

Size Oriented

Let's start with size-oriented picking, on the premise that the bigger the elephant, the better the results when it goes on a diet.

I've used a script like this to get my list of candidates for shrinking:

SELECT substr(segment_name, 1, 20) AS index_name,
       bytes, blocks, extents
  FROM dba_segments
 WHERE owner = '{write here the owner}'
   AND segment_type = 'INDEX'
   AND extents > 63          -- you may change this threshold
 ORDER BY bytes DESC;
After running the above script, you'll get a listing like this:

INDEX_NAME                BYTES     BLOCKS    EXTENTS
-------------------- ---------- ---------- ----------
PROD_NAME_IX             524288         64          8
PRD_DESC_PK              327680         40          5
SYS_C009603              131072         16          2
SYS_C009607               65536          8          1
SYS_C009606               65536          8          1
ACTION_TABLE_MEMBERS      65536          8          1
LINEITEM_TABLE_MEMBE      65536          8          1
SYS_C009602               65536          8          1

Now you have to forecast the compression ratio you'd get for each index, and there is a very accurate feature for doing so: ANALYZE the index.

Even though ANALYZE is deprecated for one of its original purposes, gathering optimizer statistics, we may still use this statement to validate structure. The following command, plus a quick query against INDEX_STATS, will show us whether the selected index is a good fit, which compression prefix length to choose, and the expected size reduction:


SQL> ANALYZE INDEX owner.index_name VALIDATE STRUCTURE OFFLINE;

Index analyzed.

SQL> SELECT name, height, blocks, OPT_CMPR_COUNT, OPT_CMPR_PCTSAVE
FROM index_stats
WHERE name = '{index_name}';

The resulting OPT_CMPR_COUNT is the value you specify in the COMPRESS {n} clause, and OPT_CMPR_PCTSAVE is the "expected" percentage of space saved with that setting. All other values from INDEX_STATS describe the current, uncompressed structure.

Then your statements may look like this:

SQL> ALTER INDEX owner.index_name REBUILD COMPRESS {value from OPT_CMPR_COUNT};

or

SQL> CREATE INDEX owner.index_name ON {table_index_clause}
       TABLESPACE {tablespace_name}
       COMPRESS {value from OPT_CMPR_COUNT}
       {other storage clauses};
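
To tie the two steps together, here is a minimal sketch of how the analyze-then-rebuild decision could be scripted. This is my illustration, not part of the original workflow: the 'SCOTT' owner and the 20% cutoff are assumptions you should adapt, and remember that ANALYZE ... VALIDATE STRUCTURE (offline) locks the underlying table against DML while it runs, so do this in a quiet window.

SET SERVEROUTPUT ON
BEGIN
  FOR r IN (SELECT owner, segment_name
              FROM dba_segments
             WHERE owner = 'SCOTT'          -- illustrative owner
               AND segment_type = 'INDEX'
               AND extents > 63)
  LOOP
    EXECUTE IMMEDIATE 'ANALYZE INDEX ' || r.owner || '.' ||
                      r.segment_name || ' VALIDATE STRUCTURE OFFLINE';
    -- INDEX_STATS holds one row: the index analyzed last in this session
    FOR s IN (SELECT opt_cmpr_count, opt_cmpr_pctsave FROM index_stats)
    LOOP
      IF s.opt_cmpr_pctsave >= 20 THEN     -- illustrative threshold
        dbms_output.put_line('ALTER INDEX ' || r.owner || '.' ||
                             r.segment_name || ' REBUILD COMPRESS ' ||
                             s.opt_cmpr_count || ';');
      END IF;
    END LOOP;
  END LOOP;
END;
/

The block only prints the ALTER INDEX statements; review them before running any, since rebuilds also take locks and consume temp space.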

Access Frequency Oriented

For this we're going to need the help of two important views: V$SEGSTAT (9i and up) and ALL_OBJECTS. We need V$SEGSTAT because that dynamic view shows valuable statistics on logical and physical reads/writes. The following script is proposed as an aid to find the most heavily used indexes within a schema.

SELECT a.object_name, b.statistic_name, b.value
  FROM all_objects a, v$segstat b
 WHERE a.object_id = b.obj#
   AND a.owner = '{your schema owner here}'
   AND a.object_type = 'INDEX'
   AND b.statistic_name = 'physical reads'   -- you may change this to 'physical reads direct'
 ORDER BY b.value DESC;
The above query will give you a list of candidates for compression; now you have to apply the ANALYZE and check whether there are good space reductions that 'may' correspond to less I/O.
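
If you want more context than a single counter, V$SEGSTAT can be pivoted to show several statistics side by side. A sketch under the same assumptions (the 'SCOTT' owner is illustrative; the statistic names are those exposed by V$SEGSTAT):

SELECT a.object_name,
       SUM(CASE b.statistic_name WHEN 'logical reads'    THEN b.value END) AS logical_reads,
       SUM(CASE b.statistic_name WHEN 'physical reads'   THEN b.value END) AS physical_reads,
       SUM(CASE b.statistic_name WHEN 'db block changes' THEN b.value END) AS block_changes
  FROM all_objects a, v$segstat b
 WHERE a.object_id = b.obj#
   AND a.owner = 'SCOTT'              -- illustrative owner
   AND a.object_type = 'INDEX'
 GROUP BY a.object_name
 ORDER BY physical_reads DESC NULLS LAST;

An index that is read heavily but rarely changed is a friendlier compression candidate than one with high block-change activity.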

Jump to Part 3: Honey, I shrunk the indexes - Index compression: good or evil?

View starting post: Honey, I shrunk the indexes - Part 1


Thursday, April 10, 2008

Honey, I shrunk the indexes


Introduction

There is an Oracle feature that may provide savings in space and I/O. Have you heard of "index key compression"? Well, this feature has been with us since 8i, but for many it remains obscure or unknown.

Despite the 10g storage management optimizations, there is always something to gain from index maintenance. If you check your indexes regularly, you're a good DBA... but if you don't, better take care of them from now on.

On top of the gains from an index rebuild or shrink, you may consider compressing some well-picked indexes: those for which the space savings and I/O reductions overcome the slight(?) CPU overhead compression causes. I wrote a question mark after 'slight' because we will try to estimate that cost shortly.

I'll propose these starting questions:
* How do you use index key compression?
* What are the first sight results?
* How to pick the best candidates for compression?
* Is index compression good or evil... or both?
* What is the benefit/cost after the shrinking?
* What are the "inside" results or how to analyze the effect on my present queries?

If you have more questions, please feel free to drop a comment and we (all of you and I, because as far as I know I don't have multiple personality disorder) will try to tackle them and provide a satisfactory answer.

How do you use index key compression?

There are two ways to accomplish this:
1) Drop the index and create it again with COMPRESS (a sketch of this follows the list)
2) Rebuild the index with COMPRESS
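
For completeness, the first method would look like this minimal sketch; the index name is the one used later in this post, while the table and column names are invented for illustration:

DROP INDEX idx_big_comp_test;

CREATE INDEX idx_big_comp_test
    ON big_table (owner_id, status, created_date)   -- illustrative table and columns
  COMPRESS 2;   -- compress the first two key columns of the prefix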

I will try the second method, with this huge index I have on a test database. These are the starting figures:


TABLE_ROWS TABLE_BLOCKS INDEX_BLOCKS   INDEX_BYTES BLEVEL LEAF_BLOCKS
---------- ------------ ------------ ------------- ------ -----------
 7,331,706      459,210      155,648 1,275,068,416      3     149,394
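
In case you wonder how figures like these can be collected, a query along these lines would do it. A sketch, assuming table and index statistics are reasonably fresh; IDX_BIG_COMP_TEST is the index rebuilt below:

SELECT t.num_rows   AS table_rows,
       t.blocks     AS table_blocks,
       s.blocks     AS index_blocks,
       s.bytes      AS index_bytes,
       i.blevel,
       i.leaf_blocks
  FROM dba_indexes  i,
       dba_tables   t,
       dba_segments s
 WHERE i.index_name   = 'IDX_BIG_COMP_TEST'
   AND t.owner        = i.table_owner
   AND t.table_name   = i.table_name
   AND s.owner        = i.owner
   AND s.segment_name = i.index_name;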

Now that we have our baseline, it's time to issue the DDL statement that will reorganize the index:



SQL> ALTER INDEX idx_big_comp_test REBUILD COMPRESS 2;

Index altered.

After that statement, our figures are the following:


TABLE_ROWS TABLE_BLOCKS INDEX_BLOCKS   INDEX_BYTES BLEVEL LEAF_BLOCKS
---------- ------------ ------------ ------------- ------ -----------
 7,331,706      459,210      139,904 1,146,093,568      3     133,682

A quick comparison shows fewer index blocks and leaf blocks (which is logical and obvious), accounting for 10.5% in space savings.

Let our imagination fly: we're showing our boss the way to push back the out-of-disk due date, or justifying a well-earned salary raise derived from storage savings. Back to reality... in this life everything has a price; don't rush off and compress every index in your databases until we talk about pros and cons, and learn how to choose good candidates for compression.

Jump to Part 2: Honey, I shrunk the indexes - How to pick the perfect candidates



Thursday, April 3, 2008

Recover that lost space

Do you know what happens when you issue a DELETE? Have you wondered what happens after COMMIT? ... Well, I might say that there is life after DELETE for all those blocks that belonged to the recently deleted data.

I'll present the following test case to show you how this works:

First we are going to create our test environment, borrowing a table from the OE sample schema and creating a copy of it in the SCOTT schema.


SQL> alter session set current_schema=SCOTT;

SQL> create table pd as select * from oe.PRODUCT_DESCRIPTIONS;

Let's see how many blocks our new table PD is using:

SQL> column segment_name format a20
SQL> select segment_name, sum(blocks)
       from dba_segments
      where owner = 'SCOTT' and segment_name = 'PD'
      group by segment_name
      order by sum(blocks) desc;

SEGMENT_NAME         SUM(BLOCKS)
-------------------- -----------
PD                           384

Now we're going to see how the table is structured:

SQL> select column_name,
            substr(data_type, 1, 20) as dt,
            data_length
       from all_tab_columns
      where table_name = 'PD';

COLUMN_NAME                    DT                   DATA_LENGTH
------------------------------ -------------------- -----------
PRODUCT_ID                     NUMBER                        22
LANGUAGE_ID                    VARCHAR2                       3
TRANSLATED_NAME                NVARCHAR2                    100
TRANSLATED_DESCRIPTION         NVARCHAR2                   4000

That structure yields a theoretical row length of (22+3+100+4000) = 4,125 bytes which, if fully used, would give us just 1 row per block (see the DB block size below). But an NVARCHAR2 is rarely fully used.

SQL> show parameters db_block_size

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_block_size                        integer     8192
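
If you want to check the claim that an NVARCHAR2 is rarely fully used, you can measure real row sizes with VSIZE. A minimal sketch against our PD copy (my illustration, not part of the original test):

SELECT ROUND(AVG( NVL(VSIZE(product_id), 0)
                + NVL(VSIZE(language_id), 0)
                + NVL(VSIZE(translated_name), 0)
                + NVL(VSIZE(translated_description), 0))) AS avg_row_bytes
  FROM scott.pd;

Compare the result against the 4,125-byte theoretical maximum to estimate how many rows actually fit per block.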

We know there is data and how it's structured, but we need something to erase... the following may be politically incorrect, but it's only for instructional purposes:

SQL> select language_id, count(*) from pd group by language_id;

LAN   COUNT(*)
--- ----------
US         288
IW         288
TR         288
...
S          288
SK         288
ZHS        288

30 rows selected.

We're going to delete all the 'S' language rows. I don't know what 'S' means, and I'm sorry if I hurt any nationalistic feelings; remember, this is hypothetical.

SQL> delete from pd where LANGUAGE_ID = 'S';

288 rows deleted.

SQL> commit;

Commit complete.

That was the death sentence for those rows; now we'll see what happened to the blocks assigned to PD.

SQL> column segment_name format a20
SQL> select segment_name, sum(blocks)
       from dba_segments
      where owner = 'SCOTT' and segment_name = 'PD'
      group by segment_name
      order by sum(blocks) desc;

SEGMENT_NAME         SUM(BLOCKS)
-------------------- -----------
PD                           384

Wow! The table still has all its blocks assigned, and will not return them until a TRUNCATE, DROP... or ALTER TABLE is issued. Yes! An ALTER TABLE will return extents to the tablespace free pool.

SQL> alter table scott.pd deallocate unused;

Table altered.

SQL> column segment_name format a20
SQL> select segment_name, sum(blocks)
       from dba_segments
      where owner = 'SCOTT' and segment_name = 'PD'
      group by segment_name
      order by sum(blocks) desc;

SEGMENT_NAME         SUM(BLOCKS)
-------------------- -----------
PD                           376

There it is! Eight blocks no longer belong to table PD. To be precise, DEALLOCATE UNUSED releases unused space above the table's high-water mark; blocks emptied by our DELETE that sit below the high-water mark stay allocated to PD, and reclaiming those would take a segment shrink or a table move.

At first you may be tempted to schedule a periodic ALTER TABLE ... DEALLOCATE UNUSED to recover all that 'wasted' space. Be careful when evaluating this: tables that get data erased and never or rarely receive new data are the perfect candidates.

On the other hand, tables with heavy delete activity won't benefit much, because of the overhead of updating the internal structures that keep track of blocks and extents. What counts as 'heavy' activity depends on the ratio of deleted rows to total rows, and you must develop maintenance criteria to exclude such tables.
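
Before adopting such criteria, you can measure how much deallocatable space a segment actually has: DBMS_SPACE.UNUSED_SPACE reports the space above the high-water mark. A minimal sketch using our SCOTT.PD table, run as a user allowed to analyze the segment:

SET SERVEROUTPUT ON
DECLARE
  l_total_blocks   NUMBER;
  l_total_bytes    NUMBER;
  l_unused_blocks  NUMBER;
  l_unused_bytes   NUMBER;
  l_file_id        NUMBER;
  l_block_id       NUMBER;
  l_last_block     NUMBER;
BEGIN
  dbms_space.unused_space(
    segment_owner             => 'SCOTT',   -- illustrative segment
    segment_name              => 'PD',
    segment_type              => 'TABLE',
    total_blocks              => l_total_blocks,
    total_bytes               => l_total_bytes,
    unused_blocks             => l_unused_blocks,
    unused_bytes              => l_unused_bytes,
    last_used_extent_file_id  => l_file_id,
    last_used_extent_block_id => l_block_id,
    last_used_block           => l_last_block);
  dbms_output.put_line('Total blocks : ' || l_total_blocks);
  dbms_output.put_line('Unused blocks: ' || l_unused_blocks);
END;
/

If unused_blocks comes back near zero, a DEALLOCATE UNUSED won't buy you anything for that table.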

You're welcome to send your comments... I hope this helps you.

Please, before you go, don't forget to vote in the poll regarding the content of this blog. Thank you!