Index Clusters and Hash Clusters
Evenly distributed key values: the key values are spread evenly, which suits a hash cluster.
When a table has a large number of distinct key values, performance problems start to appear because of the many cluster blocks involved. One way to avoid this is to use a hash function to cap the number of cluster blocks: the hash function maps each key into a predetermined range of values, and the values it produces are relatively evenly distributed.
Rarely updated key: the cluster key is seldom modified, which suits both index clusters and hash clusters.
Often joined master-detail tables: tables frequently used in join conditions, with most of the data coming from one side, suit an index cluster. One table can be an ordinary table and the other a cluster table; they are joined, and the bulk of the data comes from the cluster table.
Predictable number of key values: when the number of key values is known in advance, a hash cluster is appropriate. The HASHKEYS keyword specifies the number of key values and SIZE allocates a fixed amount of space per key. If you can determine the space needed to hold all rows, present and future, for a given cluster key value, store the table in a hash cluster. Do not use a hash cluster for a table that grows constantly.
Queries using equality predicate on key: queries that use equality predicates on the key (not only equijoins) suit a hash cluster.
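The bounding effect of HASHKEYS can be sketched outside the database. This is a toy illustration, not Oracle's internal algorithm; the `bucket` function merely stands in for ora_hash:

```python
# Toy sketch: a hash cluster maps every key into one of HASHKEYS buckets,
# so the number of distinct storage locations is capped no matter how many
# distinct key values the table eventually holds.
HASHKEYS = 10_000

def bucket(card_no: str) -> int:
    # stand-in for Oracle's ora_hash(card_no); any reasonably uniform hash works
    return hash(card_no) % HASHKEYS

keys = [f"411111111111{i:05d}" for i in range(100_000)]
buckets = {bucket(k) for k in keys}
print(len(keys), len(buckets))  # 100000 distinct keys land in at most 10000 buckets
```

Because every key is reduced modulo HASHKEYS, the storage footprint stays bounded even as the number of distinct keys grows past the bucket count.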
I. Hash cluster table
1)
create cluster credit_cluster
(
card_no varchar2(16),
transdate date sort
)
hashkeys 10000 hash is ora_hash(card_no)
size 256;
create table credit_orders
(
card_no varchar2(16),
transdate date,
amount number
)
cluster credit_cluster(card_no,transdate);
alter session set nls_date_format = "YYYYMMDDHH24MISS";
insert into credit_orders (card_no,transdate,amount)
values ('4111111111111111','20050131000123',57.99);
insert into credit_orders (card_no,transdate,amount)
values ('4111111111111111','20050130071216',16.59);
insert into credit_orders (card_no,transdate,amount)
values ('4111111111111111','20050131111111',39.00);
insert into credit_orders (card_no,transdate,amount)
values ('4111111111111111','20050130081001',25.16);
2)
SQL> select bytes/1024/1024,blocks from dba_segments where owner='SCOTT' and segment_name='CREDIT_CLUSTER';
BYTES/1024/1024 BLOCKS
--------------- ----------
3 384
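The 3 MB figure is consistent with up-front allocation of HASHKEYS × SIZE bytes. A quick back-of-the-envelope check, assuming the default 8 KB block size (the exact rounding depends on the tablespace's extent allocation):

```python
BLOCK_SIZE = 8 * 1024                 # assuming an 8 KB database block size
hashkeys, size_per_key = 10_000, 256  # from the CREATE CLUSTER statement

requested = hashkeys * size_per_key           # bytes requested up front
min_blocks = -(-requested // BLOCK_SIZE)      # ceiling division

print(requested)     # 2560000 bytes
print(min_blocks)    # at least 313 blocks
# The observed segment (3 MB = 384 blocks) is this request rounded up
# to whole extents by the tablespace's extent allocation policy.
print(384 * BLOCK_SIZE / 2**20)  # 3.0 MB
```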
3)
declare
  counter number(17) := 4111111111111111;
begin
  for i in 1..100 loop
    counter := counter + 1;
    insert into credit_orders (card_no, transdate, amount)
    values (to_char(counter), sysdate, 25.16);
  end loop;
end;
/
4)
SQL> select bytes/1024/1024,blocks from dba_segments where owner='SCOTT' and segment_name='CREDIT_CLUSTER';
BYTES/1024/1024 BLOCKS
--------------- ----------
5 640
5)
SQL> select * from credit_orders where card_no='4111111111111196';
CARD_NO TRANSDATE AMOUNT
---------------- --------- ----------
4111111111111196 23-AUG-09 25.16
Execution Plan
----------------------------------------------------------
Plan hash value: 4016855751
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 32 | 1 (0)| 00:00:01 |
|* 1 | TABLE ACCESS HASH| CREDIT_ORDERS | 1 | 32 | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("CARD_NO"='4111111111111196')
Note
-----
- dynamic sampling used for this statement
Statistics
----------------------------------------------------------
564 recursive calls
0 db block gets
171 consistent gets
274 physical reads
116 redo size
551 bytes sent via SQL*Net to client
380 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
11 sorts (memory)
0 sorts (disk)
1 rows processed
==========================================
II. Index cluster table
1)
create cluster credit_cluster_ind
(
card_no varchar2(16)
)
size 256;
create index ind_credit_orders on cluster credit_cluster_ind;
create table credit_orders_ind
(
card_no varchar2(16),
transdate date,
amount number
)
cluster credit_cluster_ind(card_no);
2)
SQL> select bytes/1024/1024,blocks from dba_segments where owner='SCOTT' and segment_name='CREDIT_CLUSTER_IND';
BYTES/1024/1024 BLOCKS
--------------- ----------
.0625 8
3)
alter session set nls_date_format = "YYYYMMDDHH24MISS";
insert into credit_orders_ind (card_no,transdate,amount)
values ('4111111111111111','20050131000123',57.99);
insert into credit_orders_ind (card_no,transdate,amount)
values ('4111111111111111','20050130071216',16.59);
insert into credit_orders_ind (card_no,transdate,amount)
values ('4111111111111111','20050131111111',39.00);
insert into credit_orders_ind (card_no,transdate,amount)
values ('4111111111111111','20050130081001',25.16);
SQL> select bytes/1024/1024,blocks from dba_segments where owner='SCOTT' and segment_name='CREDIT_CLUSTER_IND';
BYTES/1024/1024 BLOCKS
--------------- ----------
.0625 8
4)
declare
  counter number(17) := 4111111111111111;
begin
  for i in 1..100 loop
    counter := counter + 1;
    insert into credit_orders_ind (card_no, transdate, amount)
    values (to_char(counter), sysdate, 25.16);
  end loop;
end;
/
5)
SQL> select bytes/1024/1024,blocks from dba_segments where owner='SCOTT' and segment_name='CREDIT_CLUSTER_IND';
BYTES/1024/1024 BLOCKS
--------------- ----------
.8125 104
6)
SQL> select * from credit_orders_ind where card_no='4111111111111196';
CARD_NO TRANSDATE AMOUNT
---------------- --------- ----------
4111111111111196 23-AUG-09 25.16
Execution Plan
----------------------------------------------------------
Plan hash value: 2142071904
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 64 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS CLUSTER| CREDIT_ORDERS_IND | 2 | 64 | 2 (0)| 00:00:01 |
|* 2 | INDEX UNIQUE SCAN | IND_CREDIT_ORDERS | 1 | | 1 (0)| 00:00:01 |
------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("CARD_NO"='4111111111111196')
Note
-----
- dynamic sampling used for this statement
Statistics
----------------------------------------------------------
458 recursive calls
0 db block gets
204 consistent gets
102 physical reads
4868 redo size
551 bytes sent via SQL*Net to client
380 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
8 sorts (memory)
0 sorts (disk)
1 rows processed
Summary: as the examples above show, when a hash cluster table is created, Oracle immediately allocates space proportional to the declared range of key values (HASHKEYS × SIZE), so hash clusters suit tables whose number of key values can be predicted in advance.
An index cluster does not allocate space up front; it allocates space gradually as distinct key values arrive.
A cluster table can be understood as storing the data of two related tables together, or storing one table's rows grouped by the values of a column, similar to storing the data pre-sorted. The goal is to reduce I/O.
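As a sanity check on the numbers quoted above, each reported size in MB matches its block count exactly, assuming an 8 KB block size:

```python
BLOCK = 8 * 1024  # assuming an 8 KB database block size

def blocks(mb: float) -> int:
    return int(mb * 2**20 // BLOCK)

# (segment state, MB reported, blocks reported) from the dba_segments queries above
observations = [
    ("hash cluster, after create",      3.0,    384),
    ("hash cluster, after 100 inserts", 5.0,    640),
    ("index cluster, after create",     0.0625, 8),
    ("index cluster, after 100 keys",   0.8125, 104),
]
for name, mb, reported in observations:
    assert blocks(mb) == reported, name
print("all consistent")  # hash cluster pre-allocates; index cluster grows on demand
```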