网站建设知识
MySQL学习笔记第3课(共10课)
2025-07-22 10:00  点击:0

20、分区表

MySQL数据库支持的分区类型为水平分区,并不支持垂直分区,此外MySQL数据库的分区是局部分区索引,一个分区中即存放了数据又存放了索引。

分区主要用于高可用性,利于数据库的管理。在OLTP应用中,对于分区的使用应该小心。

水平分区:指同一表中不同行的记录分配到不同的物理文件中。

查看当前数据库是否开启了分区功能:

mysql> showvariables like '%partition%';

+-------------------+-------+

|Variable_name | Value |

+-------------------+-------+

| have_partitioning| YES |

+-------------------+-------+

mysql> showplugins;

+-----------------------+----------+--------------------+---------+-------------+

| Name | Status | Type | Library | License |

+-----------------------+----------+--------------------+---------+-------------+

| binlog | ACTIVE | STORAGE ENGINE | NULL | PROPRIETARY |

|mysql_native_password | ACTIVE |AUTHENTICATION | NULL | PROPRIETARY |

|mysql_old_password | ACTIVE | AUTHENTICATION | NULL | PROPRIETARY |

| MRG_MYISAM | ACTIVE | STORAGE ENGINE | NULL | PROPRIETARY |

| CSV | ACTIVE | STORAGE ENGINE | NULL | PROPRIETARY |

| MyISAM | ACTIVE | STORAGE ENGINE | NULL | PROPRIETARY |

| MEMORY | ACTIVE | STORAGE ENGINE | NULL | PROPRIETARY |

| InnoDB | ACTIVE | STORAGE ENGINE | NULL | PROPRIETARY |

| INNODB_TRX | ACTIVE | INFORMATION SCHEMA | NULL | PROPRIETARY |

|INNODB_LOCKS | ACTIVE | INFORMATION SCHEMA | NULL | PROPRIETARY |

|INNODB_LOCK_WAITS | ACTIVE | INFORMATION SCHEMA | NULL | PROPRIETARY |

| INNODB_CMP | ACTIVE | INFORMATION SCHEMA | NULL | PROPRIETARY |

|INNODB_CMP_RESET | ACTIVE | INFORMATION SCHEMA | NULL | PROPRIETARY |

|INNODB_CMPMEM | ACTIVE | INFORMATION SCHEMA | NULL | PROPRIETARY |

|INNODB_CMPMEM_RESET | ACTIVE | INFORMATION SCHEMA | NULL | PROPRIETARY |

|PERFORMANCE_SCHEMA | ACTIVE | STORAGE ENGINE | NULL | PROPRIETARY |

| BLACKHOLE | ACTIVE | STORAGE ENGINE | NULL | PROPRIETARY |

| FEDERATED | DISABLED | STORAGE ENGINE | NULL | PROPRIETARY |

| ARCHIVE | ACTIVE | STORAGE ENGINE | NULL | PROPRIETARY |

| partition | ACTIVE | STORAGE ENGINE | NULL | PROPRIETARY |

+-----------------------+----------+--------------------+---------+-------------+

不论创建何种类型的分区,如果表中存放主键或者是唯一索引时,分区列表必须是唯一索引的一个组成部分。

唯一索引可以是允许NULL值的,并且分区列只能是唯一索引的一个组成部分,不需要整个唯一索引列都是分区列。普通索引可以不作为分区列。

(1)RANGE分区

mysql> createtable t (

->id int) engine=innodb

->partition by range (id)(

->partition p0 values less than (10),

->partition p1 values less than (20));

mysql> systemls -lh /home/mysql/data/test

总计 208K

-rw-rw---- 1mysql mysql 8.4K 2012-09-04 10:11 t.frm

-rw-rw---- 1mysql mysql 28 2012-09-04 10:11 t.par

-rw-rw---- 1mysql mysql 96K 2012-09-04 10:11t#P#p0.ibd

-rw-rw---- 1mysql mysql 96K 2012-09-04 10:11t#P#p1.ibd

查看partitions表来查看每个分区的具体信息:

mysql> select* from information_schema.partitions where table_schema=database() andtable_name='t'\G

***************************1. row ***************************

TABLE_CATALOG: def

TABLE_SCHEMA: test

TABLE_NAME: t

PARTITION_NAME: p0

SUBPARTITION_NAME: NULL

PARTITION_ORDINAL_POSITION: 1

SUBPARTITION_ORDINAL_POSITION:NULL

PARTITION_METHOD: RANGE

SUBPARTITION_METHOD: NULL

PARTITION_expression: id

SUBPARTITION_expression: NULL

PARTITION_DEscriptION: 10

TABLE_ROWS: 1

AVG_ROW_LENGTH: 16384

DATA_LENGTH: 16384

MAX_DATA_LENGTH: NULL

INDEX_LENGTH: 0

DATA_FREE: 0

CREATE_TIME: NULL

UPDATE_TIME: NULL

CHECK_TIME: NULL

CHECKSUM: NULL

PARTITION_COMMENT:

NODEGROUP: default

TABLESPACE_NAME: NULL

***************************2. row ***************************

TABLE_CATALOG: def

TABLE_SCHEMA: test

TABLE_NAME: t

PARTITION_NAME: p1

SUBPARTITION_NAME: NULL

PARTITION_ORDINAL_POSITION: 2

SUBPARTITION_ORDINAL_POSITION:NULL

PARTITION_METHOD: RANGE

SUBPARTITION_METHOD: NULL

PARTITION_expression: id

SUBPARTITION_expression: NULL

PARTITION_DEscriptION: 20

TABLE_ROWS: 2

AVG_ROW_LENGTH: 8192

DATA_LENGTH: 16384

MAX_DATA_LENGTH: NULL

INDEX_LENGTH: 0

DATA_FREE: 0

CREATE_TIME: NULL

UPDATE_TIME: NULL

CHECK_TIME: NULL

CHECKSUM: NULL

PARTITION_COMMENT:

NODEGROUP: default

TABLESPACE_NAME: NULL

不在区间定义的值,报错

mysql> insertinto t select 30;

ERROR 1526(HY000): Table has no partition for value 30

可以添加一个maxvalue值的区间,可以理解为正无穷。

mysql> altertable t add partition (

->partition p2 values less than maxvalue);

mysql> insertinto t select 30;

Query OK, 1 rowaffected (0.00 sec)

其实range分区主要用于日期列的分区,举例:

mysql> createtable sales (

->money int unsigned not null,

->date datetime) engine=innodb

->partition by range (YEAR(date)) (

->partition p2008 values less than(2009),

->partition p2009 values less than(2010),

->partition p2010 values less than(2011)

-> );

mysql> insertinto sales select 100,'2008-01-01';

Query OK, 1 rowaffected (0.00 sec)

Records: 1 Duplicates: 0 Warnings: 0

mysql> insertinto sales select 100,'2009-02-01';

Query OK, 1 rowaffected (0.00 sec)

Records: 1 Duplicates: 0 Warnings: 0

mysql> insertinto sales select 200,'2008-01-02';

Query OK, 1 rowaffected (0.00 sec)

Records: 1 Duplicates: 0 Warnings: 0

mysql> insertinto sales select 100,'2009-03-01';

Query OK, 1 rowaffected (0.00 sec)

Records: 1 Duplicates: 0 Warnings: 0

mysql> insertinto sales select 200,'2010-03-01';

Query OK, 1 rowaffected (0.00 sec)

Records: 1 Duplicates: 0 Warnings: 0

这样创建的好处是:便于对sales这张表的管理。如果我们要删除2008年的数据,就不需要执行执行delete from sales wheredate >= ‘2008-01-01’ and date <’2009-01-01’,而是只需要删除2008年的数据所在的分区即可:

mysql> altertable sales drop partition p2008;

Query OK, 0 rowsaffected (0.24 sec)

Records: 0 Duplicates: 0 Warnings: 0

这样创建的另一个好处是,可以加快某些查询的操作。

mysql> explainpartitions select * from sales where data >= '2008-01-01' and

->data<= '2008-12-31'\G

***************************1. row ***************************

id: 1

select_type: SIMPLE

table: sales

partitions: p2009

type: ALL

possible_keys:NULL

key: NULL

key_len: NULL

ref: NULL

rows: 2

Extra: Using where

可以看到SQL优化器只需要去搜索p2008这个分区,而不会去搜索其他所有的分区,因此大大地提高了查询的速度。

考虑下面的一种情况,设计者的原意是想可以按照每年每月来进行分区,

mysql> createtable sales (

->money int unsigned not null,

->date datetime

-> )

->engine=innodb

->partition by range(YEAR(date)*100+MonTH(date)) (

->partition p201001 values less than(201002),

->partition p201002 values less than(201003),

->partition p201003 values less than(201004)

-> );

但是在执行SQL语句时,开发人员发现,优化器不会根据分区进行选择:

mysql> explainpartitions select * from sales where date >= '2010-01-01' and date <='2010-01-31'\G

***************************1. row ***************************

id: 1

select_type: SIMPLE

table: sales

partitions:p201001,p201002,p201003

type: ALL

possible_keys:NULL

key: NULL

key_len: NULL

ref: NULL

rows: 3

Extra: Using where

可以看到优化对三个分区都进行了搜索。产生这个问题的主要原因就是,对于range分区的查询,优化器只能对YEAR()、TO_DAYS()、TO_SEConDS()、UNIX_TIMESTAMP()这类函数进行优化选择,因此可以将上面的函数改为TO_DAYS,如下:

(2)LIST分区

和range分区非常类似,只是分区的值是离散的,而非连续的。

mysql> createtable t (

->a int,

->b int) engine=innodb

->partition by list(b) (

->partition p0 values in (1,3,5,7,9),

->partition p1 values in (0,2,4,6,8)

-> );

不同于range分区中定义的valuesless than语句,LIST分区使用values in,所以每个分区的值是离散的,只能是定义的值。

mysql> selecttable_name,partition_name,table_rows from information_schema.partitions where

-> table_name = 't' and table_schema =database()\G

***************************1. row ***************************

table_name: t

partition_name:p0

table_rows: 2

***************************2. row ***************************

table_name: t

partition_name:p1

table_rows: 2

如果插入的值不在分区范围内,同样会报错:

mysql> insertinto t select 1,10;

ERROR 1526(HY000): Table has no partition for value 10

对于innoDB存储引擎的分区表,下面的插入操作(作为一个事务)不会插入任何数据:

mysql> insertinto t values (1,2),(2,4),(6,10),(5,3);

ERROR 1526(HY000): Table has no partition for value 10

但是对于MyISAM存储引擎的表,会将之前的行数据都插入,但之后的数据不会被插入。

mysql> createtable t (

->a int,

->b int) engine=myisam

->partition by list(b) (

->partition p0 values in (1,3,5,7,9),

->partition p1 values in (0,2,4,6,8));

Query OK, 0 rowsaffected (0.18 sec)

mysql> insertinto t values (1,2),(2,4),(6,10),(5,3);

ERROR 1526(HY000): Table has no partition for value 10

mysql> select* from t;

+------+------+

| a | b |

+------+------+

| 1 | 2 |

| 2 | 4 |

+------+------+

因此在使用分区时,也需要对不同存储引擎支持的事务特性进行考虑。

(3)HASH分区

HASH分区的目的是将数据均匀地分布到预先定义的各个分区中,保证各分区的数据数量都是一样的。

要使用HASH分区来分隔一个表,要在createtable语句上添加一个partitionby hash (expr)子句,其中expr是一个返回一个整数的表达式。

mysql> createtable t_hash (

->a int,

->b datetime)

->engine=innodb

->partition by hash (YEAR(b))

->partitions 4;

mysql> selectmod(year('2010-04-01'),4);

+---------------------------+

|mod(year('2010-04-01'),4) |

+---------------------------+

| 2 |

+---------------------------+

如果将2010-04-01这个记录插入表t_hash中,那么保存该条记录的分区确定如上,放入分区2;

如下验证:

mysql> insertinto t_hash select 1,'2010-04-01';

Query OK, 1 rowaffected (0.06 sec)

Records: 1 Duplicates: 0 Warnings: 0

mysql> selecttable_name,partition_name,table_rows from information_schema.partitions where

-> table_schema = database() andtable_name = 't_hash'\G

***************************1. row ***************************

table_name: t_hash

partition_name:p0

table_rows: 0

***************************2. row ***************************

table_name: t_hash

partition_name:p1

table_rows: 0

***************************3. row ***************************

table_name: t_hash

partition_name: p2

table_rows: 1

***************************4. row ***************************

table_name: t_hash

partition_name:p3

table_rows: 0

4 rows in set(0.00 sec)

MySQL数据库还支持一种称为LINEAR HASH的分区,它使用一种更加复杂的算法来确定新行插入已经分区的表中的位置。

mysql> createtable t_linear_hash (

->a int,

->b datetime)

->engine=innodb

->partition by linear hash (year(b))

->partitions 4;

进行分区的判断算法如下:

1)取大于分区数量4的下一个2的幂值V,V=power(4,ceiling(log(2,num))) = 4;

2)所在的分区N=YEAR(‘2010-04-01’) &(V-1) = 2

虽然还是分区2,但是计算的算法和之前不一样。

LINEAR HASH分区的优点在于:增加,删除,合并和拆分分区将变得更加快捷,这有利于处理含有大量数据的表。缺点是:与使用HASH分区得到的数据分别相比,各个分区间数据的分布可能不大均衡。

mysql> insertinto t_linear_hash select 1,'2010-04-01';

mysql> selecttable_name,partition_name,table_rows from information_schema.partitions wheretable_schema = 'test' and table_name = 't_linear_hash';

+---------------+----------------+------------+

| table_name | partition_name | table_rows |

+---------------+----------------+------------+

| t_linear_hash |p0 | 0 |

| t_linear_hash |p1 | 0 |

| t_linear_hash | p2 | 1 |

| t_linear_hash |p3 | 0 |

+---------------+----------------+------------+

(4)KEY分区

KEY分区使用MySQL数据库提供的函数进行分区,而HASH使用用户定义的函数进行分区。

mysql> createtable t_key (

->a int,

->b datetime ) engine=innodb

->partition by key (b)

->partitions 4;

在KEY分区中使用关键字LINEAR,和在HASH分区中具有同样的作用,分区的编号是通过2的幂算法得到的。

21、COLUMNS分区

我单独作为一节介绍,是因为这个很重要,MySQL数据库版本5.5.0开始支持COLUMNS分区,对于之前的RANGE和LIST分区,我们应该使用RANGE COLUMNS和LIST COLUMNS分区进行很好的代替。

背景:

在前面介绍的四种分区中,分区的条件都必须是整数,如果不是整数则需要转化为整数。

发展:

COLUMNS分区可以直接使用非整数的数据进行分区,分区根据类型直接比较而得,不需要转化为整数。其次,RANGE COLUMNS分区可以对多个列的值进行分区。

COLUMNS分区支持以下的数据类型:

1)所有的整数类型,如int,smallint,tinyint,bigint。float和decimal则不予支持。

2)字符串类型,如char,varchar,binary和varbinary。blob和text类型不予支持。

3)日期类型,如date和datetime。其余的日期类型不予支持。

特别,对于日期类型的分区,我们不再需要year()和to_days()函数了,而直接可以使用COLUMNS,如:

mysql> createtable t_columns_range(

->a int,

->b datetime) engine=innodb

->partition by range columns (b) (

->partition p0 values less than('2009-01-01'),

->partition p1 values less than('2010-01-01')

-> );

同样可以使用字符串的分区:

mysql> createtable customers (

->first_name varchar(25),

->last_name varchar(25),

->street_1 varchar(30),

->street_2 varchar(30),

->city varchar(15),

->renewal date

-> )

->partition by list columns(city) (

->partition region1 values in('oskarshamn','hosgby','monsrtea'),

->partition region2 values in('vimmerby','hults','vsafsri'),

->partition region3 values in('nossjo','ekfs','vetlanda'),

->partition region4 values in('uppviding','afsf','vafaf')

-> );

mysql> insertinto customers select 'shouzhuang','jiang','china','jiangsu','nossjo','2012-09-04';

mysql> selecttable_name,partition_name,table_rows from information_schema.partitions wheretable_schema = 'test' and table_name = 'customers';

+------------+----------------+------------+

| table_name |partition_name | table_rows |

+------------+----------------+------------+

| customers | region1 | 0 |

| customers | region2 | 0 |

| customers | region3 | 1 |

| customers | region4 | 0 |

+------------+----------------+------------+

对于RANGE COLUMNS分区,可以使用多个列进行分区,如:

mysql> createtable rcx (

->a int,

->b int,

->c char(3),

->d int

-> )

->partition by range columns (a,b,c) (

->partition p0 values less than(5,10,'ggg'),

->partition p1 values less than(10,20,'mmmm'),

->partition p2 values less than(15,30,'sss'),

->partition p3 values less than(maxvalue,maxvalue,maxvalue)

-> );

mysql> insertinto rcx select 1,2,'jia',12;

mysql> insertinto rcx select 12,24,'zzz',18;

mysql> select table_name,partition_name,table_rowsfrom information_schema.partitions where table_schema = 'test' and table_name ='rcx';

+------------+----------------+------------+

| table_name |partition_name | table_rows |

+------------+----------------+------------+

| rcx | p0 | 1 |

| rcx | p1 | 0 |

| rcx | p2 | 1 |

| rcx | p3 | 0 |

+------------+----------------+------------+

22、子分区

MySQL允许在range和list的分区上再进行hash或者是key的子分区,如:

mysql> createtable ts ( a int, b date )

->engine=innodb

->partition by range (year(b))

->subpartitionby hash(to_days(b))

->subpartitions2 (

-> partition p0 values less than (1990),

->partition p1 values less than (2000),

->partition p2 values less than maxvalue);

Query OK, 0 rowsaffected (0.36 sec)

查看物理文件:

mysql> systemls -lh /home/mysql/data/test

总计 592K

-rw-rw---- 1mysql mysql 8.4K 2012-09-04 15:07 ts.frm

-rw-rw---- 1mysql mysql 96 2012-09-04 15:07 ts.par

-rw-rw---- 1 mysql mysql 96K2012-09-04 15:07 ts#P#p0#SP#p0sp0.ibd

-rw-rw---- 1 mysql mysql 96K2012-09-04 15:07 ts#P#p0#SP#p0sp1.ibd

-rw-rw---- 1 mysqlmysql 96K 2012-09-04 15:07ts#P#p1#SP#p1sp0.ibd

-rw-rw---- 1 mysqlmysql 96K 2012-09-04 15:07ts#P#p1#SP#p1sp1.ibd

-rw-rw---- 1 mysql mysql 96K 2012-09-04 15:07 ts#P#p2#SP#p2sp0.ibd

-rw-rw---- 1 mysql mysql 96K 2012-09-04 15:07 ts#P#p2#SP#p2sp1.ibd

也可以用SUBPARTITION语法来显示指出各个子分区的名称,

mysql> create table ts (a int,b date) engine=innodb

->partition by range(year(b))

->subpartition byhash(to_days(b)) (

->partition p0 values less than (1990) (

->subpartition s0,

->subpartition s1

-> ),

->partition p1 values less than (2000) (

->subpartition s2,

->subpartitions3

-> ),

->partition p2 values less than maxvalue (

->subpartition s4,

->subpartition s5

-> )

-> );

需要注意的几点:

1)每个子分区的数量必须相同。

2)如果在一个分区表上的任何分区上使用subpartition来明确定义任何子分区,那么就必须定义所有的子分区。

3)每个subpartition子句必须包括子分区的一个名称

4)在每个子分区内,子分区的名称必须是唯一的。

子分区可以用于特别大的表,在多个磁盘间分别分配数据和索引。

假如有6个磁盘,分别为disk0,disk1,disk2等,如下:

create table ts (a int, b date) engine=myisam

partition by range (year(b))

subpartition by hash (to_days(b)) (

partition p0 values less than (2000) (

subpartition s0

data directory = '/disk0/data'

index directory = '/disk0/idx',

subpartition s1

data directory = '/disk1/data'

index directory = '/disk1/idx'

),

partition p1 values less than (2010) (

subpartition s2

data directory = '/disk2/data'

index directory = '/disk2/idx',

subpartition s3

data directory = '/disk3/data'

index directory = '/disk3/idx'

),

partition p1 values less than maxvalue (

subpartition s4

data directory = '/disk4/data'

index directory = '/disk4/idx',

subpartition s5

data directory = '/disk5/data'

index directory = '/disk5/idx'

)

);

mysql> show warnings;

+---------+------+----------------------------------+

| Level |Code | Message |

+---------+------+----------------------------------+

| Warning | 1618 | optionignored |

| Warning | 1618 | optionignored |

| Warning | 1618 | optionignored |

| Warning | 1618 | optionignored |

| Warning | 1618 | optionignored |

| Warning | 1618 | optionignored |

| Warning | 1618 | optionignored |

| Warning | 1618 | optionignored |

| Warning | 1618 | optionignored |

| Warning | 1618 | optionignored |

| Warning | 1618 | optionignored |

| Warning | 1618 | optionignored |

+---------+------+----------------------------------+

但是InnoDB存储引擎会忽略DATA DIRECTORY和INDEX DIRECTORY语法,因此上述分区表的数据和索引文件分开设置对其是无效的。

23、分区中的NULL值

MySQL数据库中允许对NULL值做分区,MySQL数据库总是把NULL值视为小于任何一个非NULL值。

对于range分区,如果对于分区列插入了NULL值,则MySQL数据库会将该值放入最左边的分区。

mysql> create table t_range (

->aint,

->bint) engine=innodb

->partition by range(b) (

->partition p0 values less than (10),

->partition p1 values less than (20),

->partition p2 values less than maxvalue);

Query OK, 0 rows affected (0.19 sec)

mysql> insert into t_range select 1,1;

Query OK, 1 row affected (0.00 sec)

Records: 1 Duplicates: 0 Warnings: 0

mysql> insert into t_range select 1,NULL;

Query OK, 1 row affected (0.00 sec)

Records: 1 Duplicates: 0 Warnings: 0

mysql> select * from t_range;

+------+------+

| a |b |

+------+------+

| 1 | 1 |

| 1 | NULL|

+------+------+

2 rows in set (0.00 sec)

mysql> select table_name,partition_name,table_rowsfrom information_schema.partitions where table_name = 't_range' andtable_schema = database()\G

*************************** 1. row***************************

table_name: t_range

partition_name: p0

table_rows: 2

*************************** 2. row***************************

table_name: t_range

partition_name: p1

table_rows: 0

*************************** 3. row***************************

table_name: t_range

partition_name: p2

table_rows: 0

3 rows in set (0.01 sec)

可以看到数据都放入了p0分区。NULL值会放入最左边的分区中。

另外需要特别注意的是:如果删除了p0这个分区,你删除的是小于10的记录,并且还有NULL值的记录,这点非常重要。

mysql> alter table t_range drop partition p0;

Query OK, 0 rows affected (0.11 sec)

Records: 0 Duplicates: 0 Warnings: 0

mysql> select * from t_range;

Empty set (0.00 sec)

LIST分区下要使用NULL值,则必须显示地指出哪个分区中放入NULL值,否则会报错,如:

mysql> create table t_list (

->aint,

->bint) engine=innodb

->partition by list(b) (

->partition p0 values in (1,3,5,7,9),

->partitionp1 values in (0,2,4,6,8)

-> );

Query OK, 0 rows affected (0.13 sec)

mysql> insert into t_list select 1,NULL;

ERROR 1526 (HY000): Table has no partition for value NULL

若p0分区允许NULL值,则插入不会报错:

mysql> create table t_list (

->aint,

->b int)engine=innodb

->partition by list(b) (

->partition p0 values in (1,3,5,7,9,NULL),

->partition p1 values in (0,2,4,6,8)

-> );

mysql> insert into t_list select 1,NULL;

Query OK, 1 row affected (0.00 sec)

Records: 1 Duplicates: 0 Warnings: 0

mysql> selecttable_name,partition_name,table_rows from information_schema.partitions wheretable_name = 't_list' and table_schema = database()\G

*************************** 1. row***************************

table_name: t_list

partition_name: p0

table_rows: 1

*************************** 2. row***************************

table_name: t_list

partition_name: p1

table_rows: 0

2 rows in set (0.00 sec)

HASH和KEY分区对于NULL值的处理方式,和RANGE分区,LIST分区不一样。任何分区函数都会将含有NULL值的记录返回为0。

mysql> create table t_hash (

->aint,

->bint) engine=innodb

->partition by hash(b)

->partitions 4;

mysql> insert into t_hash select 1,0;

Query OK, 1 row affected (0.00 sec)

Records: 1 Duplicates: 0 Warnings: 0

mysql> insert into t_hash select 1,NULL;

Query OK, 1 row affected (0.00 sec)

Records: 1 Duplicates: 0 Warnings: 0

mysql> selecttable_name,partition_name,table_rows from information_schema.partitions wheretable_name = 't_hash' and table_schema = database()\G

*************************** 1. row ***************************

table_name: t_hash

partition_name: p0

table_rows: 2

*************************** 2. row***************************

table_name: t_hash

partition_name: p1

table_rows: 0

*************************** 3. row***************************

table_name: t_hash

partition_name: p2

table_rows: 0

*************************** 4. row***************************

table_name: t_hash

partition_name: p3

table_rows: 0

4 rows in set (0.00 sec)

24、分区和性能

数据库的应用分为两类:

一类是OLTP,在线事务处理,如博客,电子商务,网络游戏等

另一类是OLAP,在线分析处理,如数据仓库,数据集市。

对于OLAP的应用,分区的确可以很好地提高查询的性能,因为OLAP应用的大多数查询需要频繁地扫描一张很大的表。

对于OLTP的应用,分区应该非常小心。在这种应用下,不可能会获取一张大表中10%的数据,大部分都是通过索引返回几条记录即可,而根据B+树索引的原理可知,对于一张大表,一般的B+树需要2~3次的磁盘IO,很少会有4层的B+树索引,因此B+树可以很好地完成操作,不需要分区的帮助,并且设计不好的分区会带来严重的性能问题。

实验:

mysql> create table profile (

->idint(11) not null auto_increment,

->nickname varchar(20) not null default '',

->password varchar(32) not null default '',

->sexchar(1) not null default '',

->rdate date not null default '0000-00-00',

->primary key (id),

->keynickname (nickname)

-> )engine=innodb

->partition by hash(id)

->partitions 10;

mysql> \d //

mysql> create procedure jiang(count intunsigned)

->begin

->declare i int unsigned default 1;

->while i < count do

->insert into profile(nickname,password,sex,rdate)values('jiang','jiang0228','M','2012-09-04');

-> SETi = i + 1;

->endwhile;

->end;

-> //

mysql>\d ;

创建1000万条数据:

mysql> call jiang(10000000);

如果进行主键的查询,可以发现分区的确是有意义的:

mysql> explain partitions select * from profilewhere id = 1\G

*************************** 1. row***************************

id: 1

select_type: SIMPLE

table: profile

partitions: p1

type: const

possible_keys: PRIMARY

key: PRIMARY

key_len: 4

ref: const

rows: 1

Extra:

1 row in set (0.01 sec)

可以发现只寻找了p1分区,但是对于表profile中nickname列索引的查询,explain partitions则会得到如下的结果:

mysql> explain partitions select * from profilewhere nickname = 'david'\G

*************************** 1. row***************************

id: 1

select_type: SIMPLE

table: profile

partitions: p0,p1,p2,p3,p4,p5,p6,p7,p8,p9

type: ref

possible_keys: nickname

key: nickname

key_len: 62

ref: const

rows: 10

Extra: Using where

mysql> update profile set nickname = 'david' where id = 5566;

mysql> select * from profile where nickname = 'david';

上述查询实际的IO执行了20-30次。

可以看到,MySQL数据库会搜索所有的分区,因此查询速度会慢很多。

最后提示:分区并不总是适合于OLTP应用,应该根据自己的应用好好规划自己的分区设计。