博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
mysql笔记 - SELECT 语句
阅读量:6849 次
发布时间:2019-06-26

本文共 19022 字,大约阅读时间需要 63 分钟。

mysql笔记 - SELECT 语句

SELECT     [ALL | DISTINCT | DISTINCTROW ] *       [HIGH_PRIORITY]       [STRAIGHT_JOIN]       [SQL_SMALL_RESULT] [SQL_BIG_RESULT] [SQL_BUFFER_RESULT]       [SQL_CACHE | SQL_NO_CACHE] [SQL_CALC_FOUND_ROWS]     select_expr [, select_expr ...]     [FROM table_references     [WHERE where_condition]     [GROUP BY {col_name | expr | position}       [ASC | DESC], ... [WITH ROLLUP]]     [HAVING where_condition]     [ORDER BY {col_name | expr | position}       [ASC | DESC], ...]     [LIMIT {[offset,] row_count | row_count OFFSET offset}]     [PROCEDURE procedure_name(argument_list)]     [INTO OUTFILE 'file_name'         [CHARACTER SET charset_name]         export_options       | INTO DUMPFILE 'file_name'       | INTO var_name [, var_name]]     [FOR UPDATE | LOCK IN SHARE MODE]] select * from employees; -- 取出所有数据 desc employees; -- 查看表信息

通常会要求开发人员不要使用select * 即使是select所有字段推荐写上所有的列名,虽然很麻烦但是规范上推荐这样做

一个重要的原因是alter table会对表结构进行修改的很多操作,alter table后select * from取出的字段可能比一开始涉及的要多,可能会遇到各种各样的问题。另外select alter table可以加列,可以加在某个字段中间的位置,那么程序那边取得的数据并不是一开始想要的数据,容易出错。所以建议select的时候建议把所有的列都带上

LIMIT 限制取出来的数据量

不加LIMIT限制会一次取出所有的数据,数据量大的时候可能是灾难性的事情,通常都会加一个LIMIT,图形化工具一般默认会加上一个LIMIT

select * from employees limit 0,3; -- 从第0条开始取出3条数据     select * from employees limit 1,3; -- 从第1条开始取出3条数据

LIMIT常用于分页,但是LIMIT offset数字越大性能越差,为什么呢?

select * from employees limit 30; -- 取30条,扫30条,查询快(0.00 sec) select * from employees limit 1000000,30; -- 同样取30条,扫1000030行记录,查询变很慢(14.55 sec)

好的做法

select emp_no,birth_date,first_name,last_name,gender from employees limit 10; +--------+------------+------------+-----------+--------+ | emp_no | birth_date | first_name | last_name | gender | +--------+------------+------------+-----------+--------+ |  10001 | 1953-09-02 | Georgi     | Facello   | M      | |  10002 | 1964-06-02 | Bezalel    | Simmel    | F      | |  10003 | 1959-12-03 | Parto      | Bamford   | M      | |  10004 | 1954-05-01 | Chirstian  | Koblick   | M      | |  10005 | 1955-01-21 | Kyoichi    | Maliniak  | M      | |  10006 | 1953-04-20 | Anneke     | Preusig   | F      | |  10007 | 1957-05-23 | Tzvetan    | Zielinski | F      | |  10008 | 1958-02-19 | Saniya     | Kalloufi  | M      | |  10009 | 1952-04-19 | Sumant     | Peac      | F      | |  10010 | 1963-06-01 | Duangkaew  | Piveteau  | F      | +--------+------------+------------+-----------+--------+  select emp_no,birth_date,first_name,last_name,gender from employees where emp_no > 10010 limit 10; +--------+------------+------------+-------------+--------+ | emp_no | birth_date | first_name | last_name   | gender | +--------+------------+------------+-------------+--------+ |  10011 | 1953-11-07 | Mary       | Sluis       | F      | |  10012 | 1960-10-04 | Patricio   | Bridgland   | M      | |  10013 | 1963-06-07 | Eberhardt  | Terkki      | M      | |  10014 | 1956-02-12 | Berni      | Genin       | M      | |  10015 | 1959-08-19 | Guoxiang   | Nooteboom   | M      | |  10016 | 1961-05-02 | Kazuhito   | Cappelletti | M      | |  10017 | 1958-07-06 | Cristinel  | Bouloucos   | F      | |  10018 | 1954-06-19 | Kazuhide   | Peha        | F      | |  10019 | 1953-01-23 | Lillian    | Haddadi     | M      | |  10020 | 1952-12-24 | Mayuko     | Warwick     | M      | +--------+------------+------------+-------------+--------+

从之前取得数据的最大值开始取数据,这样的一个好处是不管你的值多大,因为是通过大于等于来定位的,所以只是取了10条数据不会再去扫这么多的数据

下面的语句表示从mysql中随机取出3条数据,select语句不带order by表示随机取3条数据

select emp_no,first_name,last_name from employees limit 3; +--------+------------+-----------+ | emp_no | first_name | last_name | +--------+------------+-----------+ |  10001 | Georgi     | Facello   | |  10002 | Bezalel    | Simmel    | |  10003 | Parto      | Bamford   | +--------+------------+-----------+

数据量小的时候不容易看出来,数据量大的时候容易发现这个问题,或者alter table一下后再试

alter table employees add index idx_name(first_name,last_name); select emp_no,first_name,last_name from employees limit 3; +--------+------------+-----------+ | emp_no | first_name | last_name | +--------+------------+-----------+ |  69256 | Aamer      | Anger     | | 486584 | Aamer      | Armand    | | 237165 | Aamer      | Azevdeo   | +--------+------------+-----------+

order by 表示根据哪个字段进行排序,可以按照物理数据顺序取出数据

select emp_no,first_name,last_name from employees order by emp_no limit 3; +--------+------------+-----------+ | emp_no | first_name | last_name | +--------+------------+-----------+ |  10001 | Georgi     | Facello   | |  10002 | Bezalel    | Simmel    | |  10003 | Parto      | Bamford   | +--------+------------+-----------+

查询分区表

select * from titles partition (p19) limit 1; +--------+--------------+------------+------------+ | emp_no | title        | from_date  | to_date    | +--------+--------------+------------+------------+ |  10052 | Senior Staff | 2002-01-31 | 9999-01-01 | +--------+--------------+------------+------------+

ORDER BY 对取出来的数据做排序

order by根据某个字段进行分页

select emp_no,first_name,last_name from employees order by last_name limit 10; +--------+------------+-----------+ | emp_no | first_name | last_name | +--------+------------+-----------+ |  11761 | Bartek     | Aamodt    | |  15427 | Aluzio     | Aamodt    | |  18182 | Dekang     | Aamodt    | |  16572 | Matt       | Aamodt    | |  12791 | Mokhtar    | Aamodt    | |  12516 | Sreenivas  | Aamodt    | |  12982 | Sachem     | Aamodt    | |  17400 | Basim      | Aamodt    | |  19898 | Vidar      | Aamodt    | |  17885 | Takanari   | Aamodt    | +--------+------------+-----------+

WHERE 过滤数据

select emp_no,first_name,last_name from employees where emp_no = 10001; +--------+------------+-----------+ | emp_no | first_name | last_name | +--------+------------+-----------+ |  10001 | Georgi     | Facello   | +--------+------------+-----------+  select emp_no,first_name,last_name from employees where emp_no >= 10001 limit 10; +--------+------------+-----------+ | emp_no | first_name | last_name | +--------+------------+-----------+ |  10001 | Georgi     | Facello   | |  10002 | Bezalel    | Simmel    | |  10003 | Parto      | Bamford   | |  10004 | Chirstian  | Koblick   | |  10005 | Kyoichi    | Maliniak  | |  10006 | Anneke     | Preusig   | |  10007 | Tzvetan    | Zielinski | |  10008 | Saniya     | Kalloufi  | |  10009 | Sumant     | Peac      | |  10010 | Duangkaew  | Piveteau  | +--------+------------+-----------+  select emp_no,first_name,last_name from employees where emp_no >= 20001 and first_name like 'Am%' limit 10; +--------+------------+---------------+ | emp_no | first_name | last_name     | +--------+------------+---------------+ | 100860 | Amabile    | Aamodt        | | 285669 | Amabile    | Akiyama       | | 276002 | Amabile    | Albarhamtoshy | | 454340 | Amabile    | Alencar       | |  86625 | Amabile    | Anger         | | 416143 | Amabile    | Antonisse     | | 491486 | Amabile    | Antonisse     | | 451988 | Amabile    | Apsitis       | | 409363 | Amabile    | Atchley       | | 208844 | Amabile    | Baar          | +--------+------------+---------------+  select emp_no,first_name,last_name from employees where emp_no >= 20001 and first_name like '%Am%' limit 10; +--------+------------+--------------+ | emp_no | first_name | last_name    | +--------+------------+--------------+ |  20044 | Kiam       | Gist         | |  20062 | Uinam      | Heuser       | |  20114 | Ramalingam | Zyda         | |  20118 | Mohammed   | Schneeberger | |  20142 | Arumugam   | Emmart       | |  20159 | Isamu      | Valiente     | |  20167 | Stamatina  | Kobara       | |  20217 | Tamiya     | Ruemmler     | |  20265 | Amalendu   | Willoner     | |  20285 | Pramod     | Escriba      | +--------+------------+--------------+  select emp_no,first_name,last_name from employees where emp_no >= 20001 or first_name like '%Am%' limit 10; +--------+------------+-----------+ | emp_no | first_name | last_name | +--------+------------+-----------+ |  69256 | Aamer      | Anger     | | 486584 | Aamer      | Armand    | | 237165 | Aamer      | Azevdeo   | | 413688 | Aamer      | Azuma     | | 281363 | Aamer      | Baak      | | 242368 | Aamer      | Baaleh    | | 206549 | Aamer      | Baar      | | 259089 | Aamer      | Baba      | |  60922 | Aamer      | Bahl      | | 283280 | Aamer      | Bahl      | +--------+------------+-----------+

where 1 = 1表示没有任何条件全部成立,因为不知道第一个where应该怎么写,如果提前把where 1 = 1写好,这样程序那边拼接后面的条件就比较方便

select emp_no,first_name,last_name from employees where 1 = 1 limit 10; +--------+------------+-----------+ | emp_no | first_name | last_name | +--------+------------+-----------+ |  69256 | Aamer      | Anger     | | 486584 | Aamer      | Armand    | | 237165 | Aamer      | Azevdeo   | | 413688 | Aamer      | Azuma     | | 281363 | Aamer      | Baak      | | 242368 | Aamer      | Baaleh    | | 206549 | Aamer      | Baar      | | 259089 | Aamer      | Baba      | |  60922 | Aamer      | Bahl      | | 283280 | Aamer      | Bahl      | +--------+------------+-----------+

比如

select emp_no,first_name,last_name from employees where 1 = 1 and emp_no = 20000; +--------+------------+-----------+ | emp_no | first_name | last_name | +--------+------------+-----------+ |  20000 | Jenwei     | Matzke    | +--------+------------+-----------+

GROUP BY

clipboard.png

分组就是根据某个字段当中相同的值来进行分组,并且分组完之后是要加一个聚集(统计)函数

use test; create table t (a int auto_increment primary key, b int, c int); insert into t select NULL,1,1; insert into t select NULL,1,7; insert into t select NULL,1,8; insert into t select NULL,2,8; insert into t select NULL,2,10; insert into t select NULL,3,8; select * from t; +---+------+------+ | a | b    | c    | +---+------+------+ | 1 |    1 |    1 | | 2 |    1 |    7 | | 3 |    1 |    8 | | 4 |    2 |    8 | | 5 |    2 |   10 | | 6 |    3 |    8 | +---+------+------+  select b,sum(b),avg(b) from t group by b; +------+--------+--------+ | b    | sum(b) | avg(b) | +------+--------+--------+ |    1 |      3 | 1.0000 | |    2 |      4 | 2.0000 | |    3 |      3 | 3.0000 | +------+--------+--------+

HAVING

HAVING表示对group by中的聚合函数进行过滤

select b,sum(b),avg(b) from t where avb(b)>2 group by b; -- 这样是会报错的 ERROR 1305 (42000): FUNCTION test.avb does not exist

分组中需要对分组的条件进行过滤需要使用HAVING

select b,sum(b),avg(b) from t group by b having avg(b) > 2; +------+--------+--------+ | b    | sum(b) | avg(b) | +------+--------+--------+ |    3 |      3 | 3.0000 | +------+--------+--------+

不是说使用了group by就代表不能使用where了,这样也是可以使用where进行过滤的,只不过where过滤的是非聚合的结果(对查询的记录进行过滤),而having是用来过滤聚合的结果

select b,sum(b),avg(b) from t where b < 3 group by b; +------+--------+--------+ | b    | sum(b) | avg(b) | +------+--------+--------+ |    1 |      3 | 1.0000 | |    2 |      4 | 2.0000 | +------+--------+--------+

双重过滤

select b,sum(b),avg(b) from t where b < 3 group by b having avg(b) < 2; +------+--------+--------+ | b    | sum(b) | avg(b) | +------+--------+--------+ |    1 |      3 | 1.0000 | +------+--------+--------+

所以where是一开始选数据的时候就开始过滤了,having是在聚集结果出来之后才开始过滤

clipboard.png

having也可以过滤指定的一列而不是聚集函数,但是这样写没有什么意义,使用where可以获得更好的性能,所以having一般都跟聚集函数搭配使用

select b,sum(b),avg(b) from t where b < 3 group by b having b < 2; +------+--------+--------+ | b    | sum(b) | avg(b) | +------+--------+--------+ |    1 |      3 | 1.0000 | +------+--------+--------+

select字段中没有选择的字段,having中使用会报错

select b,sum(b),avg(b) from t where b < 3 group by b having a < 2; ERROR 1054 (42S22): Unknown column 'a' in 'having clause'

这样的语句没有意义,因为取出来的a字段数据不一定是146,mysql会随机取a字段的数据,而且这种写法在orcal中好像不支持

select a,b,sum(b) from t group by b; +---+------+--------+ | a | b    | sum(b) | +---+------+--------+ | 1 |    1 |      3 | | 4 |    2 |      4 | | 6 |    3 |      3 | +---+------+--------+

JOIN

基本的多表关联查询

SELECT     concat(first_name,' ',last_name), dept_name -- concat连接了2个字段的字符串 FROM     employees e, -- e是别名,下面3个雷同     dept_emp de,     departments d WHERE     e.emp_no = de.emp_no -- 关联条件         AND de.dept_no = d.dept_no -- 再一次关联条件 LIMIT 3; +----------------------------------+------------------+ | concat(first_name,' ',last_name) | dept_name        | +----------------------------------+------------------+ | Mary Sluis                       | Customer Service | | Huan Lortz                       | Customer Service | | Basil Tramer                     | Customer Service | +----------------------------------+------------------+

新建一些例子

create table a (a int); create table b (a int); insert into a select 1; insert into a select 2; insert into a select 3;  insert into b select 1; insert into b select 2; select * from a;select * from b; +------+ | a    | +------+ |    1 | |    2 | |    3 | +------+ 3 rows in set (0.00 sec)  +------+ | a    | +------+ |    1 | |    2 | +------+ 2 rows in set (0.03 sec)

最简单的两表关联,并不是2张表关联就会做笛卡尔积,如果关联的时候没有写where关联条件就会产生笛卡尔积

select * from a,b where a.a = b.a; +------+------+ | a    | a    | +------+------+ |    1 |    1 | |    2 |    2 | +------+------+  select * from a,b; -- 这样会产生笛卡尔积 +------+------+ | a    | a    | +------+------+ |    1 |    1 | |    1 |    2 | |    2 |    1 | |    2 |    2 | |    3 |    1 | |    3 |    2 | +------+------+

三张表只有1个过滤条件会出现图示效果,会产生笛卡尔积的效果,出现了4条记录

create table c (a int); insert into c select 2; insert into c select 3; select * from a,b,c where a.a = b.a; -- 这样会产生笛卡尔积 +------+------+------+ | a    | a    | a    | +------+------+------+ |    1 |    1 |    2 | |    2 |    2 |    2 | |    1 |    1 |    3 | |    2 |    2 |    3 | +------+------+------+

还可以这样写,关联条件可以非等值

select * from a,b where a.a < b.a; +------+------+ | a    | a    | +------+------+ |    1 |    2 | +------+------+

修改一下字段名先。。。

alter table a change a x int; alter table b change a y int; alter table c change a z int;

INNER JOIN

这3个写法没有区别

select * from a inner join b on a.x = b.y; select * from a join b on a.x = b.y; select * from a,b where a.x = b.y; -- 都返回 +------+------+ | x    | y    | +------+------+ |    1 |    1 | |    2 |    2 | +------+------+

以下语句的关联,通常来说会选择关联过滤度高的条件进行关联,这里b.y > 1的关联度会高一些。

select * from a,b where a.x = b.y and b.y > 1; +------+------+ | x    | y    | +------+------+ |    2 |    2 | +------+------+

假设a表100w记录b表100w记录,其中的x,y一一对应,优化机通畅都会先从b>100 and b<200条件先进行过滤,这样就只需要100w100条记录进行join就可以了,如果先从a.x = b.y进行过滤,那就是100w100w的记录进行join,所以一般优化机是会把b>100 and b<200的过滤条件设置得更高

clipboard.png

INNER JOIN的时候过滤条件即可写在on里面也可以写在where条件里面,对结果来说是没有区别的

select * from a inner join b on (a.x = b.y and b.y > 1); select * from a inner join b on a.x = b.y where b.y > 1; 都返回 +------+------+ | x    | y    | +------+------+ |    2 |    2 | +------+------+

OUTER JOIN

outer join 分成left/right两种,outer可以关键词可以省略,left/right join代表左/右外连接

左连接中左边的表为保留表,保留表中所有的字段都是要出现的,如果关联条件存在的话就是一对一的情况,右表中关联条件不存在则关联出来的结果就是NULL值

clipboard.png

select * from a left join b on a.x = b.y; +------+------+ | x    | y    | +------+------+ |    1 |    1 | |    2 |    2 | |    3 | NULL | +------+------+ select * from a right join b on a.x = b.y; +------+------+ | x    | y    | +------+------+ |    1 |    1 | |    2 |    2 | +------+------+

a表中存在的但是b表中不存在的数据,左连接实现

select * from a left join b on a.x = b.y where b.y is NULL; +------+------+ | x    | y    | +------+------+ |    3 | NULL | +------+------+

a表中但是不在b表中,不使用left join 使用not in + 子查询,不过会存在一些性能上的问题

select * from a where a.x not in (select y from b); +------+ | x    | +------+ |    3 | +------+

这样的查询b.y会全部返回NULL,因为a.x = b.y and b.y is NULL不成立,所以b.y全部用NULL填充

select * from a left join b on a.x = b.y and b.y is NULL; +------+------+ | x    | y    | +------+------+ |    1 | NULL | |    2 | NULL | |    3 | NULL | +------+------+

过滤条件写在on后还是where后是有讲究的,推荐使用on来进行两张表之间的关联,where用来进行数据的过滤。

外连接中where不能写在on前面,内连接则可以,left join是用到比较多的一个方式

多次left join

select * from a left join b on a.x = b.y left join c on a.x = c.z; +------+------+------+ | x    | y    | z    | +------+------+------+ |    2 |    2 |    2 | |    1 |    1 | NULL | |    3 | NULL |    3 | +------+------+------+ select * from a left join b on a.x = b.y left join c on b.y = c.z order by a.x; +------+------+------+ | x    | y    | z    | +------+------+------+ |    1 |    1 | NULL | |    2 |    2 |    2 | |    3 | NULL | NULL | +------+------+------+

clipboard.png

第二次join使用等值连接尝试

select * from a left join b on a.x = b.y inner join c on b.y = c.z order by a.x; +------+------+------+ | x    | y    | z    | +------+------+------+ |    2 |    2 |    2 | +------+------+------+

一些例子

查询不是经理的员工

SELECT      e.emp_no FROM     employees e         LEFT JOIN     dept_manager dm ON e.emp_no = dm.emp_no WHERE     dm.emp_no IS NULL LIMIT 3;

分页技巧

使用where而不是使用limit 10000,10 因为limit会扫10000前面的1w条数据

select * from employees where emp_no > 10000 limit 10;

根据生日分页

alter table employees add index idx_birth(birth_date,emp_no); -- 建立联合索引  select * from employees where birth_date > '1952-02-02' order by birth_date limit 10; +--------+------------+------------+-----------+--------+------------+ | emp_no | birth_date | first_name | last_name | gender | hire_date  | +--------+------------+------------+-----------+--------+------------+ |  16093 | 1952-02-03 | Luise      | Tramer    | M      | 1992-02-28 | |  16447 | 1952-02-03 | Zhiguo     | Savasere  | F      | 1987-04-28 | |  23857 | 1952-02-03 | Kristen    | Frijda    | M      | 1986-05-31 | |  27259 | 1952-02-03 | Michaela   | Lipner    | M      | 1997-09-05 | |  32094 | 1952-02-03 | Mohd       | Buchter   | F      | 1986-03-16 | |  32641 | 1952-02-03 | Tua        | Tetzlaff  | F      | 1988-12-01 | |  41374 | 1952-02-03 | JiYoung    | Schurmann | M      | 1988-01-25 | |  43737 | 1952-02-03 | Debatosh   | Beerel    | F      | 1987-12-07 | |  56225 | 1952-02-03 | Miquel     | Rusmann   | F      | 1990-06-17 | |  59577 | 1952-02-03 | Adel       | Swiler    | F      | 1986-11-27 | +--------+------------+------------+-----------+--------+------------+ -- 错误的结果,日期直接跳到了02-03  select * from employees where (birth_date,emp_no) > ('1952-02-02', 10010) order by birth_date limit 10; +--------+------------+------------+-------------+--------+------------+ | emp_no | birth_date | first_name | last_name   | gender | hire_date  | +--------+------------+------------+-------------+--------+------------+ |  12282 | 1952-02-02 | Tadahiro   | Delgrange   | M      | 1997-01-09 | |  13944 | 1952-02-02 | Takahito   | Maierhofer  | M      | 1989-01-18 | |  22614 | 1952-02-02 | Dung       | Madeira     | M      | 1989-01-24 | |  29456 | 1952-02-02 | Barun      | Krohm       | F      | 1992-11-23 | |  33131 | 1952-02-02 | Reinhold   | Savasere    | M      | 1998-01-30 | |  40660 | 1952-02-02 | Piyush     | Erbe        | F      | 1988-04-04 | |  48910 | 1952-02-02 | Zhongwei   | DuBourdieux | M      | 1999-12-19 | |  51486 | 1952-02-02 | Jianwen    | Sigstam     | F      | 1989-07-20 | |  59884 | 1952-02-02 | Fan        | Przulj      | M      | 1991-09-25 | |  61382 | 1952-02-02 | Kristof    | Ranft       | M      | 1989-04-21 | +--------+------------+------------+-------------+--------+------------+ -- 正确的结果 select o_orderkey,o_orderstatus,o_totalprice from orders where(o_orderdate,o_orderkey) > ('1992-01-01',88199)  order by o_orderdate limit 10; select o_orderkey,o_orderstatus,o_totalprice from orders order by o_orderdate limit 10,10; -- 结果一致

当然分页推荐使用redis来做,用数据库实现其实意义不大

转载地址:http://mclul.baihongyu.com/

你可能感兴趣的文章