本文共 5224 字,大约阅读时间需要 17 分钟。
由于业务需要,需实现如下功能的sql语句:查询出一个表T中每个不同id值所对应的最大val值。
---构造环境。 SQL> insert into t values (1,2,1); 已创建 1 行。 SQL> insert into t values (1,2,3); 已创建 1 行。 SQL> insert into t values (1,3,4); 已创建 1 行。 SQL> insert into t values (1,3,5); 已创建 1 行。 SQL> insert into t values (2,3,5); 已创建 1 行。 SQL> insert into t values (2,3,4); 已创建 1 行。 SQL> insert into t values (2,3,6); 已创建 1 行 SQL> insert into t values (2,3,56); 已创建 1 行。 SQL> commit; 提交完成。 SQL> select * from t; ID STATE VAL ---------- ---------- ---------- 1 2 1 1 2 3 1 3 4 1 3 5 2 3 5 2 3 4 2 3 6 2 3 56 已选择8行。-----------------------------------------
SQL> select id,state ,val 2 from (select id ,state,val, row_number() 3 over (partition by id order by val desc ) rn from t ) 4 where rn=1; ID STATE VAL ---------- ---------- ---------- 1 3 5 2 3 56 ---结果符合要求-----查看使用分析函数的执行计划。
SQL> set autot on SQL> select id,state ,val 2 from (select id ,state,val, row_number() 3 over (partition by id order by val desc ) rn from t ) 4 where rn=1;ID STATE VAL
---------- ---------- ---------- 1 3 5 2 3 56 ----查询出 id 为1对应的最大值 5 ,id 为2 时对应的最大值56 ----使用分析函数的执行计划 执行计划 ---------------------------------------------------------- Plan hash value: 3047187157 -------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 8 | 368 | 4 (25)| 00:00:01 | |* 1 | VIEW | | 8 | 368 | 4 (25)| 00:00:01 | |* 2 | WINDOW SORT PUSHED RANK| | 8 | 264 | 4 (25)| 00:00:01 | | 3 | TABLE ACCESS FULL | T | 8 | 264 | 3 (0)| 00:00:01 |----注意这里使用了 WINDOW SORT PUSHED RANK,只进行了一次FTS
-------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("RN"=1) 2 - filter(ROW_NUMBER() OVER ( PARTITION BY "ID" ORDER BY INTERNAL_FUNCTION("VAL") DESC )<=1) Note ----- - dynamic sampling used for this statement 统计信息 ---------------------------------------------------------- 76 recursive calls 0 db block gets 21 consistent gets 0 physical reads ....... 3 sorts (memory) 0 sorts (disk) 2 rows processed SQL> set linesize 999 SQL> / ID STATE VAL ---------- ---------- ---------- 1 3 5 2 3 56执行计划
---------------------------------------------------------- Plan hash value: 3047187157 --------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 8 | 368 | 4 (25)| 00:00:01 | |* 1 | VIEW | | 8 | 368 | 4 (25)| 00:00:01 | |* 2 | WINDOW SORT PUSHED RANK| | 8 | 264 | 4 (25)| 00:00:01 | | 3 | TABLE ACCESS FULL | T | 8 | 264 | 3 (0)| 00:00:01 | --------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("RN"=1) 2 - filter(ROW_NUMBER() OVER ( PARTITION BY "ID" ORDER BY INTERNAL_FUNCTION("VAL") DESC )<=1) Note ---- - dynamic sampling used for this statement 统计信息 ---------------------------------------------------------- 0 recursive calls 0 db block gets 7 consistent gets -------逻辑读为 7 0 physical reads 0 redo size 568 bytes sent via SQL*Net to client 416 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 1 sorts (memory) 0 sorts (disk) 2 rows processed ---没有使用分析函数的执行计划,可以看出此计划执行了两次全表扫描(见下面的例子),而使用了分析函数的情况下,只是扫描了一次!---显然当数据量巨大时,两种方式的资源消耗差别是巨大的。SQL> select id,state ,val from 2 t a where a.val in 3 (select max(b.val) from t b 4 where b.id =a.id group by id ); ---在业务处理时应该慎用关联子查询!ID STATE VAL
---------- ---------- ---------- 1 3 5 2 3 56 执行计划 ---------------------------------------------------------- Plan hash value: 2845706984 ------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 33 | 6 (0)| 00:00:01 | |* 1 | FILTER | | | | | | | 2 | TABLE ACCESS FULL | T | 8 | 264 | 3 (0)| 00:00:01 | |* 3 | FILTER | | | | | | | 4 | SORT GROUP BY NOSORT| | 1 | 26 | 3 (0)| 00:00:01||* 5 | TABLE ACCESS FULL | T | 1 | 26 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter( EXISTS (SELECT 0 FROM "T" "B" WHERE "B"."ID"=:B1 GROUP BY "ID" HAVING MAX("B"."VAL")=:B2)) 3 - filter(MAX("B"."VAL")=:B1) 5 - filter("B"."ID"=:B1) Note ----- - dynamic sampling used for this statement 统计信息 ---------------------------------------------------------- 296 recursive calls 0 db block gets 203 consistent gets ---逻辑读为 203 是使用分析函数的30倍! 0 physical reads 11 sorts (memory) 0 sorts (disk) 2 rows processed总结:相关子查询对于外部查询的每一个值都会有一个结果与其对应,其计算的过程是这样的:
1.扫描外查询的第一条记录 2.扫描子查询,并将第一条记录的对应值传给子查询,由此计算出子查询的结果 3.根据子查询的结果,返回外查询的结果。 4.重复上述动作,开始扫描外查询的第二条记录,第三条记录,直至全部扫描完毕。所以,当使用分析函数能够满足业务需求时,最好使用分析函数。处理大量业务数据时,相关子查询是相当耗费资源的。慎用!
转载地址:http://nqkum.baihongyu.com/