Oracle10gR2分析函数
Oracle中分析函数用法小结
Oracle中分析函数用法小结一.分析函数适用场景:○1需要对同样的数据进行不同级别的聚合操作○2需要在表内将多条数据和同一条数据进行多次的比较○3需要在排序完的结果集上进行额外的过滤操作二.分析函数语法:FUNCTION_NAME(<argument>,<argument>...)OVER(<Partition-Clause><Order-by-Clause><Windowing Clause>)例:sum(sal) over (partition by deptno order by ename) new_aliassum就是函数名(sal)是分析函数的参数,每个函数有0~3个参数,参数可以是表达式,例如:sum(sal+comm) over 是一个关键字,用于标识分析函数,否则查询分析器不能区别sum()聚集函数和sum()分析函数partition by deptno 是可选的分区子句,如果不存在任何分区子句,则全部的结果集可看作一个单一的大区order by ename 是可选的order by 子句,有些函数需要它,有些则不需要.依靠已排序数据的那些函数,如:用于访问结果集中前一行和后一行的LAG和LEAD,必须使用,其它函数,如AVG,则不需要.在使用了任何排序的开窗函数时,该子句是强制性的,它指定了在计算分析函数时一组内的数据是如何排序的.1)FUNCTION子句ORACLE提供了26个分析函数,按功能分5类分析函数分类等级(ranking)函数:用于寻找前N种查询开窗(windowing)函数:用于计算不同的累计,如SUM,COUNT,AVG,MIN,MAX等,作用于数据的一个窗口上例:sum(t.sal) over (order by t.deptno,t.ename) running_total,sum(t.sal) over (partition by t.deptno order by t.ename) department_total制表(reporting)函数:与开窗函数同名,作用于一个分区或一组上的所有列例:sum(t.sal) over () running_total2,sum(t.sal) over (partition by t.deptno) department_total2制表函数与开窗函数的关键不同之处在于OVER语句上缺少一个ORDER BY子句!LAG,LEAD函数:这类函数允许在结果集中向前或向后检索值,为了避免数据的自连接,它们是非常有用的.VAR_POP,VAR_SAMP,STDEV_POPE及线性的衰减函数:计算任何未排序分区的统计值2)PARTITION子句按照表达式分区(就是分组),如果省略了分区子句,则全部的结果集被看作是一个单一的组3)ORDER BY子句分析函数中ORDER BY的存在将添加一个默认的开窗子句,这意味着计算中所使用的行的集合是当前分区中当前行和前面所有行,没有ORDER BY时,默认的窗口是全部的分区在Order by 子句后可以添加nulls last,如:order by comm desc nulls last 表示排序时忽略comm列为空的行.4)WINDOWING子句用于定义分析函数将在其上操作的行的集合Windowing子句给出了一个定义变化或固定的数据窗口的方法,分析函数将对这些数据进行操作默认的窗口是一个固定的窗口,仅仅在一组的第一行开始,一直继续到当前行,要使用窗口,必须使用ORDER BY子句根据2个标准可以建立窗口:数据值的范围(RANGES)或与当前行的行偏移量.5)Rang窗口Range 5 preceding:将产生一个滑动窗口,他在组中拥有当前行以前5行的集合ANGE窗口仅对NUMBERS和DATES起作用,因为不可能从VARCHAR2中增加或减去N个单元另外的限制是ORDER BY中只能有一列,因而范围实际上是一维的,不能在N维空间中例:avg(t.sal) over(order by t.hiredate asc range 100 preceding) 统计前100天平均工资6)Row窗口利用ROW分区,就没有RANGE分区那样的限制了,数据可以是任何类型,且ORDER BY 可以包括很多列7)Specifying窗口UNBOUNDED PRECEDING:这个窗口从当前分区的每一行开始,并结束于正在处理的当前行CURRENT ROW:该窗口从当前行开始(并结束)Numeric Expression PRECEDING:对该窗口从当前行之前的数字表达式(Numeric Expression)的行开始,对RANGE来说,从行序值小于数字表达式的当前行的值开始. Numeric Expression FOLLOWING:该窗口在当前行Numeric Expression行之后的行终止(或开始),且从行序值大于当前行Numeric Expression行的范围开始(或终止)range between 100 preceding and 100 following:当前行100前, 当前行100后注意:分析函数允许你对一个数据集进排序和筛选,这是SQL从来不能实现的.除了最后的Order by子句之外,分析函数是在查询中执行的最后的操作集,这样的话,就不能直接在谓词中使用分析函数,即不能在上面使用where或having子句!!!下面我们通过一个实际的例子:按区域查找上一年度订单总额占区域订单总额20%以上的客户,来看看分析函数的应用。
ORACLE_分析函数大全
ORACLE_分析函数大全Oracle分析函数是一种高级SQL函数,它可以在查询中实现一系列复杂的分析操作。
这些函数可以帮助我们在数据库中执行各种数据分析和报表生成任务。
本文将介绍Oracle数据库中的一些常用分析函数。
1.ROW_NUMBER函数:该函数为查询结果中的每一行分配一个唯一的数字。
可以用它对结果进行排序或分组。
例如,可以使用ROW_NUMBER函数在结果集中为每个员工计算唯一的编号。
2.RANK和DENSE_RANK函数:这两个函数用于计算结果集中每个行的排名。
RANK函数返回相同值的行具有相同的排名,并且下一个排名值将被跳过。
DENSE_RANK函数类似,但是下一个排名值不会被跳过。
G和LEAD函数:LAG函数返回结果集中指定列的前一个(上一个)行的值,而LEAD函数返回后一个(下一个)行的值。
这些函数通常用于计算增长率或发现趋势。
4.FIRST和LAST函数:这两个函数用于返回结果集中分组的第一个和最后一个行的值。
可以与GROUPBY子句一起使用。
5.CUME_DIST函数:该函数用于计算给定值的累积分布。
它返回值的累积分布在结果集中的位置(百分比)。
6.PERCENT_RANK函数:该函数用于计算结果集中每个行的百分位数排名。
它返回值的百分位数排名(0到1之间的小数)。
7. NTILE函数:该函数用于将结果集分成指定数量的桶(Bucket),并为每个行分配一个桶号。
通常用于将数据分组为更小的块。
8.LISTAGG函数:该函数将指定列的值连接成一个字符串,并使用指定的分隔符分隔每个值。
可以用它将多个值合并在一起形成一个字符串。
9.AVG、SUM、COUNT和MAX/MIN函数:这些是常见的聚合函数,可以在分析函数中使用。
它们用于计算结果集中的平均值、总和、计数和最大/最小值。
以上只是Oracle数据库中的一些常用分析函数。
还有其他一些分析函数,如PERCENTILE_CONT、PERCENTILE_DISC等可以用于更高级的分析计算。
Oracle 分析函数的使用
Oracle 分析函数的使用Oracle 分析函数使用介绍分析函数是oracle816引入的一个全新的概念,为我们分析数据提供了一种简单高效的处理方式.在分析函数出现以前,我们必须使用自联查询,子查询或者内联视图,甚至复杂的存储过程实现的语句,现在只要一条简单的sql语句就可以实现了,而且在执行效率方面也有相当大的提高.下面我将针对分析函数做一些具体的说明.今天我主要给大家介绍一下以下几个函数的使用方法1. 自动汇总函数rollup,cube,2. rank 函数, rank,dense_rank,row_number3. lag,lead函数4. sum,avg,的移动增加,移动平均数5. ratio_to_report报表处理函数6. first,last取基数的分析函数基础数据Code: [Copy to clipboard]06:34:23 SQL> select * from t;BILL_MONTH AREA_CODE NET_TYPE LOCAL_FARE--------------- ---------- ---------- --------------200405 5761 G 7393344.04200405 5761 J 5667089.85200405 5762 G 6315075.96200405 5762 J 6328716.15200405 5763 G 8861742.59200405 5763 J 7788036.32200405 5764 G 6028670.45200405 5764 J 6459121.49200405 5765 G 13156065.77200405 5765 J 11901671.70200406 5761 G 7614587.96200406 5761 J 5704343.05200406 5762 G 6556992.60200406 5762 J 6238068.05200406 5763 G 9130055.46200406 5763 J 7990460.25200406 5764 G 6387706.01200406 5764 J 6907481.66200406 5765 G 13562968.81200406 5765 J 12495492.50200407 5761 G 7987050.65200407 5761 J 5723215.28200407 5762 G 6833096.68200407 5762 J 6391201.44200407 5763 G 9410815.91200407 5763 J 8076677.41200407 5764 G 6456433.23200407 5764 J 6987660.53200407 5765 G 14000101.20200407 5765 J 12301780.20200408 5761 G 8085170.84200408 5761 J 6050611.37200408 5762 G 6854584.22200408 5762 J 6521884.50200408 5763 G 9468707.65200408 5763 J 8460049.43200408 5764 G 6587559.23BILL_MONTH AREA_CODE NET_TYPE LOCAL_FARE --------------- ---------- ---------- --------------200408 5764 J 7342135.86200408 5765 G 14450586.63200408 5765 J 12680052.3840 rows selected.Elapsed: 00:00:00.001. 使用rollup函数的介绍Quote:下面是直接使用普通sql语句求出各地区的汇总数据的例子06:41:36 SQL> set autot on06:43:36 SQL> select area_code,sum(local_fare) local_fare06:43:50 2 from t06:43:51 3 group by area_code06:43:57 4 union all06:44:00 5 select '合计' area_code,sum(local_fare) local_fare06:44:06 6 from t06:44:08 7 /AREA_CODE LOCAL_FARE---------- --------------5761 54225413.045762 52039619.605763 69186545.025764 53156768.465765 104548719.19合计 333157065.316 rows selected.Elapsed: 00:00:00.03Execution Plan----------------------------------------------------------0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=7 Card=1310 Bytes=24884)1 0 UNION-ALL2 1 SORT (GROUP BY) (Cost=5 Card=1309 Bytes=24871)3 2 TABLE ACCESS (FULL) OF 'T' (Cost=2 Card=1309Bytes=24871)4 1 SORT (AGGREGATE)5 4 TABLE ACCESS (FULL) OF 'T' (Cost=2 Card=1309Bytes=17017)Statistics----------------------------------------------------------0 recursive calls0 db block gets6 consistent gets0 physical reads0 redo size561 bytes sent via SQL*Net to client503 bytes received via SQL*Net from client2 SQL*Net roundtrips to/from client1 sorts (memory)0 sorts (disk)6 rows processed下面是使用分析函数rollup得出的汇总数据的例子06:44:09 SQL> select nvl(area_code,'合计') area_code,sum(local_fare) local_fare06:45:26 2 from t06:45:30 3 group by rollup(nvl(area_code,'合计'))06:45:50 4 /AREA_CODE LOCAL_FARE---------- --------------5761 54225413.045762 52039619.605763 69186545.025764 53156768.465765 104548719.19333157065.316 rows selected.Elapsed: 00:00:00.00Execution Plan----------------------------------------------------------0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=5 Card=1309Bytes=24871)1 0 SORT (GROUP BY ROLLUP) (Cost=5 Card=1309 Bytes=24871)2 1 TABLE ACCESS (FULL) OF 'T' (Cost=2 Card=1309Bytes=24871)Statistics----------------------------------------------------------0 recursive calls0 db block gets4 consistent gets0 physical reads0 redo size557 bytes sent via SQL*Net to client503 bytes received via SQL*Net from client2 SQL*Net roundtrips to/from client1 sorts (memory)0 sorts (disk)6 rows processed从上面的例子我们不难看出使用rollup函数,系统的sql语句更加简单,耗用的资源更少,从6个consistent gets降到4个consistent gets,如果基表很大的话,结果就可想而知了.1. 使用cube函数的介绍Quote:为了介绍cube函数我们再来看看另外一个使用rollup的例子06:53:00 SQL> select area_code,bill_month,sum(local_fare) local_fare06:53:37 2 from t06:53:38 3 group by rollup(area_code,bill_month)06:53:49 4 /---------- --------------- --------------5761 200405 13060433.895761 200406 13318931.015761 200407 13710265.935761 200408 14135782.215761 54225413.045762 200405 12643792.115762 200406 12795060.655762 200407 13224298.125762 200408 13376468.725762 52039619.605763 200405 16649778.915763 200406 17120515.715763 200407 17487493.325763 200408 17928757.085763 69186545.025764 200405 12487791.945764 200406 13295187.675764 200407 13444093.765764 200408 13929695.095764 53156768.465765 200405 25057737.475765 200406 26058461.315765 200407 26301881.405765 200408 27130639.015765 104548719.19333157065.3126 rows selected.Elapsed: 00:00:00.00系统只是根据rollup的第一个参数area_code对结果集的数据做了汇总处理,而没有对bill_month做汇总分析处理,cube函数就是为了这个而设计的.下面,让我们看看使用cube函数的结果06:58:02 SQL> select area_code,bill_month,sum(local_fare) local_fare 06:58:30 2 from t06:58:32 3 group by cube(area_code,bill_month)06:58:42 4 order by area_code,bill_month nulls last06:58:57 5 /---------- --------------- --------------5761 200405 13060.435761 200406 13318.935761 200407 13710.275761 200408 14135.785761 54225.415762 200405 12643.795762 200406 12795.065762 200407 13224.305762 200408 13376.475762 52039.625763 200405 16649.785763 200406 17120.525763 200407 17487.495763 200408 17928.765763 69186.545764 200405 12487.795764 200406 13295.195764 200407 13444.095764 200408 13929.695764 53156.775765 200405 25057.745765 200406 26058.465765 200407 26301.885765 200408 27130.645765 104548.72200405 79899.53200406 82588.15200407 84168.03200408 86501.34333157.0530 rows selected.Elapsed: 00:00:00.01可以看到,在cube函数的输出结果比使用rollup多出了几行统计数据.这就是cube函数根据bill_month做的汇总统计结果1 rollup 和cube函数的再深入Quote:从上面的结果中我们很容易发现,每个统计数据所对应的行都会出现null,我们如何来区分到底是根据那个字段做的汇总呢,这时候,oracle的grouping函数就粉墨登场了.如果当前的汇总记录是利用该字段得出的,grouping函数就会返回1,否则返回01 select decode(grouping(area_code),1,'all area',to_char(area_code)) area_code,2 decode(grouping(bill_month),1,'all month',bill_month) bill_month,3 sum(local_fare) local_fare4 from t5 group by cube(area_code,bill_month)6* order by area_code,bill_month nulls last07:07:29 SQL> /AREA_CODE BILL_MONTH LOCAL_FARE---------- --------------- --------------5761 200405 13060.435761 200406 13318.935761 200407 13710.275761 200408 14135.785761 all month 54225.415762 200405 12643.795762 200406 12795.065762 200407 13224.305762 200408 13376.475762 all month 52039.625763 200405 16649.785763 200406 17120.525763 200407 17487.495763 200408 17928.765763 all month 69186.545764 200405 12487.795764 200406 13295.195764 200407 13444.095764 200408 13929.695764 all month 53156.775765 200405 25057.745765 200406 26058.465765 200407 26301.885765 200408 27130.645765 all month 104548.72all area 200405 79899.53all area 200406 82588.15all area 200407 84168.03all area 200408 86501.34all area all month 333157.0530 rows selected.Elapsed: 00:00:00.0107:07:31 SQL>可以看到,所有的空值现在都根据grouping函数做出了很好的区分,这样利用rollup,cube和grouping函数,我们做数据统计的时候就可以轻松很多了.2. rank函数的介绍介绍完rollup和cube函数的使用,下面我们来看看rank系列函数的使用方法.问题2.我想查出这几个月份中各个地区的总话费的排名.Quote:为了将rank,dense_rank,row_number函数的差别显示出来,我们对已有的基础数据做一些修改,将5763的数据改成与5761的数据相同.1 update t t1 set local_fare = (2 select local_fare from t t23 where t1.bill_month = t2.bill_month4 and _type = _type5 and t2.area_code = '5761'6* ) where area_code = '5763'07:19:18 SQL> /8 rows updated.Elapsed: 00:00:00.01我们先使用rank函数来计算各个地区的话费排名.07:34:19 SQL> select area_code,sum(local_fare) local_fare,07:35:25 2 rank() over (order by sum(local_fare) desc) fare_rank07:35:44 3 from t07:35:45 4 group by area_codee07:35:50 507:35:52 SQL> select area_code,sum(local_fare) local_fare,07:36:02 2 rank() over (order by sum(local_fare) desc) fare_rank07:36:20 3 from t07:36:21 4 group by area_code07:36:25 5 /AREA_CODE LOCAL_FARE FARE_RANK---------- -------------- ----------5765 104548.72 15761 54225.41 25763 54225.41 25764 53156.77 45762 52039.62 5Elapsed: 00:00:00.01我们可以看到红色标注的地方出现了,跳位,排名3没有出现下面我们再看看dense_rank查询的结果.07:36:26 SQL> select area_code,sum(local_fare) local_fare,07:39:16 2 dense_rank() over (order by sum(local_fare) desc ) fare_rank 07:39:39 3 from t07:39:42 4 group by area_code07:39:46 5 /AREA_CODE LOCAL_FARE FARE_RANK---------- -------------- ----------5765 104548.72 15761 54225.41 25763 54225.41 25764 53156.77 3 这是这里出现了第三名5762 52039.62 4Elapsed: 00:00:00.00在这个例子中,出现了一个第三名,这就是rank和dense_rank的差别,rank如果出现两个相同的数据,那么后面的数据就会直接跳过这个排名,而dense_rank则不会,差别更大的是,row_number哪怕是两个数据完全相同,排名也会不一样,这个特性在我们想找出对应没个条件的唯一记录的时候又很大用处1 select area_code,sum(local_fare) local_fare,2 row_number() over (order by sum(local_fare) desc ) fare_rank3 from t4* group by area_code07:44:50 SQL> /AREA_CODE LOCAL_FARE FARE_RANK---------- -------------- ----------5765 104548.72 15761 54225.41 25763 54225.41 35764 53156.77 45762 52039.62 5在row_nubmer函数中,我们发现,哪怕sum(local_fare)完全相同,我们还是得到了不一样排名,我们可以利用这个特性剔除数据库中的重复记录.这个帖子中的几个例子是为了说明这三个函数的基本用法的. 下个帖子我们将详细介绍他们的一些用法.2. rank函数的介绍a. 取出数据库中最后入网的n个用户select user_id,tele_num,user_name,user_status,create_datefrom (select user_id,tele_num,user_name,user_status,create_date,rank() over (order by create_date desc) add_rankfrom user_info)where add_rank <= :n;b.根据object_name删除数据库中的重复记录create table t as select obj#,name from sys.obj$;再insert into t1 select * from t1 数次.delete from t1 where rowid in (select row_id from (select rowid row_id,row_number() over (partition by obj# order by rowid ) rn ) where rn <> 1);c. 取出各地区的话费收入在各个月份排名.SQL> select bill_month,area_code,sum(local_fare) local_fare,2 rank() over (partition by bill_month order by sum(local_fare) desc) area_rank3 from t4 group by bill_month,area_code5 /BILL_MONTH AREA_CODE LOCAL_FARE AREA_RANK--------------- --------------- -------------- ----------200405 5765 25057.74 1200405 5761 13060.43 2200405 5763 13060.43 2200405 5762 12643.79 4200405 5764 12487.79 5200406 5765 26058.46 1200406 5761 13318.93 2200406 5763 13318.93 2200406 5764 13295.19 4200406 5762 12795.06 5200407 5765 26301.88 1200407 5761 13710.27 2200407 5763 13710.27 2200407 5764 13444.09 4200407 5762 13224.30 5200408 5765 27130.64 1200408 5761 14135.78 2200408 5763 14135.78 2200408 5764 13929.69 4200408 5762 13376.47 520 rows selected.SQL>3. lag和lead函数介绍取出每个月的上个月和下个月的话费总额1 select area_code,bill_month, local_fare cur_local_fare,2 lag(local_fare,2,0) over (partition by area_code order by bill_month ) pre_local_fare,3 lag(local_fare,1,0) over (partition by area_code order by bill_month ) last_local_fare,4 lead(local_fare,1,0) over (partition by area_code order by bill_month )next_local_fare,5 lead(local_fare,2,0) over (partition by area_code order by bill_month )post_local_fare6 from (7 select area_code,bill_month,sum(local_fare) local_fare8 from t9 group by area_code,bill_month10* )SQL> /AREA_CODE BILL_MONTH CUR_LOCAL_FARE PRE_LOCAL_FARELAST_LOCAL_FARE NEXT_LOCAL_FARE POST_LOCAL_FARE--------- ---------- -------------- -------------- --------------- --------------- ---------------5761 200405 13060.433 0 0 13318.93 13710 .2655761 200406 13318.93 0 13060.433 13710.265 14 135.7815761 200407 13710.265 13060.433 13318.93 14135.7815761 200408 14135.781 13318.93 13710.265 05762 200405 12643.791 0 0 12795.06 13224 .2975762 200406 12795.06 0 12643.791 13224.297 13 376.4685762 200407 13224.297 12643.791 12795.06 13376.4685762 200408 13376.468 12795.06 13224.297 05763 200405 13060.433 0 0 13318.93 13710 .2655763 200406 13318.93 0 13060.433 13710.265 14 135.7815763 200407 13710.265 13060.433 13318.93 14135.7815763 200408 14135.781 13318.93 13710.265 05764 200405 12487.791 0 0 13295.187 1344 4.0935764 200406 13295.187 0 12487.791 13444.093 1 3929.6945764 200407 13444.093 12487.791 13295.187 13929.6945764 200408 13929.694 13295.187 13444.093 05765 200405 25057.736 0 0 26058.46 26301 .8815765 200406 26058.46 0 25057.736 26301.881 27 130.6385765 200407 26301.881 25057.736 26058.46 27130.6385765 200408 27130.638 26058.46 26301.881 020 rows selected.利用lag和lead函数,我们可以在同一行中显示前n行的数据,也可以显示后n行的数据.4. sum,avg,max,min移动计算数据介绍计算出各个连续3个月的通话费用的平均数1 select area_code,bill_month, local_fare,2 sum(local_fare)3 over ( partition by area_code4 order by to_number(bill_month)5 range between 1 preceding and 1 following ) "3month_sum",6 avg(local_fare)7 over ( partition by area_code8 order by to_number(bill_month)9 range between 1 preceding and 1 following ) "3month_avg",10 max(local_fare)11 over ( partition by area_code12 order by to_number(bill_month)13 range between 1 preceding and 1 following ) "3month_max",14 min(local_fare)15 over ( partition by area_code16 order by to_number(bill_month)17 range between 1 preceding and 1 following ) "3month_min"18 from (19 select area_code,bill_month,sum(local_fare) local_fare20 from t21 group by area_code,bill_month22* )SQL> /AREA_CODE BILL_MONTH LOCAL_FARE 3month_sum 3month_avg3month_max 3month_min--------- ---------- ---------------- ---------- ---------- ---------- ----------5761 200405 13060.433 26379.363 13189.6815 13318.93 13060.433 5761 200406 13318.930 40089.628 13363.2093 13710.265 13060.433 5761 200407 13710.265 41164.976 13721.6587 14135.781 13318.93 40089.628 = 13060.433 + 13318.930 + 13710.26513363.2093 = (13060.433 + 13318.930 + 13710.265) / 313710.265 = max(13060.433 + 13318.930 + 13710.265)13060.433 = min(13060.433 + 13318.930 + 13710.265)5761 200408 14135.781 27846.046 13923.023 14135.781 13710.265 5762 200405 12643.791 25438.851 12719.4255 12795.06 12643.791 5762 200406 12795.060 38663.148 12887.716 13224.297 12643.791 5762 200407 13224.297 39395.825 13131.9417 13376.468 12795.06 5762 200408 13376.468 26600.765 13300.3825 13376.468 13224.297 5763 200405 13060.433 26379.363 13189.6815 13318.93 13060.433 5763 200406 13318.930 40089.628 13363.2093 13710.265 13060.433 5763 200407 13710.265 41164.976 13721.6587 14135.781 13318.935763 200408 14135.781 27846.046 13923.023 14135.781 13710.265 5764 200405 12487.791 25782.978 12891.489 13295.187 12487.791 5764 200406 13295.187 39227.071 13075.6903 13444.093 12487.791 5764 200407 13444.093 40668.974 13556.3247 13929.694 13295.187 5764 200408 13929.694 27373.787 13686.8935 13929.694 13444.093 5765 200405 25057.736 51116.196 25558.098 26058.46 25057.736 5765 200406 26058.460 77418.077 25806.0257 26301.881 25057.736 5765 200407 26301.881 79490.979 26496.993 27130.638 26058.46 5765 200408 27130.638 53432.519 26716.2595 27130.638 26301.88120 rows selected.5. ratio_to_report函数的介绍Quote:1 select bill_month,area_code,sum(local_fare) local_fare,2 ratio_to_report(sum(local_fare)) over3 ( partition by bill_month ) area_pct4 from t5* group by bill_month,area_codeSQL> break on bill_month skip 1SQL> compute sum of local_fare on bill_monthSQL> compute sum of area_pct on bill_monthSQL> /BILL_MONTH AREA_CODE LOCAL_FARE AREA_PCT---------- --------- ---------------- ----------200405 5761 13060.433 .1711492795762 12643.791 .1656894315763 13060.433 .1711492795764 12487.791 .1636451435765 25057.736 .328366866********** ---------------- ----------sum 76310.184 1200406 5761 13318.930 .1690507725762 12795.060 .1624015425763 13318.930 .1690507725764 13295.187 .1687494145765 26058.460 .330747499********** ---------------- ----------sum 78786.567 1200407 5761 13710.265 .1705451975762 13224.297 .1645001275763 13710.265 .1705451975764 13444.093 .1672342215765 26301.881 .327175257********** ---------------- ----------sum 80390.801 1200408 5761 14135.781 .1709111475762 13376.468 .1617305395763 14135.781 .1709111475764 13929.694 .1684194165765 27130.638 .328027751********** ---------------- ----------sum 82708.362 120 rows selected.6 first,last函数使用介绍Quote:取出每月通话费最高和最低的两个用户.1 select bill_month,area_code,sum(local_fare) local_fare,2 first_value(area_code)3 over (order by sum(local_fare) desc4 rows unbounded preceding) firstval,5 first_value(area_code)6 over (order by sum(local_fare) asc7 rows unbounded preceding) lastval8 from t9 group by bill_month,area_code10* order by bill_monthSQL> /BILL_MONTH AREA_CODE LOCAL_FARE FIRSTVAL LASTVAL ---------- --------- ---------------- --------------- ---------------200405 5764 12487.791 5765 5764200405 5762 12643.791 5765 5764200405 5761 13060.433 5765 5764200405 5765 25057.736 5765 5764200405 5763 13060.433 5765 5764200406 5762 12795.060 5765 5764200406 5763 13318.930 5765 5764200406 5764 13295.187 **** ****200406 5765 26058.460 5765 5764200406 5761 13318.930 5765 5764200407 5762 13224.297 5765 5764200407 5765 26301.881 5765 5764200407 5761 13710.265 5765 5764200407 5763 13710.265 5765 5764200407 5764 13444.093 5765 5764200408 5762 13376.468 5765 5764200408 5764 13929.694 5765 5764200408 5761 14135.781 5765 5764200408 5765 27130.638 5765 5764200408 5763 14135.781 5765 576420 rows selected.。
常用Oracle分析函数详解
常⽤Oracle分析函数详解学习步骤:1. 拥有Oracle EBS demo 环境或者 PROD 环境2. copy以下代码进 PL/SQL3. 配合解释分析结果4. 如果⽹页有点乱请复制到TXT中查看/*假设⼀个经理代表了⼀个部门*/SELECT emp.full_name,emp.salary,emp.manager_id,row_number() over(PARTITION BY emp.manager_id ORDER BY emp.salary DESC) row_number_dept, --部门排⾏rownum row_number, --⾏号round((rownum + 1) / 4) page_number, --每4⾏⼀页ntile(2) over(ORDER BY emp.salary DESC) page_number_nt, --平均分成两类AVG(emp.salary) over(PARTITION BY emp.manager_id) avg_salary_department, --该部门薪⽔均值SUM(emp.salary) over(PARTITION BY emp.manager_id) sum_salary_department, --该部门薪⽔总额COUNT(emp.salary) over(PARTITION BY emp.manager_id) count_emp_department, --部门所有的员⼯dense_rank() over(PARTITION BY emp.manager_id ORDER BY emp.salary DESC) rank_salary_dept, --该⼈员的部门薪⽔排⾏dense_rank() over(ORDER BY emp.salary DESC) rank_salary_company, --该⼈员的全公司排⾏MIN(emp.salary) over(PARTITION BY emp.manager_id) min_salary_dept, --部门的最低薪⽔MIN(emp.salary) keep(dense_rank FIRST ORDER BY emp.salary) over(PARTITION BY emp.manager_id) min_salary_dept_first, --部门的最低薪⽔first_value(emp.salary) over(PARTITION BY emp.manager_id ORDER BY emp.salary) min_salary_dept_firstv, --部门的最低薪⽔MAX(emp.salary) over(PARTITION BY emp.manager_id) max_salary_dept, --部门的最⾼薪⽔MAX(emp.salary) keep(dense_rank LAST ORDER BY emp.salary) over(PARTITION BY emp.manager_id) max_salary_dept_last, --部门的最⾼薪⽔last_value(emp.salary) over(PARTITION BY emp.manager_id ORDER BY emp.salary) max_salary_dept_lastv, --部门的最⾼薪⽔lag(emp.full_name, 1, '00') over(ORDER BY emp.salary DESC) last_persion, --薪⽔在⾃⼰前⼀位的⼈lead(emp.full_name, 1, '00') over(ORDER BY emp.salary DESC) next_persion --薪⽔在⾃⼰后⼀位的⼈FROM fwk_tbx_employees empORDER BY emp.salary DESC1. 基本概念理解分析函数1. 顾名思义,分析函数是在主查询结果的基础上进⾏⼀定的分析,如分部门汇总,分部门求均值等等。
Oracle之分析函数
Oracle之分析函数⼀、分析函数 1、分析函数 分析函数是Oracle专门⽤于解决复杂报表统计需求的功能强⼤的函数,它可以在数据中进⾏分组然后计算基于组的某种统计值,并且每⼀组的每⼀⾏都可以返回⼀个统计值。
2、分析函数和聚合函数的区别 普通的聚合函数⽤group by分组,每个分组返回⼀个统计值,⽽分析函数采⽤partition by分组,并且每组每⾏都可以返回⼀个统计值。
3、分析函数的形式 分析函数带有⼀个开窗函数over(),包含分析⼦句。
分析⼦句⼜由下⾯三部分组成: partition by :分组⼦句,表⽰分析函数的计算范围,不同的组互不相⼲; ORDER BY:排序⼦句,表⽰分组后,组内的排序⽅式; ROWS/RANGE:窗⼝⼦句,是在分组(PARTITION BY)后,组内的⼦分组(也称窗⼝),此时分析函数的计算范围窗⼝,⽽不是PARTITON。
窗⼝有两种,ROWS和RANGE; 使⽤形式如下:OVER(PARTITION BY xxx PORDER BY yyy ROWS BETWEEN rowStart AND rowEnd) 注:窗⼝⼦句在这⾥我只说rows⽅式的窗⼝,range⽅式和滑动窗⼝也不提。
⼆、OVER() 函数 1、sql 查询语句的 order by 和 OVER() 函数中的 ORDER BY 的执⾏顺序 分析函数是在整个sql查询结束后(sql语句中的order by的执⾏⽐较特殊)再进⾏的操作, 也就是说sql语句中的order by也会影响分析函数的执⾏结果: [1] 两者⼀致:如果sql语句中的order by满⾜分析函数分析时要求的排序,那么sql语句中的排序将先执⾏,分析函数在分析时就不必再排序; [2] 两者不⼀致:如果sql语句中的order by不满⾜分析函数分析时要求的排序,那么sql语句中的排序将最后在分析函数分析结束后执⾏排序。
2、分析函数中的分组/排序/窗⼝分析函数包含三个分析⼦句:分组(partition by),排序(order by),窗⼝(rows/range)窗⼝就是分析函数分析时要处理的数据范围,就拿sum来说,它是sum窗⼝中的记录⽽不是整个分组中的记录,因此我们在想得到某个栏位的累计值时,我们需要把窗⼝指定到该分组中的第⼀⾏数据到当前⾏, 如果你指定该窗⼝从该分组中的第⼀⾏到最后⼀⾏,那么该组中的每⼀个sum值都会⼀样,即整个组的总和。
Oracle11gr2分析函数新特性简介(二)分析函数LISTAGG
Oracle 11g r2分析函数新特性简介(二)分析函数LISTAGG在11gr2中,Oracle分析函数的功能进一步增强。
这篇介绍新增的分析函数LISTAGG。
11gr2还新增了一个分析函数LISTAGG,这个函数的功能实现字符串的连接在11gr2中,Oracle终于实现了这个分析函数:SQL> select * from v$version;BANNER--------------------------------------------------------------------------------Oracle Database11gEnterprise Edition Release11.2.0.1.0 - 64bit ProductionPL/SQL Release 11.2.0.1.0 - ProductionCORE 11.2.0.1.0 ProductionTNS for Linux: Version 11.2.0.1.0 - ProductionNLSRTL Version 11.2.0.1.0 - ProductionSQL> create table t (id number, name varchar2(30), type varchar2(20));表已创建。
SQL> insert into t select rownum, object_name, object_type from dba_objects;已创建71968行。
SQL> commit;提交完成。
SQL> select listagg(name, ',') within group (order by id)2 from t3 where rownum < 10;LISTAGG(NAME,',')WITHINGROUP(ORDERBYID)-------------------------------------------------------------------------------------------SYS_C00644,SYS_LOB0000000528C00002$$,KOTTB$,SYS_C00645,SYS_LOB0000000532C00002$$,KOTAD$,SYS_C00646,SYS_L OB0000000536C00002$$,KOTMD$SQL> select type, listagg(name, ',') within group (order by id) name2 from t3 where type in ('DIRECTORY', 'JAVA SOURCE', 'SCHEDULE')4 group by type;TYPE NAME-------------------- ---------------------------------------------------------------------DIRECTORY ORACLE_OCM_CONFIG_DIR,DATA_PUMP_DIR,XMLDIRJAVA SOURCE dbFWTrace,schedFileWatcherJavaSCHEDULE DAILY_PURGE_SCHEDULE,FILE_WATCHER_SCHEDULE,BSLN_MAINTAIN_STATS_SCHEDSQL> select name,2 listagg(name, ',') within group (order by id) over(partition by type) s_name3 from t4 where type in ('DIRECTORY', 'JAVA SOURCE', 'SCHEDULE');NAME S_NAME本文URL地址:/database/Oracle/201410/45457.htm------------------------- -----------------------------------------------------------ORACLE_OCM_CONFIG_DIR ORACLE_OCM_CONFIG_DIR,DATA_PUMP_DIR,XMLDIRDATA_PUMP_DIR ORACLE_OCM_CONFIG_DIR,DATA_PUMP_DIR,XMLDIRXMLDIR ORACLE_OCM_CONFIG_DIR,DATA_PUMP_DIR,XMLDIRdbFWTrace dbFWTrace,schedFileWatcherJavaschedFileWatcherJava dbFWTrace,schedFileWatcherJavaDAILY_PURGE_SCHEDULE DAILY_PURGE_SCHEDULE,FILE_WATCHER_SCHEDULE,BSLN_MAINTAIN_STATS_SCHEDFILE_WATCHER_SCHEDULE DAILY_PURGE_SCHEDULE,FILE_WATCHER_SCHEDULE,BSLN_MAINTAIN_STATS_SCHED BSLN_MAINTAIN_STATS_SCHED DAILY_PURGE_SCHEDULE,FILE_WATCHER_SCHEDULE,BSLN_MAINTAIN_STATS_SCHED 已选择8行。
oracle分析函数
oracle分析函数
Oracle分析函数是一组可用于处理数据集的函数。
它们可以帮助您计算数据的统计值,如求和、平均值、最小值和最大值。
此外,还可以对数据进行细粒度的分析,如数据分析、排序、汇总和聚合。
Oracle分析函数可以提高您的应用程序的性能,并使您能够缩短查询时间。
Oracle分析函数可以分为三类:统计函数、数据分析函数和排序函数。
统计函数可以帮助您计算数据集的基本统计信息,如求和、平均值、最小值和最大值。
数据分析函数可以帮助您对数据进行更细粒度的分析,如数据分组、汇总和聚合。
排序函数可以帮助您对数据进行排序,以便更好地分析数据。
Oracle分析函数的优点非常明显,它们可以提高您的应用程序的性能,而且可以缩短查询时间。
此外,它们还可以提供准确和可靠的数据。
因此,当您需要对数据进行分析时,Oracle分析函数是一个有用的工具。
Oracle分析函数是一个强大的工具,可以帮助您更有效地处理数据集,从而获得更准确、更可靠的结果。
它们可以帮助您更快地执行复杂的查询,并使您能够更准确地分析和理解数据。
因此,Oracle 分析函数可以极大地提高您的业务分析效率,从而给您带来更大的利润。
Oracle数据库分析函数用法
Oracle数据库分析函数⽤法⽬录1、什么是窗⼝函数?2、窗⼝函数——开窗3、⼀些分析函数的使⽤⽅法4、OVER()参数——分组函数5、OVER()参数——排序函数1、什么是窗⼝函数?窗⼝函数也属于分析函数。
Oracle从8.1.6开始提供窗⼝函数,窗⼝函数⽤于计算基于组的某种聚合值,窗⼝函数指定了分析函数⼯作的数据窗⼝⼤⼩,这个数据窗⼝⼤⼩可能会随着⾏的变化⽽变化。
与聚合函数的不同之处是:对于每个组返回多⾏,⽽聚合函数对于每个组只返回⼀⾏基本语法: ‹分析函数› over (partition by ‹⽤于分组的列名› order by ‹⽤于排序的列名›)。
语法中的‹分析函数›主要由序列函数(rank、dense_rank和row_number等组成)与聚合函数(sum、avg、count、max和min等)作为窗⼝函数组成。
从窗⼝函数组成上看,它是group by 和 order by的功能组合,group by分组汇总后改变了表的⾏数,⼀⾏只有⼀个类别,⽽partiition by则不会减少原表中的⾏数。
恰如窗⼝函数的组成,它同时具有分组和排序的功能,且不减少原表的⾏数。
OVER 关键字表⽰把函数当成窗⼝函数⽽不是聚合函数。
SQL 标准允许将所有聚合函数⽤做窗⼝函数,使⽤ OVER 关键字来区分这两种⽤法。
2、窗⼝函数——开窗OVER 关键字后的括号中经常添加选项⽤以改变进⾏聚合运算的窗⼝范围。
如果 OVER 关键字后的括号中的选项为空,则窗⼝函数会对结果集中的所有⾏进⾏聚合运算。
分析函数 over(partition by 列名 order by 列名 rows between 开始位置 and 结束位置)为什么叫开窗呢?因为在over()括号中的,partition() 函数可以将查询到的数据进⾏单独开⼀个窗⼝处理。
譬如,查询每个班级的学⽣的排名情况,查询每个国家的历年⼈⼝等,诸如此类,都是在查询到的每⼀个班级、每⼀个国家中都开⼀个窗⼝,单独去执⾏命令。
Oracle分析函数
Oracle分析函数Oracle分析函数一——函数列表SUM :该函数计算组中表达式的累积和MIN :在一个组中的数据窗口中查找表达式的最小值MAX :在一个组中的数据窗口中查找表达式的最大值AVG:用于计算一个组和数据窗口内表达式的平均值。
COUNT :对一组内发生的事情进行累积计数-------------------------------------------------------------------------------------------------RANK :根据ORDER BY子句中表达式的值,从查询返回的每一行,计算它们与其它行的相对位置DENSE_RANK :根据ORDER BY子句中表达式的值,从查询返回的每一行,计算它们与其它行的相对位置FIRST :从DENSE_RANK返回的集合中取出排在最前面的一个值的行LAST :从DENSE_RANK返回的集合中取出排在最后面的一个值的行FIRST_VALUE :返回组中数据窗口的第一个值LAST_VALUE :返回组中数据窗口的最后一个值。
LAG :可以访问结果集中的其它行而不用进行自连接LEAD :LEAD与LAG相反,LEAD可以访问组中当前行之后的行ROW_NUMBER:返回有序组中一行的偏移量,从而可用于按特定标准排序的行号----------------------------STDDEV :计算当前行关于组的标准偏离STDDEV_POP:该函数计算总体标准偏离,并返回总体变量的平方根STDDEV_SAMP:该函数计算累积样本标准偏离,并返回总体变量的平方根VAR_POP :该函数返回非空集合的总体变量(忽略null)VAR_SAMP :该函数返回非空集合的样本变量(忽略null)VARIANCE :如果表达式中行数为1,则返回0,如果表达式中行数大于1,则返回VAR_SAMPCOVAR_POP :返回一对表达式的总体协方差COVAR_SAMP:返回一对表达式的样本协方差CORR :返回一对表达式的相关系数-------------------------------------------------------------------------------------------------CUME_DIST :计算一行在组中的相对位置NTILE :将一个组分为"表达式"的散列表示PERCENT_RANK:和CUME_DIST(累积分配)函数类似PERCENTILE_DISC:返回一个与输入的分布百分比值相对应的数据值PERCENTILE_CONT:返回一个与输入的分布百分比值相对应的数据值RATIO_TO_REPORT:该函数计算expression/(sum(expression))的值,它给出相对于总数的百分比REGR_ (Linear Regression) Functions:这些线性回归函数适合最小二乘法回归线,有9个不同的回归函数可使用----------------------------CUBE :按照OLAP的CUBE方式进行数据统计,即各个维度均需统计ROLLUP :SELECTdepartment_id,manager_id,employee_id,first_name||' '||last_name employee_name,hire_date,salary,job_idFROM employeesORDER BY department_id,hire_dateOracle分析函数二——函数用法Oracle分析函数实际上操作对象是查询出的数据集,也就是说不需二次查询数据库,实际上就是oracle实现了一些我们自身需要编码实现的统计功能,对于简化开发工作量有很大的帮助,特别在开发第三方报表软件时是非常有帮助的。
oracle的分析函数和开窗函数over()
oracle的分析函数和开窗函数over()⼀什么是分析函数1 概念 分析函数是Oracle专门⽤于解决复杂报表统计需求的功能强⼤的函数,它可以在数据中进⾏分组然后计算基于组的某种统计值,并且每⼀组的每⼀⾏都可以返回⼀个统计值。
2 和聚合函数的区别普通的聚合函数⽤group by分组,每个分组返回⼀个统计值,⽽分析函数采⽤partition by分组,并且每组每⾏都可以返回⼀个统计值。
3 开窗函数开窗函数指定了函数所能影响的窗⼝范围,也就是说在这个窗⼝范围中都可以受到函数的影响,有些分析函数就是开窗函数。
4 分析函数语法function_name (<argument>,<argument>...)OVER(<PARTITION-Clause><ORDER-BY-Clause><Windowing-Clause>)语法解释:1. function_name:对窗⼝中的数据进⾏操作,Oracle常⽤的分析函数有(这⾥就列举了⼀些常⽤的,其实有很多)①聚合函数sum:⼀个组中数据累积和min:⼀个组中数据最⼩值max:⼀个组中数据最⼤值avg:⼀个组中数据平均值count:⼀个组中数据累积计数②排名函数 row_number( ):返回⼀个唯⼀的值,当碰到相同数据时,排名按照记录集中记录的顺序依次递增。
rank( ):返回⼀个唯⼀的值,当碰到相同的数据时,此时所有相同数据的排名是⼀样的,同时会在最后⼀条相同记录和下⼀条不同记录的排名之间空出排名。
dense_rank( ):返回⼀个唯⼀的值,当碰到相同数据时,此时所有相同数据的排名都是⼀样的,同时会在最后⼀条相同记录和下⼀条不同记录的排名之间紧邻递增。
2. over:关键字,⽤于标识分析函数3. Partition-Clause:分区⼦句,根据分区表达式的条件逻辑将单个结果集分成N组格式: partition by...... 4. Order-by-Clause:排序⼦句,⽤于对分区中的数据进⾏排序格式:order by......5. Windowing-Clause:窗⼝⼦句,⽤于定义function在其上操作的⾏的集合,即function所影响的范围格式:order by字段名 range|rows between边界规则1 AND边界规则2边界规则的取值如下表所⽰:可取值说明CURRENT ROW当前⾏N PRECEDING前N⾏UNBOUNDED PRECEDING⼀直到第⼀条记录N FOLLOWING后N⾏UNBOUNDED FOLLOWING⼀直到最后⼀条记录 注意:RANGE表⽰按照值的范围进⾏范围的定义,⽽ROWS表⽰按照⾏的范围进⾏范围的定义⼆分析函数和开窗函数实例1 创建表格并插⼊数据--创建表格create table student(name varchar2(20),city varchar2(20),age int,salary int)--插⼊数据INSERT INTO student(name,city,age,salary) VALUES('Kebi','JiangSu',20,3000);INSERT INTO student(name,city,age,salary) VALUES('James','ChengDu',21,4000);INSERT INTO student(name,city,age,salary) VALUES('Denglun','BeiJing',22,3500);INSERT INTO student(name,city,age,salary) VALUES('Yangmi','London',21,2500);INSERT INTO student(name,city,age,salary) VALUES('Nana','NewYork',22,1000);INSERT INTO student(name,city,age,salary) VALUES('Sunli','BeiJing',20,3000);INSERT INTO student(name,city,age,salary) VALUES('Dengchao','London',22,1500);INSERT INTO student(name,city,age,salary) VALUES('Huge','JiangSu',20,2800);INSERT INTO student(name,city,age,salary) VALUES('Pengyuyan','BeiJing',24,4500);INSERT INTO student(name,city,age,salary) VALUES('Baoluo','London',25,8500);INSERT INTO student(name,city,age,salary) VALUES('Huting','ChengDu',25,3000);INSERT INTO student(name,city,age,salary) VALUES('Hurenxiang','JiangSu',23,2500);表格创建完后,查看表格中的内容2 聚合函数和开窗函数①单⼀的聚合函数count 案例:如果要求出student表中⼀共多少⼈select count(name) from student得到的结果 从上表中看出,得到的结果是⼀个值,即为student表中⼀共12个⼈②聚合函数count和开窗函数over( )的联合使⽤ 案例:如果查询每个⼯资⼩于4000元的员⼯信息(姓名,城市以及⼯资),并在每⾏中都显⽰所有⼯资⼩于4000元的员⼯个数 第⼀种实现⽅式:通过⼦查询实现select name,city ,salary,(select count(salary) from student where salary <4000) ⼯资⼩于4000⼈数from studentwhere salary <4000第⼆种实现⽅式:开窗函数over( )实现select name, city, salary,count(*) over()from studentwhere salary <4000解释⼀下:开窗函数count(*)over( )是对查询结果的每⼀⾏都返回所有符合条件⾏的条数;over关键字后的括号中的选项为空,则开窗函数会对结果集中的所有⾏进⾏聚合运算;over关键字后的括号中的选项为不为空,则按照括号中的范围进⾏聚合运算。
Oracle分析函数与分组关键字的用法
Oracle分析函数与分组关键字的用法以下是我以前工作中做报表常用的几个函数,在此分享一下,希望对大家有帮助。
(一)分析函数●row_numberPurposeROW_NUMBER is an analytic function. It assigns a unique number to each row to which it is applied (either each row in the partition or each row returned by the query), in the ordered sequence of rows specified in the order_by_clause, beginning with 1.You cannot use ROW_NUMBER or any other analytic function for expr. That is, you can use other built-in function expressions for expr, but you cannot nest analytic functions.按部门分组后根据工资排序,序号rn特征:连续、无并列select t.*, row_number() over(partitionby t.deptno orderby sal desc) rn from emp t;●rankPurposeRANK calculates the rank of a value in a group of values. Rows with equal values for the ranking criteria receive the same rank. Oracle then adds the number of tied rows to the tied rank to calculate the next rank. Therefore, the ranks may not be consecutive numbers.•As an aggregate function, RANK calculates the rank of a hypothetical row identified by the arguments of the function with respect to a given sort specification. The arguments of the function must all evaluate to constant expressions within each aggregate group, because they identify a single row within each group. The constant argument expressions and the expressions in the ORDER BY clause of the aggregate match by position. Therefore, the number of arguments must be the same and their types must be compatible.As an analytic function, RANK computes the rank of each row returned from a query with respect to the other rows returned by the query, based on the values of the value_exprs in the order_by_clause.按部门分组后根据工资排序,序号rn特征:不连续、有并列select t.*, rank() over(partitionby t.deptno orderby sal desc) rn from emp t;dense_rankPurposeDENSE_RANK computes the rank of a row in an ordered group of rows. The ranks are consecutive integers beginning with 1. The largest rank value is the number of unique values returned by the query. Rank values are not skipped in the event of ties. Rows with equal values for the ranking criteria receive the same rank.As an aggregate function, DENSE_RANK calculates the dense rank of a hypothetical row identified by the arguments of the function with respect to a given sort specification. The arguments of the function must all evaluate to constant expressions within each aggregate group, because they identify a single row within each group. The constant argument expressions and the expressions in the order_by_clause of the aggregate match by position. Therefore, the number of arguments must be the same and types must be compatible.As an analytic function, DENSE_RANK computes the rank of each row returned from a query with respect to the other rows, based on the values of the value_exprs in the order_by_clause.按部门分组后根据工资排序,序号rn特征:连续、有并列select t.*, dense_rank() over(partitionby t.deptno orderby sal desc) rn from emp t;(二)分组函数根据查询结果观察三者的区别,grouping sets用起来更灵活。
Oracle10gR2分析函数(中英对照版)
Orac c le100gR2分析函函数(Translat t ing By ca a izhuoyi)说明明:1、 2、 1 A Ana agg callewind calc of ro 分析分组一行物理Ana ORD befo in th 分析有wh 数只Ana repo 分析anaana原文中底色原文中淡蓝色nalytic Fu alytic function regate funct ed a window dow of rows culations for ows or a log 析函数通过将组返回多行值行都对应有一理行进行度量alytic functio DER BY claus ore the analy he select list 析函数是查询here ,gro 只能用于sel alytic functio orting aggre 析函数通常用alytic_fun alytic_fun 为黄的部分翻色字体的文字unctionsns compute tions in that w and is def s is defined. the current ical interval 将行分组后,再值。
分析函数根一个在行上滑量,也可以使ns are the la se. All joins ytic function t or ORDER B 询中除需要在up by ,和h ect 列或ord ns are comm gates.用于计算数据nction::=nction([ 翻译存在商榷字,不宜翻译an aggrega they return fined by the The window row. Window such as tim 再计算这些分根据analyt 滑动的窗口。
该使用逻辑区间ast set of op and all WHER s are proces BY clause.在最终处理的o having 子句der by 子句monly used 据累积值,数=arguments 榷之处,请大大家踊跃提意意见;译,保持原样样。
ORACLE_分析函数大全
ORACLE_分析函数大全1.SUM:计算指定列的总和。
用法:SUM(column) OVER (PARTITION BY expression ORDER BY expression)2.AVG:计算指定列的平均值。
用法:AVG(column) OVER (PARTITION BY expression ORDER BY expression)3.COUNT:计算指定列的记录数。
用法:COUNT(column) OVER (PARTITION BY expression ORDER BY expression)4.MAX:计算指定列的最大值。
用法:MAX(column) OVER (PARTITION BY expression ORDER BY expression)5.MIN:计算指定列的最小值。
用法:MIN(column) OVER (PARTITION BY expression ORDER BY expression)6.FIRST_VALUE:计算指定列的第一个值。
用法:FIRST_VALUE(column) OVER (PARTITION BY expression ORDER BY expression)ST_VALUE:计算指定列的最后一个值。
用法:LAST_VALUE(column) OVER (PARTITION BY expression ORDER BY expression)8.LEAD:返回指定行后的值。
用法:LEAD(column, offset, default) OVER (PARTITION BY expression ORDER BY expression)G:返回指定行前的值。
用法:LAG(column, offset, default) OVER (PARTITION BY expression ORDER BY expression)10.RANK:计算指定列的排名(相同值有相同的排名,相同排名后续排名跳过)。
Oracle10gR2分析函数(中英对照版)
Oracle10gR2分析函数(中英对照版)Orac c le100gR2分析函函数(Translat t ing By ca a izhuoyi)说明明:1、 2、 1 A Ana agg callewind calc of ro 分析分组⼀⾏物理Ana ORD befo in th 分析有wh 数只Ana repo 分析anaana原⽂中底⾊原⽂中淡蓝⾊nalytic Fu alytic function regate funct ed a window dow of rows culations for ows or a log 析函数通过将组返回多⾏值⾏都对应有⼀理⾏进⾏度量alytic functio DER BY claus ore the analy he select list 析函数是查询here ,gro 只能⽤于sel alytic functio orting aggre 析函数通常⽤alytic_fun alytic_fun 为黄的部分翻⾊字体的⽂字unctionsns compute tions in that w and is def s is defined. the current ical interval 将⾏分组后,再值。
分析函数根⼀个在⾏上滑量,也可以使ns are the la se. All joins ytic function t or ORDER B 询中除需要在up by ,和h ect 列或ord ns are comm gates. ⽤于计算数据nction::=nction([ 翻译存在商榷字,不宜翻译an aggrega they return fined by the The window row. Window such as tim 再计算这些分根据analyt 滑动的窗⼝。
该使⽤逻辑区间ast set of op and all WHER s are proces BY clause.在最终处理的o having ⼦句der by ⼦句monly used 据累积值,数=arguments 榷之处,请⼤⼤家踊跃提意意见;译,保持原样样。
oracle 分析函数应用
FROM employees
WHERE department_id < 100 ORDER BY department_id, salary;
2008-08-30
例子:下例中计算每个员工按部门分区再按薪水排序,依次出现的序列号(注意与 DENSE_RANK函数的区别)
SELECT d.department_id,
st_name,
e.salary, DENSE_RANK() OVER(PARTITION BY e.department_id ORDER BY e.salary) as drank
2008-08-30
10
8. RANK
功能描述:根据ORDER BY子句中表达式的值,从查询返回的每一行,计算它们与其它行 的相对位置 语法:RANK ( ) OVER ( [query_partition_clause] order_by_clause )
例子:下例中计算每个员工按部门分区再按薪水排序,依次出现的序列号(注意与 DENSE_RANK函数的区别)
2
Oracle 分析函数
Oracle 分析函数,在一般的应用系统开发时,使用较少,主 要集中报表开发,数据仓库应用中; 对于一些语句中使用分析函数,可以达到事半功倍的效果; Oracle的分析函数功能强大,可以用于SQL的优化,往往用 普通的SQL需要好几次表扫描的,用了分析函数后可以一句话解决;
2008-08-30
7
5. ROLLUP(CUBE)
功能描述:分组小计及汇总 语法:{ ROLLUP | CUBE } ( grouping_expression_list ) 例子:
ORACLE10g分析函数剖析
Oracle 分析函数(10G)一、Oracle分析函数简介1、分析函数,最早是从ORACLE8.1.6开始出现的,它的设计目的是为了解决诸如“累计计算”,“找出分组内百分比”,“前-N条查询”,“移动平均数计算”"等问题。
其实大部分的问题都可以用PL/SQL解决,但是它的性能并不能达到你所期望的效果。
分析函数是SQL言语的一种扩充,它并不是仅仅试代码变得更简单而已,它的速度比纯粹的SQL或者PL/SQL更快。
现在这些扩展已经被纳入了美国国家标准化组织SQL委员会的SQL规范说明书中。
2、在日常的生产环境中,我们接触得比较多的是OLTP系统(即OnlineTransaction Process),这些系统的特点是具备实时要求,或者至少说对响应的时间多长有一定的要求;其次这些系统的业务逻辑一般比较复杂,可能需要经过多次的运算。
比如我们经常接触到的电子商城。
在这些系统之外,还有一种称之为OLAP的系统(即Online Aanalyse Process),这些系统一般用于系统决策使用。
通常和数据仓库、数据分析、数据挖掘等概念联系在一起。
这些系统的特点是数据量大,对实时响应的要求不高或者根本不关注这方面的要求,以查询、统计操作为主。
Oracle分析函数,主要用于OLAP的系统中二、Oracle分析函数原理1、分析函数通过将行分组后,再计算这些分组的值。
它们与聚集函数不同之处在于能够对每一个分组返回多行值。
分析函数根据analytic claues(分析子句)将行分组,一个分组称为:一个窗口(可通过Windowsing Clause子句进行控制),并通过分析语句定义,对于每一行都对应有一个在行上滑动的窗口。
该窗口确定当前行的计算范围。
窗口大小可以用多个物理行(例如:rowid实际编号)进行度量,也可以使用逻辑区间进行度量,比如时间。
2、分析函数是查询中除需要在最终处理的order by 子句之外最后执行的操作。
oracle分析函数
Analytic Functions for Oracle8iAn Oracle Technical White Paper October 1999IntroductionThe SQL language, while extremely capable in many areas, has never provided strong support for analytic tasks. Basic business intelligence calculations such as moving averages, rankings and lead/lag comparisons have required extensive programming outside of standard SQL, often with performance challenges. Oracle8i Release 2 provides a powerful new family of functions which address this longstanding need. These functions are referred to as analytic functions, since they are useful in all types of analyses. The analytic functions enable enhanced performance and higher developer productivity. In addition, the functions are now under review by ANSI for addition to the SQL standard during 2000.Four Analytic FamiliesOracle has created four families of analytic functions, each of which contains several functions:•Ranking family - This family supports business questions like “show the top 10 and bottom 10 salesperson per each region” or “show, for each region, salespersons that make up 25% of the sales”. The functions examine the entire output before producing an answer. Oracle provides RANK, DENSE_RANK, PERCENT_RANK, CUME_DIST and NTILE functions.•Window Aggregate family - This family addresses questions like “show the 13-week moving average of a stock price” or “show cumulative sum of sales per each region.” The new features provide moving and cumulative processing for all the SQL aggregate functions including AVG, SUM, MIN, MAX, COUNT, VARIANCE and STDDEV.•Reporting Aggregate family - One of the most common types of calculations is the comparison of a non-aggregate value to an aggregate value. All percent-of-total and market share calculations require this processing. The reporting aggregate family makes these sort of calculations simple: it lets users place values calculated at different aggregation levels on the same row. Without needing a join operation, users can now compare aggregate values to the detail rows. The new family provides reporting aggregate processing for all SQL functions including AVG, SUM, MIN, MAX, COUNT, VARIANCE and STDDEV.•LAG/LEAD family - Studying change and variation is at the heart of analysis. Necessarily, this involves comparing the values of different rows in a table. While this has been possible in SQL, usually through self-joins, it has not been efficient or easy to formulate. The LAG/LEAD family enables queries to compare different rows of a table simply by specifying an offset from the current row.Note that the analytic functions are intended to supplement the power of the relational database platform for decision support processing: they are not intended to supplant the role of specialized OLAP environments. Any OLAP product, such as Oracle Express, can leverage the power of the analytic functions to enhance its query performance.BenefitsThe analytic functions enhance both database performance and developer productivity. They are valuable for all types of processing, ranging from interactive decision support to batch report jobs. Corporate developers andindependent software vendors alike will be able to take advantage of the features. Here are key benefits provided by the new functions:•Improved Query Speed - The processing optimizations supported by these functions enable significantly better query performance. Actions which before required self-joins or complex procedural processing may now be performed in native SQL. The performance enhancements enabled by the new functions enhance query speeds for Oracle’s Express system and other ROLAP products.•Enhanced Developer Productivity - The functions enable developers to perform complex analyses with much clearer and more concise SQL code. Tasks which in the past required multiple SQL statements or the use of procedural languages can now be expressed using single SQL statements. The new SQL is quicker to formulate and maintain than the older approaches, resulting in greater productivity.•Minimized Learning Effort - Through careful syntax design, the analytic functions minimize the need to learn new keywords. The syntax leverages existing aggregate functions, such as SUM and AVG, so that these well-understood keywords can be used in extended ways.•Standardized Syntax - As part of the ANSI SQL standard, the new functions will be attractive for independent software vendors: they will have an incentive to adjust their products to take advantage of the new functions.Oracle is working with vendors of query, reporting and OLAP products to assist them in exploiting analytic functions. In the past, several database vendors have offered proprietary extensions in the same areas as these functions. However, those extensions did not achieve major market share, and few software vendors adjusted their products to support them. In contrast, the new analytic functions will be supported by a large number of independent software vendors. See Reference 1 for the ANSI proposal document.ConceptsTo perform their operations, the analytic functions add several new elements to SQL processing. These elements build on existing SQL to allow flexible and powerful calculation expressions. Here are the essential concepts used in the analytic functions:•Processing Order - Query processing using analytic functions takes place in three stages. First, all joins, WHERE, GROUP BY and HAVING clauses are performed. Second, the result set is made available to the analytic functions, and all their calculations take place. Third, if the query has an ORDER BY clause at its end, the ORDER BY is processed to allow for precise output ordering. The processing order is shown in Figure 1.Figure 1 - Processing order for Analytic functions•Result Set Partitions - The analytic functions allow users to divide query result sets into ordered groups of rows called partitions. Note that the term "partitions" used with analytic functions is unrelated to Oracle's table partitionsfeature. Throughout this paper, we use "partitions" only in the meaning related to analytic functions. Partitions are created after the groups defined with GROUP BY clauses, so any aggregate results such as SUM's and AVG's are available to them. Partition divisions may be based upon any desired columns or expression. A query result set may have just one partition holding all the rows, a few large partitions, or many small partitions holding just a few rows each.•Window - For each partition, a sliding window of data is defined. The window determines the range of rows used to perform the calculations for the "current row" (defined in the next bullet). Window sizes can be based on either a physical number of rows or a logical interval such as time. The window has a starting row and an ending row.Depending on its definition, the window may move at one or both ends. For instance, a window defined for a cumulative sum function would have its starting row fixed at the first row of its partition, and its ending row would slide from the starting point all the way to the last row of the partition. In contrast, a window defined for a moving average would have both its starting and end points slide so that they maintained a constant physical or logical range.A window can be set as large as all the rows in a partition. At the other extreme, it could be just a sliding set of tworows within a partition. Users can specify a window containing a constant number of rows, or a window containing all rows where a column value is in a specified range. Users can also define a window to hold all rows where a date value falls within the prior month.•Current Row - Each calculation performed with an analytic function is based on a current row within a window. The current row serves as the reference point determining the start and end of the window. For instance, a centered moving average calculation could be defined with a window that holds the current row, the 5 preceding rows and the6 rows ahead of it. This would create a sliding window of 12 rows, as shown in Figure 2.Figure 2 - Partition with a sliding window for a 12-row centered moving averageFeatures and ExamplesThis section describes the key features of the new functions and provides basic examples. Since this paper is a descriptive and conceptual discussion of the functions, it does not include syntax diagrams. However, it does offer many practical cases to show the value of the new functions. Most of the examples involve data from a sales table, where each row contains detail or aggregated sales data.Ranking familyFunctions in the ranking family compute the rank of a row with respect to other rows in the dataset based on the values of a set of measures. To meet the many needs of analytic processing, the ranking family offers several different functions. For all these functions, the following capabilities are available:•The ability to rank data based on multiple expressions. For example, users can rank on the combination of sales, profit, and seniority. This orders the data by sales, profit and seniority and then assigns ranks. The ability to rank on multiple expressions minimizes tie ranks.•The ranking order, ascending or descending, can be specified for each expression used in the function.•NULLs can be forced to rank last or first regardless of the direction of ordering (ascending versus descending).•Rank functions are reset on the boundary of each partition.•Ranks may be reset at the group boundaries produced by the GROUP BY CUBE and ROLLUP clauses.The following sections present key points about each function of the ranking family. Note that the RANK function, presented next, is covered in greater detail than the others. The concepts presented for RANK apply to the other functions and are not repeated for them.RANK FunctionThe RANK function produces an ordered ranking of rows starting with a rank of one. Users specify an optional PARTITION clause and a required ORDER BY clause. The PARTITION keyword is used to define where the rank resets. The specific column which is ranked is determined by the ORDER BY clause. If no partition is specified, ranking is performed over the entire result set. RANK will assign a rank of 1 to the smallest value unless descending order is used. The following example ranks salesmen for each region based on their sales amount.SELECT sales_person, sales_region, sales_amount,RANK() OVER (PARTITION BY s_region ORDER BY s_amount DESC) FROM Sales_table;sales_person sales_region sales_amount rankAdams East1001Baker East992Connors East893Davis East754Edwards West741Fitzhugh West662Gariabaldi West453when the region changes.Ranking using aggregate functions:In this example, the query contains a GROUP BY clause, and the rank is based on the sum of the sales for each product. This query ranks products within each region by their dollar sales:SELECT r_regionkey, p_productkey, sum(s_amount),RANK() OVER (PARTITION BY r_regionkeyORDER BY sum(s_amount) DESC)AS rank_of_product_per_regionFROM product, region, salesWHERE r_regionkey = s_regionkey AND p_productkey = s_productkeyGROUP BY r_regionkey, p_productkey;Rankings with different boundaries in a single query:A single query block can contain more than one ranking function each working on different groups (i.e., reset on different boundaries). The following query ranks products based on their dollar sales within each region(rank_of_product_per_region) and over all regions (rank_of_product_total).SELECT r_regionkey, p_productkey, sum(s_amount),RANK() OVER (PARTITION BY r_regionkeyORDER BY sum(s_amount) DESC)AS rank_of_product_per_region,RANK() OVER (ORDER BY sum(s_amount) DESC)AS rank_of_product_totalFROM product, region, salesWHERE r_regionkey = s_regionkey AND p_productkey = s_productkeyGROUP BY r_regionkey, p_productkeyORDER BY r_regionkey;r_regionkey p_productkeys_amount rank_of_product_per_regionrank_of_product_totaleast shoes13011east jackets9524east shirts8036east sweaters7547east t-shirts60511east ties50612east pants20714west shoes10012west jackets9923west t-shirts8935west sweaters7547west shirts7547west ties66610west pants45713Note that the two rightmost columns calculate two different rankings: the rank_of_product_total is a ranking across all rows, while the rank_of_product_per_region resets when region changes to west. Also note how tie values are handled in the ranking. Treatment of ties will be discussed in the section on the DENSE_RANK function.Per Cube- and Rollup- group Ranking:RANK can be reset based on the groupings provided by CUBE or ROLLUP operator introduced in Oracle8i. When using CUBE/ROLLUP, the boundaries between groups are found by the GROUPING(x) function which returns ‘0’if x is a part of the grouped columns and ‘1’ if it is a subtotal. GROUPING() flags can be used in the PARTITION BY clause to trigger resetting. For example, the following query ranks products and regions in their respective groupings. The rows which have a key value equal to null are subtotals. For clarity in the output table, where groups change the rows are separated by thicker bars.SELECT r_regionkey, p_productkey, sum(s_amount),RANK() OVER (PARTITION BY GROUPING(r_regionkey),GROUPING(p_productkey)ORDER BY sum(s_amount) DESC) AS rank_per_cubeFROM product, region, salesWHERE r_regionkey = s_regionkey AND p_productkey = s_productkeyGROUP BY CUBE(r_regionkey, p_productkey)ORDER BY GROUPING(r_regionkey), GROUPING(p_productkey), r_regionkey;r_regionkey p_productkey s_amount rank_per_cubeeast shoes1301east jackets5012east shirts806east sweaters757east t-shirts6011east ties954east pants2014west shoes1002west jackets993west shirts895west sweaters757west t-shirts757west ties6610west pants4513east null5102west null5491null shoes2301null jackets1495null shirts1692null sweaters1504null t-shirts1356null ties1613null pants657null null10591subtotals across all products by region, the seven rows with region equal to null are subtotals across all regions by product, and the last segment is the grand total.DENSE_RANK FunctionThe DENSE_RANK function is identical to the RANK function except for the way it handles tie values. When rankings include tie values, the key question is: What rank should we assign to the next value after the ties? Should the rank increase by one, or should there be a jump in the ranks corresponding to the number of tie values? The RANK function creates a jump in the ranks after a tie, using this formula:Next rank after a tie, using RANK = tied rank + number of tied valuesThus, if we have two rows tied for a rank of three, the next rank assigned will be five, rather than four. This means that in any ranked list where there are ties, the RANK function will have gaps in its sequence.In contrast, the DENSE_RANK function will not skip any rank values when assigning ranks: if a tie is encountered, it uses this formula:Next rank after a tie, using DENSE_RANK = tied rank + 1A rank computed with DENSE_RANK has no gaps in its sequence. Note also that the DENSE_RANK's highest value equals the count of distinct values in each partition. The table below highlights the difference in the two functions' output.person amount RANK DENSE_RANKAdams10011Baker10011Connors8932Davis7543Edwards7543Fitzhugh6664Garibaldi4575CUME_DIST (Cumulative Distribution) FunctionThe cumulative distribution function computes the relative position of a value relative to the other values in its partition. CUME_DIST is defined as the fraction of the rows, in the partition of a given row, that come before or are ties with the current value. It returns the results as a decimal value between zero and one, excluding zero and including one. The results of a CUME_DIST function are often called the percentile values. Default order is ascending, meaning that the lowest value in a partition gets the lowest CUME_DIST.To assign CUME_DIST to products per region based on their sales, we would use the following query. The "sales" data table in this example holds detailed sales data by product within region. To calculate our desired results, we need to sum the detailed records.SELECT r_regionkey, p_productkey, SUM(s_amount) as s_amount,CUME_DIST() OVER(PARTITION BY r_regionkey ORDER BY SUM(s_amount))AS cume_dist_per_regionFROM region, product, salesWHERE r_regionkey = s_regionkey AND p_productkey = s_productkeyGROUP BY r_regionkey, p_productkey ORDER BY r_regionkey, s_amount DESC;r_regionkey p_productkey s_amount cume_dist_per_regioneast shoes130 1.00east jackets950.86east shirts800.71east sweaters750.57east t-shirts600.43east ties500.29east pants200.14west shoes100 1.00west jackets990.86west t-shirts890.71west sweaters750.43west shirts750.43west ties660.29west pants450.14PERCENT_RANK FunctionPERCENT_RANK is very similar to CUME_DIST, but it uses rank values rather than row counts in its numerator. Therefore it returns the rank of a value relative to a group of values. The function is available in many popular spreadsheets. PERCENT_RANK of a row is calculated as:(rank of row in its partition - 1) / (number of rows in the partition - 1)PERCENT_RANK returns values in the range zero to one. The first row in any set ordered ASCENDING will have a PERCENT_RANK of zero.NTILE FunctionNTILE allows easy calculation of tertiles, quartiles, deciles and other common summary statistics. This function divides an ordered partition into a specified number of groups called "buckets" and assigns a bucket number to each row in the partition. Ntile is a very useful calculation since it lets users divide a data set into fourths, thirds, and other groupings.The buckets are calculated so that each bucket has exactly the same number of rows assigned to it or at most 1 row more than the others. For instance, if we have 100 rows in a partition and ask for an NTILE function with four buckets, 25 rows will be the assigned a value of one, 25 rows will have value two, and so on.If the number of rows in the partition do not divide evenly (without remainder) into the number of buckets, then the number of rows assigned per bucket will differ by one at most. The extra rows will be distributed one per bucket starting from the lowest bucket number. For instance, if there are 103 rows in a partition which has an NTILE(5) function, the first 21 rows will be in the first bucket, the next 21 in the second bucket, the next 21 in the third bucket, the next 20 in the fourth bucket and the final 20 in the fifth bucket.Using subqueries, we can construct complex rankings where resetting is done, not on column boundaries, but by numeric ranges. This allows us to rank data within buckets. For example, the following query divides sales per product into four buckets based on the sales amount and ranks each product in each quartile:SELECT p_productkey, sum_s_amount, 4_tile,RANK() OVER (PARTITION BY 4_tile ORDER BY sum_s_amount DESC)AS rank_in_quartileFROM(SELECT p_productkey,sum(s_amount) as sum_s_amount,NTILE(4) (ORDER BY sum(s_amount) DESC) as 4_tileFROM product, salesWHERE p_productkey = s_productkeyGROUP BY p_productkey);p_productkey sum_s_amount4_tile rank_in_quartileshoes10011jackets9012shirts8921sweaters7522shirts7531ties6632pants4541socks4541Note: The sub-query in this example is used only for the sake of clarity. We could have replaced "4_tile" directly in the PARTITION BY clause of the RANK function to avoid using the sub-query.ROW_NUMBER FunctionThe ROW_NUMBER function assigns a unique number, in the sequence defined by the ORDER BY clause, to each row within the partition. It is useful wherever rows need to be numbered on a partition-wise basis. Note that the ROW_NUMBER function has no connection with the ROWNUM pseudocolumn. The row numbers always start with one. Here is an example:SELECT p_productkey, s_amount,ROW_NUMBER() (ORDER BY s_amount DESC NULLS LAST) AS srnum FROM product, salesWHERE p_productkey = s_productkeyp_productkey s_amount srnumshoes1001jackets902shirts893t-shirts844sweaters755jeans756ties757pants698belts569socks4510suits NULL11Window Aggregate FamilyWindow aggregate functions operate on an ordered set of rows and for each row they return an aggregate value for a window rooted at that row. The window can be unbounded, as in “give me cumulative sum of my bank deposits over time,” or bounded as in “give me moving average of stock over 91 days.” As with the ranking functions, we can first partition the data into groups, for example accounts, and then apply windowing function within each group as in “for each account, give me cumulative sum of bank deposits ordered by day.”Although we refer to window aggregate functions as a family, there are only a few new keywords defined with them. It is also useful to view the family as a new syntax and behavior for existing aggregate functions, SUM, AVG, MIN, MAX, STDDEV, VARIANCE and COUNT.The following capabilities are available for window aggregates:•All the existing aggregate functions can be used as window aggregates: SUM, AVG, MIN, MAX, STDDEV, VARIANCE and COUNT. Likewise, the new regression analysis functions introduced along with the analytic functions are supported. These include VAR_SAMP, VAR_POP, STDDEV_SAMP, STDDEV_POP, COVAR_SAMP, COVAR_POP, REGR_SLOPE, REGR_INTERCEPT, REGR_R2, REGR_AVGX, REGR_AVGY, REGR_COUNT, REGR_SXX, REGR_SXY AND REGR_SYY.•Each window aggregate function can have an optional clause defining the size of the window. If the clause is missing, then the window is unbounded: it starts at the beginning of the table or partition and ends at the end of the table or partition.•Each window function can have its own ordering clause like those available in the rank family.•Window aggregate functions are reset on the boundary of each partition.•Each window aggregate function can order data based on multiple expressions. The ability to rank on multiple expressions minimizes ties and allows repeatable results.•Ascending or descending order can be specified for each expression used in the function.•NULLs can be forced to rank last or first regardless of the direction of ordering (ascending or descending).Defining window sizesWindows can be defined in several different ways, depending on user needs:•Physical - A physical window size may be expressed in terms of rows, e.g., “return the moving average of the last 10 rows before the current row.” Along with numeric expressions specifying the size of the window, there are several keywords available:CURRENT ROW - specifies that window starts or ends at the current rowUNBOUNDED PRECEDING - specifies that the window starts at the first row of the partition.UNBOUNDED FOLLOWING - specifies that the window ends at the last row of the partition.•Time Interval - A logical window size may be expressed in terms of time interval, e.g., “return the moving average of stock price for the 91 days before the current row's date.” Another example is “return the moving average of temperature over the 100 seconds prior to the current row's time.” In these cases the ordering expression shall evaluate to a date expression so that time interval can be applied to it.•Value Range - A logical window size may be expressed in terms of a difference between the current value and a value preceding it in the ordered sequence. As long as that difference is less than a given size, then the row belongs to the window. For example, in the query “Return the average height of all people who are no more than three inches shorter than the person in the current row,” the window includes all rows where “Height BETWEEN current.Height AND current.Height - 3”.The following is an example of a cumulative balance per account ordered by deposit date.SELECT Acct_number, Trans_date, Trans_amount,SUM(Trans_amount) OVER (PARTITION BY Acct_numberORDER BY Trans_date ROWS UNBOUNDED PRECEDING) AS Balance FROM LedgerORDER BY Acct_number, Trans_date;Acct_number Trans_date Trans_amount Balance738291998-11-01113.45113.45738291998-11-05-52.0161.44738291998-11-1336.2597.69829301998-11-0110.5610.56829301998-11-2132.5543.11829301998-11-29-5.0238.09Example of Moving Aggregate function:Here is an example of a time-based window that shows, for each transaction, the moving average transaction amount for the preceding 30 days of transactions:SELECT Trans_date, Trans_amount,AVG (Trans_amount) OVER (ORDER BY Trans_date)RANGE INTERVAL '30' DAY PRECEDING)FROM Ledger;Example of Centered Aggregate function:Calculating windowing aggregate functions centered around the current row, for example the centralized average, is relatively straight forward. This example computes a centered moving average for the current row, including 2 rows prior to the current row and 2 rows following the current row.SELECT st_timekey,AVG(s_amount) OVER(ORDER BY st_timekeyROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING) AS cavgFROM stockWHERE st_name = ‘orcl’;FIRST_VALUE and LAST_VALUE FunctionsThe FIRST_VALUE and LAST_VALUE functions help users derive full power and flexibility from the window aggregate functions. They allow queries to select the first and last rows from a window. These rows are specially valuable since they are often used as the baselines in calculations. For instance, with a partition holding sales data ordered by day, we might ask "How much was each day's sales compared to the first sales day (FIRST_VALUE) of the period?" Or we might wish to know, for a set of rows in increasing sales order, "What was the percentage size of each sale in the region compared to the largest sale (LAST_VALUE) in the region?"Reporting Aggregate familyComparing values at one level of aggregation to values at a different level of aggregation is an essential part of analytical work. For instance, we may need to compare employee salary with average salary per department, and we may want to find the average salary excluding his salary. In turn, we may need to compare average salary by department to average salary across all departments. The reporting aggregate family enables a row to include values at differing levels of aggregation without the need for self-joins. Unlike SQL aggregate functions which return one row per group, for example sum per group, the reporting aggregate functions work at the window level: they will return the same aggregate value for every row in a window.The reporting aggregate functions will compute either an aggregate of an entire window or an aggregate of the entire window excluding the current row. They are analogous to the Window aggregate functions in most ways, and have similar capabilities.Example of Reporting Aggregate function:Consider the question "for each product, find the region in which it had maximum sales." Using the reporting aggregate functions, we can find the answer using an in-line view:SELECT s_productkey, s_regionkey, sum_s_amountFROM(SELECT s_productkey, s_regionkey, SUM(s_amount) as sum_s_amount,MAX(SUM(s_amount)) OVER(PARTITION BY s_productkey) as max_sum_s_amountFROM salesGROUP BY s_productkey, s_regionkey)WHERE sum_s_amount = max_sum_s_amount;The in-line view returns the following:s_productkey s_regionkey sum_s_amount max_sum_s_amountjackets west9999jackets east5099pants east2045pants west4545shirts east6080shirts west8080shoes west100130shoes east130130sweaters west7575sweaters east7575ties east9595ties west6695The outer query would return the information needed:s_productkey s_regionkey sum_s_amountjackets west99pants west45shirts west80sweaters west75sweaters east75shoes east130ties east95Without using the reporting aggregate functions, this query could only have been done by first computing the value of maximum sales per region for each product, and then joining it with an with the sales table. The new technique saves the cost of processing the join.。
- 1、下载文档前请自行甄别文档内容的完整性,平台不提供额外的编辑、内容补充、找答案等附加服务。
- 2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
- 3、如文档侵犯您的权益,请联系客服反馈,我们会尽快为您处理(人工客服工作时间:9:00-18:30)。
Orac c le100gR2分析函函数(Translat t ing By ca a izhuoyi)说明明:1、 2、 1 A Ana agg callewind calc of ro 分析分组一行物理Ana ORD befo in th 分析有wh 数只Ana repo 分析anaana原文中底色原文中淡蓝色nalytic Fu alytic function regate funct ed a window dow of rows culations for ows or a log 析函数通过将组返回多行值行都对应有一理行进行度量alytic functio DER BY claus ore the analy he select list 析函数是查询here ,gro 只能用于sel alytic functio orting aggre 析函数通常用alytic_fun alytic_fun 为黄的部分翻色字体的文字unctionsns compute tions in that w and is def s is defined. the current ical interval 将行分组后,再值。
分析函数根一个在行上滑量,也可以使ns are the la se. All joins ytic function t or ORDER B 询中除需要在up by ,和h ect 列或ord ns are comm gates.用于计算数据nction::=nction([ 翻译存在商榷字,不宜翻译an aggrega they return fined by the The window row. Window such as tim 再计算这些分根据analyt 滑动的窗口。
该使用逻辑区间ast set of op and all WHER s are proces BY clause.在最终处理的o having 子句der by 子句monly used 据累积值,数=arguments 榷之处,请大大家踊跃提意意见;译,保持原样样。
ate value bas multiple row analytic_w determines w sizes can me.sed on a gro ws for each g_clause . F s the range be based ooup of rows. group. The g For each row of rows use on either a p They differ group of row w, a sliding d to perform hysical num fromws is m the mber分组的值。
它tic_claue 该窗口确定当进行度量,它们与聚集函es 将行分组,当前行的计算比如时间。
函数不同之处一个分组称算范围。
窗口处在于能够对称为一个窗口口大小可以用对每一口。
每多个perations per RE , GROUP B ssed. There rformed in a BY , and HAV efore, analyti a query exce VING clause ic functions ept for the fin es are compl can appear nal leted only order by 子句都要在处理句中。
子句之外最后理分析函数之后执行的操作之前进行计算作。
所有连接。
因此,分析接和所析函to compute 据移动值、数s ]) cumulative,数据中间值, moving, ce e ntered, and ,和输出集合合报表。
O VER (ana a lytic_cla a use)ana[ q[ oquePAR { | }ordORD { e [ A [ N [, ].winalytic_cla query_part order_by_c ery_partit RTITION BY value_ex ( value_der_by_cla ause::=DER [ SIBL expr | pos ASC | DESC NULLS FIRS , { expr [ ASC | [ NULLS ...ndowing_cl tition_cl clause [ tion_clau Yxpr[, valu _expr[, vaause::= LINGS ] B sition | C ]ST | NULL | positio DESC ]FIRST | lause ::=lause ] windowing use::=ue_expr ]alue_expr BYc_alias }LS LAST ]on | c_al NULLS LAS=g_clause ...r ]... )lias }ST ] ] ]{ R { B { | | } AN { | | }| { | | }}The以下1.1 Spe disc Ana1.2 Ana any deterem ROWS | RAN BETWEENUNBOUNDE CURRENT value_exNDUNBOUNDE CURRENT value_ex UNBOUNDECURRENT value_ex e semantics 下各节将讨论analytic_fu ecify the nam cussion of se alytic_fun Arguments alytic functio nonnumeric ermines the aining argum NGE }ED PRECEDI ROWxpr { PREC ED FOLLOWI ROWxpr { PREC ED PRECED ROWxpr PRECED of this synta 论分析函数语unctionme of an ana emantics).nction 指定sns take 0 to c datatype th argument w ments to tha INGCEDING | INGCEDING | DING DINGax are discus 语法的语义。
alytic functio 定分析函数的3 argument hat can be im with the high at datatype. FOLLOWING FOLLOWING ssed in the s on (see the li 的名称。
(请参ts. The argu mplicitly con est numeric The return t G }G } sections tha isting of ana 参阅以下语义uments can b nverted to a c precedence type is also at follow. alytic functio 义论述中的分be any num numeric dat e and implic that datatyp ns following 分析函数列表eric datatyp tatype. Orac citly converts pe, unlessg this 表) pe or cle s theotherwise noted for an individual function.分析函数可取0-3个参数。
参数可以是任何数字类型或是可以隐式转换为数字类型的非数字类型。
Oracle根据最高数字优先级别确定函数参数,并且隐式地将需要处理的参数转换为适合的数据类型。
函数的返回类型也为数字类型,除非此函数另有说明。
See Also:"Numeric Precedence" for information on numeric precedence and Table 2-10, "Implicit Type Conversion Matrix" for more information on implicit conversion请参阅:"Numeric Precedence"可获取数字优先级的相关信息,参阅表2-10—隐式类型转换矩阵,可获取隐式转换的更多信息。
1.3analytic_clauseUse OVER analytic_clause to indicate that the function operates on a query result set. That is, it is computed after the FROM, WHERE, GROUP BY, and HAVING clauses. You can specify analytic functions with this clause in the select list or ORDER BY clause. To filter the results of a query based on an analytic function, nest these functions within the parent query, and then filter the results of the nested subquery.Analytic_clause用以指明分析函数操作的是一个查询结果集。
也就是说分析子句在from,where,group by,和having子句之后才开始计算。
因此在选择列或order by子句中可以使用带有分析子句的分析函数。
为了过滤分析函数的查询结果,可以将分析函数嵌套在父查询中,然后过滤嵌套子查询的查询结果。
Notes on the analytic_clause: The following notes apply to the analytic_clause: Analytic_clause注意事项:使用分析子句注意事项如下:•You cannot specify any analytic function in any part of the analytic_clause.That is, you cannot nest analytic functions. However, you can specify an analyticfunction in a subquery and compute another analytic function over it.Analytic_clause的任何部分都不能指定任何分析函数。