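To find the median of a set of data (for example, the median income of a region or of a company), we usually break the task into three smaller steps:
1. Sort the data and assign each row its rank among all rows;
2. Find the rank of the median;
3. Find the value at that middle rank.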
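The three steps above can also be sketched outside SQL. The following Python snippet is only an illustration (the function name `median_by_rank` is made up for this example). It ranks from highest to lowest and, for an even number of rows, picks the upper of the two middle values, which matches the `(count(*) + 1) div 2` convention used in the queries of this article.

```python
def median_by_rank(values):
    # Step 1: sort descending and give each value a rank starting at 1.
    ranked = list(enumerate(sorted(values, reverse=True), start=1))
    # Step 2: compute the rank of the middle row, (n + 1) // 2.
    middle_rank = (len(values) + 1) // 2
    # Step 3: return the value that sits at the middle rank.
    for rank, value in ranked:
        if rank == middle_rank:
            return value
    raise ValueError("median of an empty list is undefined")


incomes = [20000, 12000, 10000, 16000, 40000]
print(median_by_rank(incomes))  # 16000
```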
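The example below uses the monthly income of a company's employees to demonstrate some of MySQL's more complex statements.
Method 1
Create the test table
First, create an income table and insert a few test rows: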
create table if not exists `employee` (
  `id` int auto_increment primary key,
  `name` varchar(10) not null default '',
  `income` int not null default '0'
) engine = innodb default charset = utf8;
insert into `employee` (`name`, `income`) values ('麻子', 20000);
insert into `employee` (`name`, `income`) values ('李四', 12000);
insert into `employee` (`name`, `income`) values ('张三', 10000);
insert into `employee` (`name`, `income`) values ('王二', 16000);
insert into `employee` (`name`, `income`) values ('土豪', 40000);
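Completing step 1
Sort the data and assign each row its rank among all rows. The self-join counts, for each row, how many rows have an income at least as high (ties are broken by name), so the highest income gets rank 1. The alias is `rnk` rather than `rank` because RANK is a reserved word in MySQL 8.0:
select t1.name, t1.income, count(*) as rnk
from employee as t1, employee as t2
where t1.income < t2.income
   or (t1.income = t2.income and t1.name <= t2.name)
group by t1.name, t1.income
order by rnk;
With the test data, the query returns:
name  income  rnk
土豪  40000   1
麻子  20000   2
王二  16000   3
李四  12000   4
张三  10000   5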
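To see why counting rows in a self-join produces a rank, the same comparison can be written as a nested loop. This is a Python sketch of the idea, not the MySQL execution plan. The function name `rank_by_counting` is hypothetical, and Python's string comparison stands in for MySQL's collation when ties are broken by name (the test data has no ties, so the difference does not show up here).

```python
# The article's five test rows as (name, income) pairs.
employees = [("麻子", 20000), ("李四", 12000), ("张三", 10000),
             ("王二", 16000), ("土豪", 40000)]


def rank_by_counting(rows):
    ranked = []
    for name1, income1 in rows:
        # Count rows with a higher income, plus rows with the same income
        # whose name compares >= this one (this includes the row itself).
        rank = sum(
            1
            for name2, income2 in rows
            if income1 < income2 or (income1 == income2 and name1 <= name2)
        )
        ranked.append((name1, income1, rank))
    # Equivalent of ordering the result by the rank column.
    return sorted(ranked, key=lambda row: row[2])


for row in rank_by_counting(employees):
    print(row)
```

Every row is compared with every other row, so the work grows with the square of the row count. That is fine for a small table but is worth keeping in mind for large ones.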
Completing step 2
Find the rank of the median:
select (count(*) + 1) div 2 as rnk
from employee;
With five rows in the table, the query returns 3.
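The integer division in `(count(*) + 1) div 2` behaves differently for odd and even row counts. A small Python sketch of the same arithmetic (the helper name `middle_rank` is made up for illustration):

```python
def middle_rank(row_count):
    # Same arithmetic as (count(*) + 1) div 2: integer division, rounding down.
    return (row_count + 1) // 2


for n in (5, 6):
    print(n, middle_rank(n))
# 5 rows -> rank 3, the single middle row.
# 6 rows -> rank 3 as well: ranks 3 and 4 are both "middle",
# and the formula picks the first of the two.
```

For an even number of rows this yields one of the two middle rows rather than their average, so the result is one particular convention for the median.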
Completing step 3
Find the value at the middle rank by combining the two queries above:
select income as median
from (select t1.name, t1.income, count(*) as rnk
      from employee as t1, employee as t2
      where t1.income < t2.income
         or (t1.income = t2.income and t1.name <= t2.name)
      group by t1.name, t1.income
      order by rnk) t3
where rnk = (select (count(*) + 1) div 2 from employee);
With the test data, the query returns a median of 16000.
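For readers without a MySQL server at hand, the combined query can be tried with Python's built-in `sqlite3` module. This is a stand-in under stated assumptions, not the article's setup: SQLite replaces MySQL, the table definition is simplified, and SQLite's integer `/` replaces MySQL's `div`.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table employee ("
    " id integer primary key,"
    " name text not null,"
    " income integer not null)"
)
conn.executemany(
    "insert into employee (name, income) values (?, ?)",
    [("麻子", 20000), ("李四", 12000), ("张三", 10000),
     ("王二", 16000), ("土豪", 40000)],
)

# Same structure as the method 1 query; "/" on integers is integer
# division in SQLite, playing the role of MySQL's "div".
MEDIAN_SQL = """
select income as median
from (
    select t1.name, t1.income, count(*) as rnk
    from employee as t1, employee as t2
    where t1.income < t2.income
       or (t1.income = t2.income and t1.name <= t2.name)
    group by t1.name, t1.income
) as t3
where rnk = (select (count(*) + 1) / 2 from employee)
"""

median = conn.execute(MEDIAN_SQL).fetchone()[0]
print(median)  # 16000
```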
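At this point we have one way to obtain the median of a set of data.
Method 2
Next is another approach, which optimizes the ranking statement by avoiding the self-join.
We already know how to sort a set of rows. In this example: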
select name, income
from employee
order by income desc;
With the test data, the query returns:
name  income
土豪  40000
麻子  20000
王二  16000
李四  12000
张三  10000
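Can we go one step further and add a column to the result that holds each row's rank?
We can do this with three user-defined variables:
The first variable records the income of the current row.
The second variable records the income of the previous row.
The third variable records the rank of the current row.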
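The role of the three variables can be illustrated with an ordinary loop. The Python sketch below assumes the rows arrive already sorted by income in descending order. The function name `dense_rank` is hypothetical, chosen because equal incomes end up sharing the same rank.

```python
def dense_rank(sorted_rows):
    # sorted_rows: (name, income) pairs, already ordered by income descending.
    curr_income = 0   # income of the current row
    prev_income = 0   # income of the previous row
    rank = 0          # rank of the current row
    result = []
    for name, income in sorted_rows:
        curr_income = income
        # The rank only advances when the income differs from the previous row.
        if prev_income != curr_income:
            rank += 1
        result.append((name, income, rank))
        prev_income = curr_income
    return result


rows = [("土豪", 40000), ("麻子", 20000), ("王二", 16000),
        ("李四", 12000), ("张三", 10000)]
for row in dense_rank(rows):
    print(row)
```

Each row is visited once, which is the appeal of this approach compared with comparing every pair of rows.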
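set @curr_income := 0;
set @prev_income := 0;
set @rank := 0;
select `name`,
       @curr_income := income as income,
       @rank := if(@prev_income != @curr_income, @rank + 1, @rank) as rnk,
       @prev_income := @curr_income as dummy
from employee
order by income desc;
With the test data, the expected result is shown below. MySQL does not guarantee the order in which expressions involving user variables are evaluated, so treat it as the intended outcome rather than a guaranteed one:
name  income  rnk  dummy
土豪  40000   1    40000
麻子  20000   2    20000
王二  16000   3    16000
李四  12000   4    12000
张三  10000   5    10000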
Then find the rank of the median and, from it, the median income:
set @curr_income := 0;
set @prev_income := 0;
set @rank := 0;
select income as median
from (select `name`,
             @curr_income := income as income,
             @rank := if(@prev_income != @curr_income, @rank + 1, @rank) as rnk,
             @prev_income := @curr_income as dummy
      from employee
      order by income desc) as t1
where t1.rnk = (select (count(*) + 1) div 2 from employee);
With the test data, the expected result is a median of 16000.
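One caveat is worth checking before reusing method 2. Because equal incomes share a rank, the ranks no longer correspond to row positions once the data contains ties. The Python sketch below imitates the variable-based ranking plus the final `where` filter on made-up data (the function name `median_via_dense_rank` and the sample lists are assumptions for illustration) and compares the outcome with `statistics.median`.

```python
import statistics


def median_via_dense_rank(incomes):
    # Dense ranking: equal incomes share a rank, so a rank is the
    # position of the income among the distinct values, highest first.
    distinct_desc = sorted(set(incomes), reverse=True)
    rank_of = {income: i for i, income in enumerate(distinct_desc, start=1)}
    middle_rank = (len(incomes) + 1) // 2
    # Mirrors filtering on rank = middle rank: may match several rows or none.
    return [income for income in incomes if rank_of[income] == middle_rank]


tied = [100, 100, 100, 50, 10]
print(median_via_dense_rank(tied))  # [10]
print(statistics.median(tied))      # 100

more_ties = [100, 100, 100, 100, 50]
print(median_via_dense_rank(more_ties))  # [] (no row has rank 3)
```

With the article's test data every income is distinct, so both methods agree. Method 1 does not have this issue because it breaks ties by name and therefore gives every row a unique rank.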
We now have two ways to solve the median problem. Hooray!