mysqlquery
开发市场调查业务,有一个调查问题表responses (一百万行), 是市场调查的原始数据,包含survey_id(调查表的类型列),response_no(被调查者列),interview_date(调查日期), question_label(问题列), value(回答列), section_unique_id(部门列)等。
每一行对应某个被调查者回答的一个问题和它的调查结果。一个被调查者一次会回答30个问题,所以会产生30行。
另有一个计算公式表 (40行), 是对调查结果的分析计算公式,这些公式都是sql语句。
根据计算公式,产生一个结果表results。
现在的问题是,要设计一些sql语句,调查有多少人的回答是类似如下这样的组合:
1. 回答问题q1,答案是1或8或9
并且
2. 回答问题q2,答案是1或8或9
并且
。。。。。。
最好能用group by section。
比如如下这个例子:
计算公式的说明是这样。
((q2a = 1 or q2a = 8 or q2a = 9) and (q2b = 1 or q2b = 8 or q2b = 9) and (q2c = 1 or q2c = 8 or q2c = 9) and (q2d = 1 or q2d = 8 or q2d = 9) and (q2e = 1 or q2e = 8 or q2e = 9) and (q2f = 1 or q2f = 8 or q2f = 9) and (q2g = 1 or q2g = 8 or q2g = 9) and (q2h = 1 or q2h = 8 or q2h = 9) and (q2i = 1 or q2i = 8 or q2i = 9) and (q5 = 1 or q5 = 8 or q5 = 9) and (q6 = 1 or q6 = 8 or q6 = 9))
我现在写了这样一个mysql语句:
select section_unique_id as section_unique_id, count(*) as hit from responses where
question_label = q2a and value in (1,8,9)
and
(response_no, survey_id, interview_date) in (select distinct response_no, survey_id, interview_date from responses where question_label = q2b and value in (1,8,9))
and
(response_no, survey_id, interview_date) in (select distinct response_no, survey_id, interview_date from responses where question_label = q2c and value in (1,8,9))
and
(response_no, survey_id, interview_date) in (select distinct response_no, survey_id, interview_date from responses where question_label = q2d and value in (1,8,9))
and
(response_no, survey_id, interview_date) in (select distinct response_no, survey_id, interview_date from responses where question_label = q2e and value in (1,8,9))
and
(response_no, survey_id, interview_date) in (select distinct response_no, survey_id, interview_date from responses where question_label = q2f and value in (1,8,9))
and
(response_no, survey_id, interview_date) in (select distinct response_no, survey_id, interview_date from responses where question_label = q2g and value in (1,8,9))
and
(response_no, survey_id, interview_date) in (select distinct response_no, survey_id, interview_date from responses where question_label = q2h and value in (1,8,9))
and
(response_no, survey_id, interview_date) in (select distinct response_no, survey_id, interview_date from responses where question_label = q2i and value in (1,8,9))
and
(response_no, survey_id, interview_date) in (select distinct response_no, survey_id, interview_date from responses where question_label = q5 and value in (1,8,9))
and
(response_no, survey_id, interview_date) in (select distinct response_no, survey_id, interview_date from responses where question_label = q6 and value in (1,8,9))
结果,运行一次,用了12秒,太慢了。
请教mysql高手,有没有办法能加快计算速度。