
Scaling Big Data Mining Infrastructure at Twitter

I almost always enjoy the lessons-learned-style presentations from Twitter’s people. The slides below, by Jimmy Lin and Dmitriy Ryaboy, were presented at Hadoop Summit. Besides the technical and practical details, there are two things that I really like:
DJ Patil: “It’s impossible to overstress this: 80% of the work in any data project is in cleaning the data”
And then the reality check:
Your boss says something vague
You think very hard on how to move the needle
Where’s the data?
What’s in this dataset?
What’s all the f#$#$ crap in the data?
Clean the data
Run some off-the-shelf data mining algorithm…
Productionize, act on the insight
Rinse, repeat

Enjoy!
Original title and link: Scaling Big Data Mining Infrastructure at Twitter (NoSQL database©myNoSQL)