Command-line Tools can be 235x Faster than your Hadoop Cluster

I love stories like Adam’s. Too many people propose complex solutions to simple problems because they get excited about building something cool and lose sight of their actual purpose: building a useful tool. I see this all the time at work. In this case, another developer used a Big Data approach for a Small Data problem, and Adam shows how a much simpler (but less cool?) approach would have got it done much, much faster. Had this other developer understood the underlying technologies, and taken some time to size up the problem instead of optimizing prematurely, he might have gotten there himself.
