Command-line Tools can be 235x Faster than your Hadoop Cluster

I love stories like Adam’s. Too many people propose complex solutions to simple problems because they get excited about building something cool and lose sight of their actual purpose: building a useful tool. I see this all the time at work. In this case, another developer used a Big Data approach for a Small Data problem, and Adam shows how a much simpler, if less cool(?), one would have gotten it done much, much faster. Had this other developer understood the underlying technologies and taken some time for premature optimization, he might have gotten there himself.