This is an old story about Java, but the concepts apply to whatever you might be using. Around 2008, I got a call that a nationally recognized car rental firm was struggling with the performance of their quoting system. A team had spent about a month trying to get the application to work, with no luck. They would run a few transactions through and watch as CPU and memory spiked, never getting more than a few requests to respond in any way.
The first thing I realized was that their basic install was broken. None of the team actually knew anything about application development; they were fundamentally all systems administrators, focused on incantations to make the operating system and network stack hum. First lesson: Make sure you are actually performing a valid test in a valid environment.
Once we started generating traffic against the application, a whole slew of apparently innocuous errors appeared. Back in the day, I’d walk into a typical environment and be greeted by thousands of seemingly innocent exceptions piling up in some log file. JNDI (the Java Naming and Directory Interface), which for the uninitiated you can think of as a configuration store, was a frequent culprit, as someone would forget, or more likely fail to even attempt, to configure it correctly.
This quoting system was quite simple, as most of its functionality was about queuing up calls to send to the real action, somewhere back on a mainframe. With so few actual transaction types, it was even easier to see that more time was being spent waiting on JNDI than on actual functionality. Because JNDI lookups were broken, 250 ms was being added to every transaction, waiting for a lookup to fail and an exception to be thrown. To make it even worse, the value being referenced through an expensive lookup was really just a static value anyway.
Two lines of code later, I’d shaved that time off every transaction. Isn’t it amazing how fixing minor flaws can yield major benefits?
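For the curious, here is a minimal sketch of what a fix along these lines might look like. It is not the original code, which I no longer have to show: the class name, the JNDI name, and the default value are all hypothetical, and I am assuming the value was a plain string. The idea is simply to pay for the lookup (or its failure) once, when the class loads, instead of on every transaction.

```java
import javax.naming.InitialContext;
import javax.naming.NamingException;

// Hypothetical holder class; all names here are illustrative only.
final class QuoteConfig {
    // Assumed JNDI name and fallback value, invented for this sketch.
    static final String JNDI_NAME = "java:comp/env/quote/regionCode";
    static final String DEFAULT_REGION = "US";

    // Before: every caller pays for a lookup, even one that always
    // fails and falls back to the default after an exception.
    static String regionCodeSlow() {
        try {
            return (String) new InitialContext().lookup(JNDI_NAME);
        } catch (NamingException e) {
            return DEFAULT_REGION;
        }
    }

    // After: the "two lines". Resolve once at class initialization,
    // then hand back the cached value on every call.
    private static final String REGION_CODE = regionCodeSlow();

    static String regionCode() {
        return REGION_CODE;
    }

    public static void main(String[] args) {
        // With no JNDI provider configured, the lookup fails and the
        // default is used; later calls never touch JNDI again.
        System.out.println(regionCode());
    }
}
```

Since the value in this story never actually changed, an even blunter option would have been a plain constant with no lookup at all. Caching on first use keeps the door open for an environment where JNDI is configured correctly.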