The importance of tracing – SD Times


Tracing, according to Lightstep CTO and co-founder Daniel ‘Spoons’ Spoonhower, provides context, which serves as the backbone for understanding what’s happening when an application’s performance degrades. “Tracing is really just understanding causal relationships in your software,” he explained. “It sounds obvious in retrospect, but using causal relationships to form the way that data is collected, analyzed and stored, and then the way it’s presented back to users is critical,” especially today, when the ownership of services is not all in one set of hands and mountains of data from these disparate parts keep streaming in.


Spoonhower described how Google, where he previously worked, managed its observability data. “Google took an approach where they sampled very heavily and threw out 99.99% of the data, and that was their way of managing the infrastructure costs of that observability system. For the rest of the world outside of Google, there’s a lot of value in a lot of that data. What Lightstep does is to look at 100% of the data and then use internal models of what’s going on to figure out which of that data to keep and which to throw away, again, partly based on tracing. Seeing 100% of the data lets us sort of dig into one of these arbitrary tens of thousands of different causes that might be at fault.

“You can think of what we’re doing is we’re taking 100% of the data and building a global model of what’s happening in the application, and then we’re using that global model to inform and refine the collection and analysis process,” he went on. “From a technologist point of view, the way that people often build these systems, they think of it as a linear pipeline, like every other ETL solution out there. ‘I’ve got telemetry data; I’ve got to run it through Spark streaming.’ But I think the causal interconnectedness of the data means that you need a more global view of what’s going on. It’s just not enough to run it through a linear pipeline.”
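
The contrast Spoonhower draws, between discarding a fixed share of data up front and inspecting everything before deciding what to keep, can be illustrated with a minimal Python sketch. This is not Lightstep's or Google's implementation: the `Trace` fields, the `slow_ms` threshold, and both function names are hypothetical, and a simple error-or-slow rule stands in for the "internal models" he mentions.

```python
import random
from dataclasses import dataclass


@dataclass
class Trace:
    """Hypothetical summary of one request's trace."""
    trace_id: int
    duration_ms: float
    has_error: bool


def head_sample(traces, keep_rate, rng):
    """Keep a fixed fraction of traces without looking at their contents."""
    return [t for t in traces if rng.random() < keep_rate]


def inspect_then_keep(traces, slow_ms, baseline_rate, rng):
    """Look at every trace, keep all anomalous ones, and sample the rest.

    The error-or-slow rule is an assumption made for illustration only.
    """
    kept = []
    for t in traces:
        interesting = t.has_error or t.duration_ms >= slow_ms
        if interesting or rng.random() < baseline_rate:
            kept.append(t)
    return kept


# Illustrative data: 100,000 ordinary traces plus two anomalous ones.
rng = random.Random(42)
demo = [Trace(i, 20.0, False) for i in range(100_000)]
demo[1234] = Trace(1234, 20.0, True)       # a failed request
demo[56789] = Trace(56789, 4000.0, False)  # a very slow request

uniform = head_sample(demo, keep_rate=0.0001, rng=rng)  # drops 99.99%
informed = inspect_then_keep(demo, slow_ms=1000.0, baseline_rate=0.0001, rng=rng)

print("uniform sample size:", len(uniform))
print("informed sample size:", len(informed))
print("anomalies kept by informed pass:",
      sum(1 for t in informed if t.has_error or t.duration_ms >= 1000.0))
```

With uniform sampling, whether the failed or slow request survives is a matter of chance; the second function always retains both because it examines each trace before deciding.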
