AI's Promise: Unlocking Value with Observability and Trust

Debanjan Saha, CEO of DataRobot and a seasoned technology leader with experience at Google, AWS, and IBM, argues that effective AI decision-making hinges on visibility and trust. While generative AI has captured widespread attention, enterprises have lagged in adopting AI at scale, deterred by complex models and opaque decision-making.

AI observability, which draws on the established concept of software observability, offers a solution. It allows organisations not only to monitor and measure AI applications but also to mitigate risks proactively, enabling them to unlock AI's transformative potential and drive business value.

Defining AI Observability

Observability is an engineering principle: inferring the internal state and performance of a system or application by analysing its external outputs. Traditionally applied to intricate computing systems, including cloud infrastructure, enterprise applications, and business processes, the practice has now been extended to the realm of AI.

Bryan Cantrill, a prominent figure in observability engineering, highlighted the problem in 2006: "We have built mind-bogglingly complicated systems that we cannot see, allowing glaring performance problems to hide in broad daylight in our systems."

Today's advanced AI and machine learning models are similarly complex, making it challenging to pinpoint and address issues, especially at scale. As organisations integrate AI into their operations, they face the daunting task of managing hundreds of models, assets, and environments.

Without insights into model performance, organisations risk encountering:

Deterioration in Model Output Quality: AI models can suffer a gradual or sudden decline in quality due to model drift, which occurs when the data, or the relationships within it, change over time. ChatGPT, a popular generative AI tool, suffered a well-documented drop in performance last winter, prompting its developers to acknowledge the issue. A minimal drift check is sketched after this list.

Safety and Compliance Risks: AI models introduce new cybersecurity risks, both as novel attack vectors and as potential channels for unintentional leaks of sensitive information. The Open Worldwide Application Security Project (OWASP) has published a Top 10 of risks for applications built on large language models, underscoring how hard it is to monitor these emerging vulnerabilities across a vast system landscape.

Cost Overruns: AI applications, particularly generative AI, demand immense computational power, and their resource consumption and associated costs are difficult to predict. Without effective controls, organisations can suffer significant cost overruns.
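To make the model-drift risk above concrete, here is a minimal sketch, assuming Python with NumPy, of one widely used drift measure: the Population Stability Index (PSI). The bin count and the 0.2 alert threshold are illustrative rules of thumb, not part of any particular vendor's method.

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """PSI between a reference sample (e.g. training data) and a
    live sample of the same numeric feature."""
    # Bin edges come from the reference distribution; the outer edges
    # are widened so out-of-range production values are still counted.
    edges = np.histogram_bin_edges(expected, bins=bins)
    edges[0], edges[-1] = -np.inf, np.inf
    expected_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    actual_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Clip away zeros so the log term below is always defined.
    expected_pct = np.clip(expected_pct, 1e-6, None)
    actual_pct = np.clip(actual_pct, 1e-6, None)
    return float(np.sum((actual_pct - expected_pct) * np.log(actual_pct / expected_pct)))

# Example: a feature whose mean has shifted in production.
rng = np.random.default_rng(0)
training = rng.normal(0.0, 1.0, 10_000)
production = rng.normal(0.5, 1.0, 10_000)
psi = population_stability_index(training, production)
if psi > 0.2:  # a common rule-of-thumb threshold for significant drift
    print(f"Drift detected: PSI = {psi:.3f}")
```

PSI is convenient for monitoring because it needs only the feature values, not ground-truth labels, so it can run continuously against live traffic.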

These risks, often overlooked amidst the AI hype, pose a significant threat to the success of AI initiatives. Fortunately, the industry has extended observability concepts to the realm of AI, providing crucial capabilities for real-time remediation.

Why Observability Is Vital for AI Success

AI observability goes beyond merely keeping systems online. When implemented effectively, it provides benefits that can enhance the entire AI strategy:

Confidence in the AI Program: Many organisations have been hesitant to deploy AI in production due to uncertainty about their ability to mitigate potential downsides. Boston Consulting Group and MIT found that 71% of organisations struggle to manage AI-related risks. Observability can instil the confidence to move forward with AI, knowing that issues can be identified and addressed proactively.

Trust in AI Solutions: High-profile AI failures, including inaccurate or inappropriate outputs and data leaks, have raised concerns about the trustworthiness of AI applications. Observability enables monitoring and prevention of issues such as accuracy problems, prompt injection, and toxicity.

Sustaining AI Value: Observability helps organisations secure a lasting return on their AI investments. It allows ROI to be calculated and resource consumption to be tracked, ensuring that deployment costs stay aligned with benefits; a minimal sketch of such tracking follows this list. When issues arise, such as an unexpected spike in resource consumption, they can be promptly remediated to restore ROI.
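As an illustration of the ROI and consumption tracking described above, here is a minimal sketch in Python. The per-token prices, request-log schema, and monthly benefit figure are hypothetical placeholders, not actual provider pricing or any specific platform's API.

```python
from dataclasses import dataclass

# Hypothetical prices per 1,000 tokens; real prices vary by provider and model.
PRICE_PER_1K_INPUT = 0.01
PRICE_PER_1K_OUTPUT = 0.03

@dataclass
class RequestLog:
    input_tokens: int
    output_tokens: int

def request_cost(r: RequestLog) -> float:
    """Compute cost of a single generative AI request from its token counts."""
    return (r.input_tokens / 1000) * PRICE_PER_1K_INPUT + \
           (r.output_tokens / 1000) * PRICE_PER_1K_OUTPUT

def monthly_roi(requests: list[RequestLog], estimated_benefit: float) -> float:
    """ROI = (benefit - cost) / cost over one reporting period."""
    total_cost = sum(request_cost(r) for r in requests)
    return (estimated_benefit - total_cost) / total_cost

# Example: 50,000 requests averaging 600 input / 250 output tokens,
# set against an assumed monthly benefit of 5,000 in the same currency.
logs = [RequestLog(600, 250)] * 50_000
print(f"cost={sum(request_cost(r) for r in logs):,.2f}  "
      f"ROI={monthly_roi(logs, estimated_benefit=5_000):.1%}")
```

In practice the figures would come from billing exports and request logs; the point is that cost and benefit become numbers a dashboard can watch rather than a surprise on the invoice.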

Best Practices for Maximising AI Value

Implementing AI observability is not solely a matter of tools; it requires a holistic approach that takes in people and processes as well.

Here are some best practices to preempt threats and issues that could undermine the value of AI programmes:

Reading the Fine Print: Services like ChatGPT and Google’s Bard are trained on user input data. It is crucial to understand their terms and conditions and establish appropriate guidelines before allowing employees to use them.

Defining Processes Upfront: A defined process for reviewing new AI projects ensures that all stakeholders are involved and that data usage aligns with best practices, protecting ROI and supporting responsible data management.

Prioritising Critical Areas: Ensure that crucial areas for maximising AI value, such as cost, ROI, accuracy, and reputational risk, are consistently tracked and monitored.

Going Beyond Observability: Observability alone is insufficient; organisations also need the ability to intervene and remediate issues in real time, so that the platform moves seamlessly from identifying an issue to resolving it. A minimal sketch of such a guardrail follows.
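To show what moving from observation to remediation can look like, here is a minimal sketch with hypothetical model functions and an illustrative accuracy floor; a production platform would wire the same logic into its alerting and deployment pipeline rather than inline application code.

```python
from typing import Callable

ACCURACY_FLOOR = 0.85  # illustrative service-level threshold

def guarded_predict(primary: Callable[[str], str],
                    fallback: Callable[[str], str],
                    live_accuracy: float,
                    prompt: str) -> str:
    """Serve from the primary model while its monitored accuracy is
    healthy; otherwise remediate by switching to a fallback."""
    if live_accuracy < ACCURACY_FLOOR:
        # Remediation step: degrade gracefully instead of continuing
        # to serve low-quality answers while the primary is fixed.
        return fallback(prompt)
    return primary(prompt)

# Example with stand-in models.
primary = lambda p: f"[primary] answer to: {p}"
fallback = lambda p: f"[fallback] answer to: {p}"
print(guarded_predict(primary, fallback, live_accuracy=0.78,
                      prompt="Summarise this contract"))
```

The design choice worth noting is that remediation is automatic and graceful: traffic shifts to a known-safe fallback the moment the monitored metric breaches the floor, buying time to retrain or roll back the primary model.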

Tailoring the observability stack and governance programme to an organisation's specific culture, size, and industry is paramount.

Conclusion

AI's transformative potential is greater than ever. With observability and a robust approach to managing AI investments, organisations can cultivate trust, control risks, and unlock lasting value from AI.