Support and operations are undergoing a transformation: artificial intelligence is increasingly turning operational tasks into strategic tools. Martin Weber, Director Service Management at diconium, explains how AI is positioning support and operations (S&O) as a true business enabler, significantly increasing efficiency in the process. In this interview, you will learn how AI is already adding value in practice today, from intelligent search and analysis assistants to fully automated incident response. The interview specifically addresses hurdles in data protection in cloud AI, security through SIEM toolkits and SRE (Site Reliability Engineering) principles that enable the transition from reactive to proactive, AI-supported processes.
Martin Weber: AI is driving fundamental change in support and operations. It enables us to anticipate incidents and reduce downtime by shifting from reactive ticket handling to proactive system management. Continuous optimization is automated, and manual effort is massively reduced. This allows S&O to scale without proportionally increasing headcount—and positions it as a true business enabler.
New roles such as AI operations analysts and automation strategists are emerging. AI takes over simple and medium-level tickets, allowing teams to focus on optimization and complex cases. This requires further training at the SRE level. The classic tiered model is being replaced by AI-supported SRE teams.
In terms of processes, AI replaces manual triage with root cause analysis and classification. It becomes the core of value creation – with industrial, AI-based process chains. Knowledge management improves significantly, and relevant information is easier to access. This reduces solution times and increases quality.
Collaboration also benefits: AI provides insights into operational weaknesses and improves coordination with development teams and stakeholders. Real-time metrics help quantify service impacts and shorten innovation cycles.
AI is already proving its value in several key areas. Here are six examples:
Support in particular requires access to highly sensitive data, ranging from infrastructure details to personal customer data. Cloud-based AI solutions pose significant data protection hurdles. US providers are subject to the Cloud Act, which means that data transfers outside the EU cannot be ruled out. Security breaches and non-transparent data use for training purposes are real risks, even with the strictest settings.
My recommendation: Local installations offer clear advantages in terms of security and data protection, as they can be fully controlled. In addition, different AI models can be flexibly combined instead of being tied to a single provider. At diconium, we are pioneers in this field: we offer services based on local LLMs in highly secure environments, and we also implement and operate such LLMs in our customers' private clouds.
What types of threats do companies have to reckon with when operating AI-supported support systems, and how can these risks be minimized?
When operating AI-supported support systems, we have to expect typical threats from software vulnerabilities, imperfect operating processes, and human negligence or even intent. Added to this are previously unknown threats.
Even the best risk management measures cannot rule out a successful attack on an IT system; some residual risk always remains. The key to managing these risks is a complete SIEM (Security Information and Event Management) tool stack. Static code analysis tools cover only a small part of the production environment; a SIEM stack, on the other hand, monitors the entire production environment without gaps. It helps to actively detect malicious activities as soon as they occur, even if attackers exploit unknown vulnerabilities.
Specifically, we rely on tools such as:
These components, combined with strict access controls and data separation, are essential for detecting an attack and minimizing its impact if one does occur. They form the basis for a high level of trust and provide 24/7 global security.
The mindset of operating the system as if it were your own is absolutely critical. When availability, reliability, and security become personal priorities, problems can be fixed at their root before they escalate. This strengthens personal responsibility, continuous risk assessment, and the reduction of technical debt—for stability and trust.
For proactive, AI-supported support, we combine ITIL and DevOps: ITIL provides structure and governance, while DevOps delivers speed through automation. This creates stability without compromising speed – and IT becomes a true business partner.
We rely on comprehensive monitoring, machine learning, and predictive analytics to identify risks early on and take automated countermeasures. Maintenance and vulnerability scans run automatically, minimizing human error. SRE principles ensure the quality of AI automation with specific SLOs, confidence thresholds, full transparency, and feedback loops. Error budgets, verifiability, and bias testing ensure that control and trust are maintained at all times.
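To make the interplay of SLOs, error budgets, and confidence thresholds more concrete, here is a minimal Python sketch. It is an illustration only, not diconium's implementation: the `Slo` class, the `may_auto_remediate` function, and the default values of 0.9 and 0.25 are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Slo:
    """A hypothetical availability SLO over a fixed window of requests."""
    target: float         # e.g. 0.999 means 99.9% of requests must succeed
    total_requests: int
    failed_requests: int

    def error_budget(self) -> float:
        """Number of failed requests the SLO tolerates in this window."""
        return (1.0 - self.target) * self.total_requests

    def budget_remaining(self) -> float:
        """Fraction of the error budget still unspent (negative if overspent)."""
        budget = self.error_budget()
        if budget == 0:
            return 0.0  # a 100% target leaves no budget to spend
        return 1.0 - self.failed_requests / budget


def may_auto_remediate(slo: Slo, model_confidence: float,
                       confidence_threshold: float = 0.9,
                       min_budget_remaining: float = 0.25) -> bool:
    """Allow an automated action only if the model is confident enough
    AND enough error budget is left to absorb a wrong decision.
    Both default thresholds are illustrative assumptions."""
    return (model_confidence >= confidence_threshold
            and slo.budget_remaining() >= min_budget_remaining)


if __name__ == "__main__":
    slo = Slo(target=0.999, total_requests=1_000_000, failed_requests=400)
    print(round(slo.budget_remaining(), 3))
    print(may_auto_remediate(slo, model_confidence=0.95))
```

The design choice here is that automation is gated twice: a low-confidence suggestion goes to a human, and so does any suggestion made while the service has already used up most of its error budget.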
A mature support and operations team, as described in diconium's SRE maturity model, moves from a reactive stance to an optimized, strategic partner.
AI plays a crucial role in the transition from reactive to proactive processes: From Level 2 to 3, we use comprehensive observability tools to identify problems in advance. For the leap from Level 3 to 4, the integration of AI/ML is essential: It enables anomaly detection, automatic ticket triage, and the reduction of correlated alerts. We develop our team and our measures by continuously updating scorecards, setting concrete goals, and consistently expanding automation and the use of higher-quality, predictive AI as soon as trust and data maturity are established. We prioritize systemic changes and always align ourselves with the business to justify investments and manage trade-offs between reliability and feature speed.
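Two of the capabilities named above, anomaly detection and the reduction of correlated alerts, can be sketched in a few lines of Python. This is a deliberately simple stand-in under stated assumptions: a trailing-window z-score takes the place of a trained ML model, and alerts are grouped per service within a fixed time window. The function names, the window sizes, and the alert tuple layout `(timestamp_seconds, service, message)` are hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(values, window=5, z_threshold=3.0):
    """Return indices whose value deviates from the mean of the preceding
    `window` values by more than `z_threshold` standard deviations."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: any change at all counts as a deviation.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > z_threshold:
            anomalies.append(i)
    return anomalies


def group_alerts(alerts, window_seconds=300):
    """Collapse alerts for the same service that arrive within
    `window_seconds` of the first alert in that service's open group."""
    groups = []
    open_group = {}  # service -> index of its most recent group
    for ts, service, message in sorted(alerts):
        idx = open_group.get(service)
        if idx is not None and ts - groups[idx]["first_seen"] <= window_seconds:
            groups[idx]["messages"].append(message)
        else:
            open_group[service] = len(groups)
            groups.append({"service": service,
                           "first_seen": ts,
                           "messages": [message]})
    return groups


if __name__ == "__main__":
    latencies_ms = [100, 102, 99, 101, 100, 250, 101]
    print(find_anomalies(latencies_ms))

    alerts = [(0, "db", "high latency"), (60, "db", "connection errors"),
              (90, "web", "5xx rate"), (1000, "db", "high latency")]
    print(len(group_alerts(alerts)))
```

In practice the grouping step is what reduces noise for the on-call engineer: four raw alerts in the example collapse into three groups, with the two related database alerts reported once.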
AI is fundamentally changing the distribution of operational expertise. Centralized AI agents reduce the need for specialist knowledge in regional teams, which become smaller but more specialized and solve complex problems together with AI.
The classic Tier 1 model will be completely replaced by AI in the future. Handovers between shift and global teams will become more efficient thanks to AI-curated summaries, diagnoses, and proposed solutions. Real-time translations and the AI's contextual memory will further improve collaboration.
We see a clear shift from reactive operations to AI monitoring and system governance. Operational excellence is increasingly defined by data quality, model performance, and collaboration—not location. Self-healing scripts further reduce the need for reactive ops teams. At diconium, we rely on a hybrid support model with a centralized 24/7 help desk and specialized experts onshore and offshore to optimize the efficiency gains achieved through the use of local AI assistance systems.