Nisum built a comprehensive SRE Portal Framework.
Clients now have a reliable and scalable SRE portal framework, which measures service availability using machine learning in addition to providing end-to-end visibility across the production ecosystem, leading to:
- High availability of services at "five nines" (99.999% uptime), which corresponds to less than 5.26 minutes of downtime per year
- Increased cross-functional collaboration with shared accountability and ownership
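The five-nines figure above can be checked with a little arithmetic. The following is a minimal sketch, assuming an average year of 365.25 days; the function name `downtime_budget_minutes` is illustrative and not part of the framework described here.

```python
# Minutes in an average year (365.25 days, which averages in leap years).
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes


def downtime_budget_minutes(availability_pct: float) -> float:
    """Return the maximum yearly downtime, in minutes, allowed by an availability target."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


# Compare three, four, and five nines.
for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {downtime_budget_minutes(target):.2f} min/year")
```

Each additional nine cuts the yearly downtime allowance by a factor of ten, which is why five nines leaves only about five minutes per year.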
A Fortune 500 premium goods retailer lacked a comprehensive framework to measure its digital operations and end-to-end visibility against business and technical KPIs, specifically on eCommerce systems. This led to:
- Impacted customers because:
  - Churn between business units created downtime in services
  - Manual operations caused a delay in deliveries
- Increased operational expenditures due to a linear increase in team sizes
- Decreased revenue due to downtime
Nisum built a comprehensive SRE Portal Framework to measure everything, understand the current state of Service Level Indicators (SLIs), and define Service Level Objectives (SLOs). As a result, the client is able to balance reliability and scalability against the velocity of feature delivery.
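To make the relationship between an SLI and an SLO concrete, here is a minimal sketch, assuming a request-based availability SLI (good requests divided by total requests). The names `SloReport` and `evaluate_slo` and the sample numbers are hypothetical and are not taken from the portal itself.

```python
from dataclasses import dataclass


@dataclass
class SloReport:
    sli: float               # measured fraction of good requests
    slo: float               # target fraction of good requests
    budget_remaining: float  # unspent share of the error budget (negative if overspent)


def evaluate_slo(good: int, total: int, slo: float) -> SloReport:
    """Compare a measured SLI against its SLO and report the remaining error budget."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 < slo < 1:
        raise ValueError("slo must be strictly between 0 and 1")
    sli = good / total
    allowed_bad = (1 - slo) * total  # failures the SLO tolerates in this window
    actual_bad = total - good        # failures actually observed
    return SloReport(sli=sli, slo=slo, budget_remaining=1 - actual_bad / allowed_bad)


# Hypothetical window: 1,000,000 requests, 500 failures, 99.9% target.
report = evaluate_slo(good=999_500, total=1_000_000, slo=0.999)
print(f"SLI={report.sli:.4%}, budget remaining={report.budget_remaining:.0%}")
```

A remaining budget well above zero suggests room to ship features faster; a budget near or below zero suggests prioritizing reliability work, which is the balance described above.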
- Developed a Spring Boot web application with a responsive Angular front end and built-in schedulers, database connectors, and REST API integration capabilities to:
  - Fetch source data from disparate internal and external systems across the enterprise
  - Provide unified 360-degree visibility into business and technical operations
- Analyzed KPIs, which in turn helped SRE engineers connect the dots and cut response (MTTA) and resolution (MTTR) times, leading to:
  - A decrease in the operational budget
  - Better focus on automating menial tasks
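The response and resolution metrics mentioned above can be derived from incident timestamps. This is a minimal sketch, assuming MTTA means mean time to acknowledge and MTTR means mean time to resolve, both measured from when the incident was opened (definitions of MTTR vary). The `Incident` record and the sample data are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    opened: datetime        # when the incident was raised
    acknowledged: datetime  # when an engineer first responded
    resolved: datetime      # when service was restored


def mean_delta(deltas: list[timedelta]) -> timedelta:
    """Average a non-empty list of durations."""
    if not deltas:
        raise ValueError("at least one incident is required")
    return sum(deltas, timedelta()) / len(deltas)


def mtta(incidents: list[Incident]) -> timedelta:
    """Mean time from open to acknowledgement."""
    return mean_delta([i.acknowledged - i.opened for i in incidents])


def mttr(incidents: list[Incident]) -> timedelta:
    """Mean time from open to resolution."""
    return mean_delta([i.resolved - i.opened for i in incidents])


# Hypothetical sample data for illustration only.
incidents = [
    Incident(datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 10, 4), datetime(2024, 1, 5, 10, 40)),
    Incident(datetime(2024, 1, 9, 14, 0), datetime(2024, 1, 9, 14, 6), datetime(2024, 1, 9, 15, 0)),
]
print("MTTA:", mtta(incidents))
print("MTTR:", mttr(incidents))
```

Tracking these two averages per period is one simple way to see whether response and resolution times are actually trending down.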
“Amazing to see the result of measuring everything over time to achieve operational excellence.”
-VP of Site Reliability Engineering
Feel free to contact us for more information on how Nisum can drive results for your company.