Site Reliability Engineer II (Remote)
Agile Lab is a company founded in 2014 with the mission to create value for its customers in data-intensive environments through customisable solutions that establish performance-driven processes, sustainable architectures and automated platforms based on data governance best practices.
Having delivered over 100 successful Elite Data Engineering initiatives, we have used this experience to create Witboost: a modular, technology-agnostic platform that enables modern organisations to discover, value and produce their data in both traditional environments and fully compliant Data Mesh architectures.
With a highly skilled team of over 260 data engineers based in Europe, Agile Lab helps organisations with their data-driven transformation.
Take a look at our handbook to discover our core values and processes.
💼 The opportunity:
We are looking for a Site Reliability Engineer II (SRE II) to join our growing team. You will play a key role in maintaining the reliability, observability, and operational efficiency of enterprise-level distributed systems.
In this role, you’ll coordinate a small technical team (3–4 people) in managing microservices in complex production environments. You will be involved in monitoring, incident management, release coordination, and performance tuning, with a strong focus on OpenShift platforms.
You’ll also work closely with multiple cross-functional teams to ensure high availability and performance of our cloud-native services.
This role includes on-call availability.
💰 Gross annual salary (RAL):
38.5K-48.5K
💻 Responsibilities:
Ensure high reliability of microservices running in OpenShift environments
Lead and coordinate a technical team of 3–4 engineers for operational excellence
Manage incident resolution and ticketing workflows via ServiceNow
Collaborate with development teams to drive performance optimization and tuning
Design, configure and maintain monitoring dashboards (Grafana, Prometheus, etc.)
Coordinate with the Service Control Room to maintain effective alerting and response
Oversee release processes of new features, hotfixes, and updates in production
🛠️ Requirements:
Degree in Computer Engineering, Computer Science, or a related field
At least 2 years of proven experience in Application Maintenance Services (AMS)
In-depth knowledge of OpenShift and microservices in cloud-native environments
Ability to technically and operationally lead a team of 3–4 people
Experience in release management, monitoring, and incident resolution
Excellent communication and cross-functional coordination skills
Strong initiative, operational autonomy, and results-oriented mindset
Fluency in Italian (mandatory requirement)
Monitoring & Observability: Grafana, Prometheus, Kibana, Jaeger, Datadog, OpenTelemetry
Cloud/DevOps: OpenShift, GitLab, Jenkins
Data & Messaging: Kafka, MongoDB, Ignite
Ticketing & ITSM: ServiceNow
🙌🏻 We offer:
Fully remote or hybrid work from our offices in Milan, Turin, Padua, Bologna, Catania and Rende;
Real work-life balance;
Monthly training budget (time and money);
Support from a buddy during your first week of work;
Benefits and corporate welfare programs: company prizes and welcome pack with all the equipment you need to work;
Agile Nomads Experience: opportunity to work for 2 weeks abroad;
Referral bonus, if you bring people as talented as you;
The opportunity to attend one conference per year;
A company rated 4.8 out of 5 for employee satisfaction on Glassdoor and certified as a Great Place to Work;
Inclusive environment where you can be who you really are;
Stimulating environment oriented to growth, both professional and personal.
😊 How we work:
We don't like hierarchies: we work as a team;
We don't like bureaucracy: we prefer a sense of responsibility;
We like data, of course, and therefore anything that is measurable;
We want to make a positive change in our industry;
Empathy, humility, collaboration, and willingness to challenge ourselves are the basis of our work.
Please note:
Only candidates based in European time zones (CEST or similar) will be considered for this position.
- Department: Engineering
- Role: Application Maintenance
- Remote status: Fully Remote
About Agile Lab
📊⚙️ Shaping Data Engineering Strategy and Platform Enablement for Enterprise Organizations