ENGIE Energy Access is one of the leading Pay-As-You-Go (PAYGo) and mini-grid solutions providers in Africa, with a mission to deliver affordable, reliable and sustainable energy solutions and life-changing services with exceptional customer experience. The company is the result of the integration of Fenix International, ENGIE Mobisol and ENGIE PowerCorner, and develops innovative, off-grid solar solutions for homes, public services and businesses, giving customers and distribution partners access to clean, affordable energy.
The PAYGo solar home systems are financed through affordable installments from $0.19 per day, and the mini-grids foster economic development by enabling productive use of electricity and creating business opportunities for entrepreneurs in rural communities. With over 1,700 employees, operations in 9 countries across Africa (Benin, Côte d’Ivoire, Kenya, Mozambique, Nigeria, Rwanda, Tanzania, Uganda and Zambia), over 1.2 million customers and more than 6 million lives impacted so far, ENGIE Energy Access aims to remain the leading clean energy company, serving millions of customers across Africa by 2025.
We are recruiting to fill the position below:
Job Title: Site Reliability Engineer / System Administrator
Requisition ID: 49513
Location: Lagos
Job type: Full-time
Job Grade: 15
Department: Digital & IT
Reporting line: DevOps Lead
Job Purpose / Mission
We are seeking a talented and experienced System Administrator/Site Reliability Engineer (SRE) to join our dynamic team.
As an SRE, you will play a crucial role in ensuring the reliability, scalability, and performance of our systems and services.
You will collaborate with cross-functional teams to implement and maintain robust infrastructure solutions, focusing on automation, monitoring, and incident response.
The ideal candidate is passionate about optimizing and enhancing system reliability, possesses strong problem-solving skills, and is committed to driving excellence in operational practices.
Responsibilities
Infrastructure Automation:
Develop and maintain automation tools and scripts for provisioning, configuration, and deployment.
Implement infrastructure as code (IaC) practices to ensure consistency and reproducibility.
Monitoring and Incident Response:
Set up and maintain monitoring systems to detect and respond to performance issues and outages.
Participate in on-call rotations, respond promptly to incidents, troubleshoot them, and implement solutions to prevent recurrence.
Performance Optimization:
Optimize system performance through continuous analysis and tuning.
Reliability Engineering:
Implement best practices for reliability, such as error budgeting, SLIs/SLOs, and blameless post-mortems.
Work towards minimizing manual intervention through automation.
System Administration:
Manage and maintain server infrastructure, including installation, configuration, and troubleshooting of operating systems.
Implement and maintain security measures, such as firewalls and intrusion detection systems.
Perform regular system backups and recovery procedures.
Collaboration and Communication:
Collaborate with cross-functional teams to align infrastructure and operational requirements.
Provide technical guidance and support to colleagues in areas related to reliability.
Qualifications
Bachelor's Degree in Computer Science, Information Technology, or a related field.
Proven experience as a Site Reliability Engineer or System Administrator.
Strong Linux and Bash scripting skills.
Proficiency in cloud platforms (e.g., AWS, Azure, GCP, Linode, DigitalOcean).
Experience with containerization and orchestration tools (e.g., Kubernetes, Docker, LXD).
In-depth knowledge of networking, security, and system administration.
Familiarity with infrastructure as code tools (e.g., Terraform, Ansible).
Excellent problem-solving and troubleshooting skills.
Strong communication and collaboration skills.
Preferred Qualifications:
Experience with CI/CD pipelines and related tools.
Knowledge of distributed systems and microservices architecture.
Familiarity with observability tools (e.g., Prometheus, Grafana, ELK stack).
Familiarity with programming languages (e.g., Python, Ruby).