
Senior Site Reliability Engineer (ShopeePay) - Sea Labs
- Jakarta
- Permanent employment
- Design, build, and implement high-availability systems.
- Improve system reliability through performance testing and chaos engineering.
- Write high-quality, clean, and maintainable code to automate operational processes.
- Maintain and shorten incident resolution time by improving our observability systems and incident management process.
- Lead multiple initiatives such as new system integration, platform migration, and compliance requirements.
- Mentor junior members of the SRE team.
- Bachelor's degree or above in Computer Science, Engineering, or a related field.
- 4+ years of total working experience, including 2+ years as an SRE, DevOps, or System Engineer.
- Expert in shell scripting and familiar with Python or Go.
- In-depth understanding of Operating Systems, computer networking, and system architecture.
- Experienced in CI/CD development and integration. Familiar with GitLab, Jenkins, etc.
- Familiar with DNS, load balancer, firewall, NAT, etc.
- Familiar with containers and container orchestration, such as Kubernetes (k8s), including the underlying technologies.
- Familiar with other technologies such as storage, message queues (MQ), caching, Elasticsearch, etc.
- Familiar with commonly used databases, such as MySQL and Oracle.
- Familiar with observability systems, including metrics, logging, tracing, and profiling. Experience with Prometheus and Grafana is a plus.
- Have a passion for building highly available, performant systems, and care deeply about the end-user experience.
- Able to respond promptly to handle system-related incidents.
- An effective team player with a passion for mentoring junior members.
- Meticulous and attentive to detail with strong critical thinking, data analytics, and problem-solving capabilities.
- Able to communicate effectively in English both verbally and in writing.
- Leadership experience is a plus.
- Experience in payment-related industries is a plus.