The System Reliability Engineer (SRE) improves and protects the software and systems behind all of T-Mobile's IT services. This includes managing scalability, availability, latency, performance, security, and capacity, and delivering software faster, better, and cheaper. From designing and maintaining CI/CD pipelines to building the next generation of T-Mobile applications on cloud-native platforms, SREs enable a great customer experience and product innovation through continuous improvement of operational support.
Technology and System
Demonstrates fluency in emerging DevOps-centric automation tools and technologies for CI/CD, configuration management, and related tasks in non-production environments.
Performs environment management and automated server (VM) provisioning.
Delivers software to improve the availability, scalability, latency, and efficiency of T-Mobile's services.
Creates, manages, and uses dashboards for continuous monitoring and health checks of applications and the underlying infrastructure, and uses the monitoring feedback to improve the quality of services in non-production environments.
Contributes to future improvements in software delivery processes and operations, e.g., cloud enablement and the use of microservices with containerization.
5+ years of experience in one or more of C, C#, Java, Perl, Python, or Go, or scripting experience in Shell and Perl.
5+ years of experience with Continuous Integration/Continuous Delivery tools such as Jenkins and CloudBees, and with other automation tools.
5+ years of experience with DevOps tools such as Ansible, Chef, and Puppet. Experience with Docker, Kubernetes, and similar tools is preferred.
Experience with an APM tool such as AppDynamics and a logging tool such as Splunk.
Experience working in a cloud environment (public/private).
Experience migrating to cloud or cloud-native environments is preferred.