This job has expired, please see additional jobs below
Service Reliability Engineer, Intermediate
Entertainment & Media Industry Company
Sunnyvale, CA, United States
Job Details
Yahoo’s Service Reliability Engineering (SRE) team is seeking a talented “Rapid Response Engineer” to play a vital role in a team that runs critical site-up operations and systems engineering for Yahoo’s most popular internet sites, including Mail, Messenger, Sports, Finance, Games, News, Entertainment, and many others. SRE’s mission is to increase the uptime of Yahoo’s properties by providing embedded subject matter experts who restore Yahoo services as quickly as possible. The SREs are fast-paced and work together to solve complex problems before they impact the amazing experiences our customers have on a daily basis.
About the Role
Do you...
• Have a passion for solving technical problems, from the network layer to the application?
• Spend time trying to figure out how something works?
• Want to make real web applications and back-end systems faster, more reliable, and more efficient?
This position requires an aggressive troubleshooter who can multitask on problems of varying difficulty, priority, and time sensitivity in order to keep Yahoo’s sites up. This versatile position requires familiarity with all the support concepts of busy websites: systems and database administration, networking, process troubleshooting, and QA and rollout automation.
Responsibilities
• Assess the criticality of incoming issues and prioritize them appropriately
• Troubleshoot, diagnose, and resolve issues using critical knowledge of Apache, Unix processes, MySQL, and related technologies
• Track issues through the ticketing systems and follow through to resolution
• Utilize monitoring tools to proactively identify issues and trends
• Write clear and concise operational runbooks
• Design, write and deliver software to improve the availability and latency of Yahoo services
• Collaborate with service, network, and other operations teams on significant issues
• Lead by example, deliver results and eliminate missed opportunities
• The ideal candidate will possess a broad range of computer science skills and must be persistent, results-oriented, driven, and have a strong sense of ownership
Minimum Qualifications
• BS in Computer Science or related technical field, MS a plus
• Proficient working knowledge, including debugging, of Java or C++
• Proficient in designing, analyzing and troubleshooting large-scale distributed systems
• Strong analytical/troubleshooting skills
• Knowledge of a scripting language (e.g., Shell, Perl, Ruby) a plus
• 1 year of experience with high-volume websites
• Basic understanding of Unix/Linux, Apache, performance tuning concepts, and web applications
• Strong written and oral communication skills
• Strong knowledge of TCP/IP networking, SMTP, HTTP, load balancers, and highly available network servers
• Ability to rapidly learn and assimilate knowledge of complex software and systems, and apply understanding of system architecture when planning operational tasks and strategy