This job has expired
Service Reliability Engineer, Interm
Yahoo!
New York, NY, United States
Description
A Little About Us
Yahoo’s Senior Operations Engineering team is seeking a talented individual to play a vital role in a team that runs critical site-up operations and systems engineering for Yahoo’s most popular internet sites, including Mail, Flickr, Sports, Finance, News, Video and many others. Tier 2’s mission is to increase the uptime of Yahoo’s properties by providing embedded subject matter experts to restore Yahoo services as soon as possible. Tier 2 engineers are fast paced and work together to solve complex problems before they impact the amazing experiences our customers have on a daily basis.
A Lot About You
Do you have a passion for solving technical problems from the network layer to the application? Do you spend time trying to figure out how something works? Do you make real Web applications and back-end systems faster, more reliable, and more efficient? This position requires an aggressive troubleshooter who can multitask on problems of varying difficulty, priority, and time-sensitivity in order to maintain Yahoo’s site up. This versatile position requires familiarity with all the support concepts of busy websites: systems administration, basic networking, and troubleshooting.
Your Day
• Identify the criticality of incoming issues, prioritize, and troubleshoot appropriately
• Triage and drive user- and revenue-impacting incidents to resolution, using critical knowledge of Apache, Unix processes, MySQL, and related technologies within the OSI stack
• Track issues through the ticketing systems and follow through to resolution
• Utilize monitoring tools to proactively identify issues and trends
• Write clear and concise operational runbooks
• Escalate significant issues to service, network or other operations engineers
• Lead by example, deliver results and eliminate missed opportunities
• The ideal candidate will possess a broad range of computer science skills and must be persistent, results-oriented, and a self-starter
You Must Have
• BS in Information Systems, Computer Science, or equivalent industry experience (2 years of troubleshooting experience in large-scale production environments)
• Exposure to tool/product development is a plus
• Knowledge of Unix/Linux, Apache, and Web applications
• Strong written and oral communication skills are essential. Ability to document/report bugs, test cases, and problem reports
• Ability to rapidly learn and assimilate knowledge of complex software and systems, and apply understanding of system architecture when planning operational tasks and strategy
• Demonstrated knowledge of one or more of the following programming languages or tools: Ruby, Perl, Python, or Chef
• Experience with statistical analysis of defects and system performance is a plus
• Understanding of TCP/IP, SMTP, HTTP, load balancers, and network server concepts