This job has expired, please see additional jobs below
Disaster Recovery Principal Engineer
Comcast
Philadelphia, PA, United States
Job Details
Job Summary:
Responsible for building, managing, operating, and continuously
improving Systems, Storage, Database, and/or Tools Infrastructure that
support Comcast's customer-facing applications, back office, and
provisioning infrastructure in a 24/7 environment. Focuses on
architecting, building, deploying, and stabilizing code, services,
systems, and tools. Drives standardization and service-focused
instrumentation. Provides subject matter expertise. Resolves break/fix
scenarios, engaging broader teams as necessary, and partners with or leads
vendors and regions to achieve continuous improvement. Contributes to
command-and-control activities focused on rapid restoration of complex
outages and communication across Comcast. Works with and
directly leads external vendors, third parties, and associated agencies
when necessary to address issues across the infrastructure. May
participate on 24/7 on-call rotation. Acts as a technical expert in own
area within the organization. May work independently or as part of a
team on more complex projects. Provides mentoring and guidance to more
junior team members. May be responsible for leading a team, but does not
directly manage people.
Core Responsibilities:
• Develops solutions for very complex and wide-reaching systems
engineering problems. Sets new policies and procedures to handle future
issues. Creates systems engineering and architectural documentation to
be used by others to build and maintain systems.
• Operating Systems and Disk Management responsibilities (if
applicable): Provides in-depth knowledge of Operating System internals
to aid in troubleshooting complex problems. Acts as an expert on at
least one supported Operating System. Mentors and trains more junior
team members on Operating System concepts, configuration, tuning, and
troubleshooting techniques. Creates complex automation scripts in Shell,
Perl, VBScript, or similar. May manage servers remotely in a
distributed environment.
• Database Platform Management responsibilities (if applicable): Masters
understanding of database concepts, availability, performance, usage
and configuration. Sets up, troubleshoots, and tunes complex standard
and non-standard replication. Uses knowledge of existing database
platforms to evaluate and recommend new technologies. Uses database
knowledge to solve issues on unfamiliar products. Creates and maintains
database policies, standards, and overall documentation including
availability, replication, and backup and recovery policy,
service level agreement, baseline architecture, change management,
access to production, unsupported HW/SW, security and audit violations,
and risk acceptance.
• Storage and Backup responsibilities (if applicable): Masters
understanding of storage concepts, availability, performance, usage and
configuration. Sets up, troubleshoots, and tunes complex SAN software
issues. Uses knowledge of existing storage platforms to evaluate and
recommend new technologies. Uses storage knowledge to solve issues on
unfamiliar products. Creates and maintains policies, standards, and
overall documentation including availability, and backup and recovery,
service level agreement, baseline architecture, change management,
access to production, unsupported HW/SW, security and audit violations,
and risk acceptance.
• Scripting and Development responsibilities (if applicable): Expertly
develops software in several modern languages. Develops large/complex
database-backed systems and has a solid understanding of DB schema and
query performance. Given a broad set of goals, can create detailed
requirements, technical design specifications, and LOE analysis. Designs
modular systems to be co-developed by teams of less experienced
developers. Designs horizontally-scalable solutions with innovative use
of storage and networking including solid APIs for integration with
other systems. Utilizes professional best practices in day-to-day work
such as revision control and unit testing. Applies statistical data
analysis techniques.
• Networking responsibilities (if applicable): Recommends or helps
architect an entire system, including network design and topology. Acts
as an expert in running and interpreting captures from tcpdump, snoop,
and other network sniffers. Understands and applies knowledge of most
protocols (TCP/IP, HTTP, UDP, etc.).
• Application Technologies (Web Servers, J2EE, Application Servers)
responsibilities (if applicable): Provides expert recommendations and
advice to the team and/or department in the areas of web services, OS,
and storage, including being an active liaison to Development, QA and
the Business. Provides scaling, design, costing, troubleshooting, and
impact analysis consultation.
• Analyzes systems and makes recommendations to prevent possible
problems. Takes lead on issue resolution activities using knowledge of
complex and company-wide systems.
• Leads end-to-end audit of monitors and alarms based on subsystem
knowledge. Takes the lead on defining the requirements for new tools
required for operations.
• Utilizes time management and project management skills to lead the
resolution of issues in a timely and organized manner, effectively
communicating necessary information. May consult directly with clients
or third-party vendors; provides subject matter expertise.
• Consistent exercise of independent judgment and discretion in matters
of significance.
• Regular, consistent and punctual attendance. Must be able to work
nights and weekends, variable schedule(s) as necessary.
• Other duties and responsibilities as assigned.
Job Specification:
• Bachelor's Degree or Equivalent
• Engineering, Computer Science
• Generally requires 11+ years related experience
Comcast is an EOE/Veterans/Disabled/LGBT employer