London Area, United Kingdom
Summary: The role of Network Site Reliability Engineer (SRE) involves managing large-scale production networks and leading critical incident responses. The ideal candidate will have extensive experience in network engineering, troubleshooting, and automation, with a focus on enhancing operational reliability. Responsibilities include owning network incidents, setting technical direction, and collaborating on innovative initiatives. The position requires a deep understanding of network failures and the ability to work in a 24/7 environment.
Key Responsibilities:
- Own critical network incidents — lead response, stabilize service, drive root cause analysis, and prevent repeat failures
- Troubleshoot complex issues across routing, switching, firewalling, and wireless (overlay + underlay)
- Set technical direction for network operations and reliability
- Automate away toil using Linux-based tooling and infrastructure-as-code
- Operate in a 24/7 environment supporting global, business-critical infrastructure
- Work across a multi-vendor environment (Arista, Cisco, Palo Alto, F5, NetScaler, Mist, Aruba, InfiniBand, and more)
- Collaborate on next-gen initiatives, including wireless design and AI / HPC cluster deployments
Key Skills:
- 10+ years hands-on network engineering experience in production environments
- Deep expertise in routing & switching, firewalling & segmentation, and wireless networking
- Strong overlay / underlay troubleshooting skills
- Solid Linux / Unix experience
- Proven experience leading major incidents
- Comfortable working independently and mentoring others
Salary (Rate): undetermined
City: London Area
Country: United Kingdom
Working Arrangements: undetermined
IR35 Status: undetermined
Seniority Level: undetermined
Industry: IT
We are looking for a battle-tested Network SRE who has owned large-scale production networks and led critical incidents end-to-end. The ideal candidate has been the final escalation point, understands how networks actually fail, and uses automation to reduce operational pain.