We are looking for a creative and talented individual with a passion for technology to design, deploy and manage a global edge computing platform. Your primary responsibility will be to manage a team that designs, develops and operates in production a number of globally distributed computing systems geared towards running services close to end users.
You will work closely with Schibsted engineers to design edge computing systems at a global scale, namely image transformation systems, advanced content caching, encryption termination, traffic routing and multi-region networking, as well as generic applications that can benefit from running close to the user (such as token validation and session handling). Our systems handle billions of operations each month and are designed from scratch to handle global traffic at scale, running services both in the cloud and on premises.
Be prepared to lead your team to make decisions grounded in its technological expertise and backed up with hard data. Our systems are global-scale deployments of different services such as developer productivity tools, image and message processing systems, big data and MapReduce clusters, database and NoSQL backends, and many more. At all times you will be just a git clone away from real code to contribute to.
Specifically, we support hundreds of services and hundreds of instances for 200M+ external users, using dynamic service discovery systems together with dynamic load balancing and routing. Service-to-service interaction relies on circuit breaker frameworks and similar techniques. We achieve near 100% uptime through deployment techniques such as blue/green and canary releasing. For internal services (like delivery pipelines and build systems), we support more than a thousand developers.
We strongly believe in the continuous improvement of always-on systems, so we work relentlessly to achieve near-complete resiliency in everything we do. This means no actual user downtime, seamless infrastructure and service upgrades, and being proactive about issues.
Stack: Spinnaker (NetflixOSS), Zipkin, Datadog, ELK, Prometheus, Sumo Logic, Java, Travis CI.
- A BSc (or equivalent) degree in Computer Science
- Strong analytical / problem solving skills
- Demonstrable experience in team management, line management and agile development
- Experience related to cluster management, high availability and service management systems
- A strong UNIX background (including concepts such as namespaces, capabilities and TCP/IP)
- Proven ability and experience in developing well-structured computer programs (C/C++, Golang, Java or equivalent)
- The ability to write scripts in dynamic languages to automate tasks and diagnose problems (Python or equivalent)
- Experience in building and maintaining systems at scale: service discovery, load balancing, secret management, dynamic request routing, circuit breakers and deployment schemes (rolling updates, canary, etc.)
- Experience with modern development tools such as Git, Travis CI, Terraform or equivalent