Specialties

Team Expertise and Subject Matter Areas

This section outlines the expertise and subject matter areas of the SRE team, which focuses on ensuring the reliability, performance, security, and cost-efficiency of our platforms and services.

Our team’s scope encompasses every system we maintain, customize, or build by integrating off-the-shelf solutions with our own abstractions. We are responsible for maximizing the reliability of our software and services, enhancing the visibility and performance of applications, and ensuring the overall health and efficiency of our infrastructure.

We deliver our impact across four core pillars: Availability, Performance, Security, and Cost.

Our responsibilities include:

  • Platform Management: Maintaining and supporting the platforms we own, assisting developers in their usage, and helping optimize software delivery performance. This includes platforms such as CI (Jenkins), CD (Octopus Deploy), and Logging (Unified-Logging).
  • Service Reliability: Maximizing the reliability of our software and services through proactive measures and collaboration with development teams.
  • Observability: Ensuring comprehensive visibility into our infrastructure and technologies so that we can measure performance and detect and resolve problems before they impact the system. This includes managing and using tools like Dynatrace and ELK.
  • Automation: Eliminating toil and improving efficiency through automation at all levels of the infrastructure we manage.
  • Cloud Technologies: Evangelizing and supporting the use of AWS services and related tooling, such as the Serverless Framework, RDS/ElastiCache, Redshift, and CDNs.
  • Core Infrastructure: Managing and supporting foundational components like GitHub, Artefact Storage, and DNS.
  • Operational Excellence: Focusing on cost optimization, incident response management, and security information and event management (SIEM).

Tools:

The list below illustrates some of the tools we use to achieve our goals across our areas of expertise. It is not an exhaustive list of the technologies we work with.

  • Platforms:
    • Jenkins
    • Octopus Deploy
    • Unified-Logging
    • Artefact Storage
  • Services & Observability:
    • Dynatrace
    • ELK
    • Incident Response
    • GitHub Governance
    • AWS Evangelisation
    • Serverless Framework
    • RDS/ElastiCache
    • Redshift
    • CDNs
    • Cost Optimization
    • DNS