
Agile

Scrum Structure & Approach

Storio’s SRE chapter practices a modified Scrum process, with our EMs acting as Scrum Masters and Product Owners. Our approach can be summed up as:

  • We’ll use a single Jira project to capture all the work the SRE Chapter performs, both project and support.
  • We practice Kanban for the Support workstream, using the Task issue type.
  • We practice Scrum for the Project workstream, using the User Story issue type.
  • We’ll create two separate Jira boards: one filtered to the Project workstream and one filtered to the Support workstream.
  • We won’t set WIP limits on either board. However, we encourage team members to be conscious of their own work in progress and to limit it to a level at which they remain effective.
  • We’ll groom our backlogs once every two weeks.
  • When a ticket’s priority is uncertain, or it needs an approximate delivery estimate, we’ll perform a timeboxed spike.

Ceremonies

Standup

  • We’ll have a daily standup at 09:30 London / 10:30 Paris/Amsterdam.
  • The standup takes a round-the-room format, with each person giving an update on what they did yesterday and what they’ll do today.
  • Each person’s update should last between one and two minutes.
  • Sidebars post-meeting are strongly preferred over solutionising during the meeting.

Retrospectives

  • We’ll run a one-hour retrospective every two weeks, ideally on a Friday afternoon.
  • The retrospective closes out the two-week sprint.
  • The purpose of the retrospective is to surface actionable items that make our lives better, celebrate successes and reflect on the past two-week cycle.
  • An EM will chair the session and pick a retrospective template in a tool such as EasyRetro, with card contents and votes hidden. Each person gets 6 votes per retro.
  • We timebox 5 minutes for card creation, then 1 minute for voting.
  • The chair will then present the retro board, combining cards on a similar topic and sorting the board by votes. We’ll talk through each card in vote order.

Grooming / Refinement

  • We’ll groom our backlog weekly, for an hour.
  • An EM will chair the session, and we’ll look at any new tickets coming into the backlog: bigger support tickets and, where applicable and requested by a project lead, project tickets.
  • We’ll discuss whether a support ticket should transition to a project issue and, if so, size it.
  • We’ll size only project tickets, using the Fibonacci sequence (1, 2, 3, 5, 8, 13, ...).
  • To size a ticket, we’ll use a tool to calculate the average of the story points submitted. The closest point on the scale is chosen; anyone who disagrees can open a discussion to re-evaluate the ticket.
  • Refined tickets are added to the next sprint based on their points and the team’s velocity.
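
As a rough illustration of the sizing and sprint-fill steps above, here is a minimal Python sketch. It is not the tool we actually use: the function names `size_ticket` and `plan_sprint`, the cut-off of the scale at 13, the tie-break towards the larger value, and the simple in-order fill against velocity are all assumptions made for the example.

```python
from statistics import mean

# Sizing scale from the grooming notes; stopping at 13 is an assumption.
FIBONACCI_POINTS = (1, 2, 3, 5, 8, 13)


def size_ticket(votes):
    """Average the submitted votes and snap to the closest point on the scale."""
    if not votes:
        raise ValueError("at least one vote is required")
    average = mean(votes)
    # When the average sits exactly between two points (e.g. 4 between 3 and 5),
    # this sketch picks the larger one. That tie-break is an assumption.
    return min(FIBONACCI_POINTS, key=lambda p: (abs(p - average), -p))


def plan_sprint(refined_tickets, velocity):
    """Walk the refined tickets in order, taking each one that still fits."""
    selected, remaining = [], velocity
    for name, points in refined_tickets:
        if points <= remaining:
            selected.append(name)
            remaining -= points
    return selected
```

For example, votes of 3, 5 and 8 average to roughly 5.3, so `size_ticket([3, 5, 8])` snaps to 5. With a velocity of 10, `plan_sprint([("A", 5), ("B", 8), ("C", 3)], 10)` takes A and C and leaves B for a later sprint.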

Support

  • The support lead is the primary contact for all issues raised on the support channel.
  • The support lead should mark every acknowledged message with an eyes emoji or similar to show it is in progress.
  • If an unknown issue is encountered, the support lead should raise it as a thread on our internal SRE channel, so that the solution is loosely documented and visible to all, and then reply on the support channel with the solution once it is known.
  • Complex issues can be handed over to an SME via the internal SRE channel. The support lead should still document the fix once it is understood, either in the thread or on another suitable platform such as Confluence (referencing the doc in the thread).
  • Review the proposed approach regularly as a team
  • We’ll operate a single support channel, #sre-support on Slack, which will be the single point of contact for all support issues.
  • Members of the SRE Chapter will rotate weekly as the point of contact on this channel, dubbed the person On Duty. Their job is to receive support requests and either action them, in the case of simple or urgent requests, or triage them into a ticket to be carried out later.
  • When the on-duty member has no incoming ticket, they can pick something off the support backlog to implement.
  • As no one in the SRE Chapter knows all of the systems in place, we’ll have a named second while knowledge is built up across the team. If a member of the former SRE team is on duty, a member of the former Cloud team will be their second.
  • Any simple request that needs under ten minutes of work, involves no code commit and does not need an audit trail can be done immediately. Anything bigger than that, anything requiring a commit, or anything with a valid reason to capture the history of the request should have a ticket created.
  • Requests sent directly to individuals rather than through this channel should be gently redirected to the support channel. Those who still insist on coming directly may need a stronger conversation; escalate to your manager if you find you’re getting badgered by DMers.
  • Walk-ups to people in the office should be treated in a similar fashion: ask the person to post their request in the support channel.
  • Support is always best efforts; it’s a partnership with our Engineering colleagues - we’re not there to fix things for them, but to help them get problems solved together, collaboratively.
  • Support and incident response are two different things. Although the SRE on duty may well lead an incident response, it’s not the role of the on duty person to singlehandedly deal with incidents; they’re a team effort.
  • How to manage urgent support tickets: if the person on call realises that a support ticket is urgent, or the reporter declares it urgent, the person on call raises the matter at the next standup and the team discusses how the ticket should be addressed.
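
The "do it now or raise a ticket" rule from the list above can be sketched as a small decision function. This is only an illustration of the rule as written: the `SupportRequest` type, its field names and the `triage` function are hypothetical, and the time estimate is whatever the person on duty judges it to be.

```python
from dataclasses import dataclass

# Threshold from the support guidelines: "under ten minutes" of work.
QUICK_FIX_MINUTES = 10


@dataclass
class SupportRequest:
    """Hypothetical shape of an incoming request, for illustration only."""
    summary: str
    estimated_minutes: int
    needs_commit: bool = False
    needs_audit_trail: bool = False


def triage(request):
    """Return "do_now" for quick requests, otherwise "create_ticket"."""
    quick = request.estimated_minutes < QUICK_FIX_MINUTES
    if quick and not request.needs_commit and not request.needs_audit_trail:
        return "do_now"
    # Bigger jobs, anything needing a commit, and anything whose history
    # should be captured all go through a ticket.
    return "create_ticket"
```

A five-minute request with no commit and no audit trail comes back as `"do_now"`; the same request with `needs_commit=True`, or with an estimate of ten minutes or more, comes back as `"create_ticket"`.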

Team PRs and code reviews

  • We’ll operate a process of code reviews for any work we produce; this is to help us both understand context and to reduce mistakes.
  • Etiquette for requesting a review is to post the PR in the #team-sre-internal channel. Emojis can then be used to respond to the request, e.g. :eyes: to indicate you’re looking at a PR, :thumbsup: to indicate an approval, etc.
  • It is strongly recommended that engineers deploy Judge Dredd to their repositories where appropriate. This automates the process of asking for a review, though you can still additionally ask personally. Plus, he is the law.
  • For urgent reviews, such as one needed to resolve a customer-impacting incident, the normal review process can be bypassed, Judge Dredd included: the author either skips review or asks someone to OK the PR immediately.
  • Reviewers are not expected to be subject matter experts in a topic; any review input is helpful.
  • It is the expectation that we all take part in the reviewing process, putting aside time in our daily tasks to help one another out with reviews. The mantra is pay it forwards, as you never know when you might need something reviewed!