One of the key definitions of the Scrum framework is the “Definition of Done” (DoD). It is used to decide when an item from the Sprint Backlog is complete, and it brings all team members to the same understanding of the conditions under which a task counts as completed. The Definition of Done for our team is as follows:
Any issue in the Sprint Backlog is considered “Done” if the analysis, development and validation stages have been completed successfully and the acceptance criteria have been met. Deployment is not a prerequisite for an issue to be considered “Done”.
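To make the rule concrete, here is a minimal sketch of how it could be expressed as a check. The `Issue` class, its field names and the `is_done` function are hypothetical and only mirror the wording of our definition; they are not taken from any tool we actually use.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    # Hypothetical fields, named after the stages in our Definition of Done.
    analysis_done: bool = False
    development_done: bool = False
    validation_done: bool = False
    acceptance_criteria_met: bool = False
    deployed: bool = False  # tracked, but deliberately not part of "Done"


def is_done(issue: Issue) -> bool:
    """Return True when all three stages are complete and the acceptance criteria are met."""
    return (
        issue.analysis_done
        and issue.development_done
        and issue.validation_done
        and issue.acceptance_criteria_met
    )


# An issue that has passed every stage is "Done" even though it is not deployed yet.
ready = Issue(analysis_done=True, development_done=True,
              validation_done=True, acceptance_criteria_met=True)
print(is_done(ready))
```

Note that `deployed` never appears in the check, which is exactly the point of the last sentence of the definition.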
Regarding the deployment of the iterations, I should admit that, as an overall organization, we are not yet at the right level of agility. Obviously, there are 2 areas for improvement in the organization:
- Lack of DevOps practice / culture: The operations team in charge of deploying the functionality we’ve developed is totally isolated from our team. That’s why we need to spend significant effort (SPs) to hand over the details, the flow and the potential system-wide impact of the increment to the operations team.
- Telco mindset: Turkcell is Turkey’s biggest telco company, with tens of millions of customers, and it is regulated by the government authority (BTK). Even though we are in the middle of an agile transformation and there are more than 10 newly formed Scrum teams in the Digital Services and Solutions function, we still need to follow the guidelines of the whole organization in order to deploy new functionality. This means deployment is allowed only on pre-defined dates, from after midnight until early morning. Our team is affected by the billing periods, by major CRM operations and even by public holidays.
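The deployment constraint in the second item can be sketched as a simple check. This is only an illustration: the 00:00 to 06:00 window, the function name `can_deploy` and the idea of keeping allowed and blocked dates in sets are all assumptions of mine, since the actual organizational calendar and hours are not spelled out here.

```python
from datetime import date, datetime, time

# Assumed window; the real "after midnight until early morning" hours may differ.
WINDOW_START = time(0, 0)
WINDOW_END = time(6, 0)


def can_deploy(moment: datetime, allowed_dates: set[date], blocked_dates: set[date]) -> bool:
    """Check a proposed deployment time against a hypothetical release calendar.

    allowed_dates: the pre-defined deployment dates.
    blocked_dates: dates ruled out by billing periods, major CRM operations
                   or public holidays.
    """
    day = moment.date()
    if day not in allowed_dates or day in blocked_dates:
        return False
    # Deployment must start inside the night-time window.
    return WINDOW_START <= moment.time() < WINDOW_END


# Example with made-up dates: 10 January is an allowed date, 15 January is blocked.
allowed = {date(2024, 1, 10), date(2024, 1, 15)}
blocked = {date(2024, 1, 15)}
print(can_deploy(datetime(2024, 1, 10, 2, 30), allowed, blocked))
print(can_deploy(datetime(2024, 1, 15, 2, 30), allowed, blocked))
```

A blocked date wins over an allowed one in this sketch, which reflects how a billing period or a holiday can cancel an otherwise planned deployment night.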
As a general principle, we set Sprint goals so that we consistently arrive at potentially shippable software at the end of every 2 weeks. This doesn’t mean that we deploy software every 2 weeks. Depending on the success or failure of a Sprint, or on the maturity of the software, I, as the PO, may decide to skip deployment of the iteration. That said, so far we haven’t had 2 Sprints in a row without shipping software.
We execute test cycles in the pre-prod environment. So, in theory, there is always a risk of a failure scenario in production that we didn’t face during the test cycles in pre-prod. To minimize the risk of failure at the “going-live” stage on the night of deployment, a few people from the team (about half of it) are present on deployment night, just in case.
So far, we have had to roll back the deployment of only one iteration. The cause was mainly a connection issue between our servers and the Apple Store and Play Store. Because of this connection barrier in the real production environment, we had to roll back the operation, which was about transferring and improving the life cycle management of our customers who buy TV+ subscription packages via in-app purchase. A rollback incident is definitely something to be avoided. Following that unfortunate night, we decided at the retro session to proactively prevent such cases in the future. I will provide the details of that decision in another post.