Environment Stability - RCA
The ETL job was not executed for the 1.14 pre-prod deployment: the steps were missed even though they were listed in the deployment tracker.
A Swarm update was needed because Azure changed a particular node.
The variable values were refactored (from 500 variables down to 60), but this was not communicated to the larger group.
The peer review on pre-prod did not cover validation of this change.
These gaps are a possible risk when going to production.
Suggested resolution
Can there be a peer review after deployment?
Big refactoring activities and DevOps changes could be done in off-peak hours, with engineering involved to help validate. All stakeholders must be informed beforehand.
Can a dry run be done on a new instance?
Can there be a DevOps process kit for every environment?
Can there be a common deployment request tracker?
Can deployment requests to DevOps be limited to twice a week?
Can Jira be leveraged for deployment?
How can the number of stories that move from sprint to sprint be minimised?
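
As an illustration of how a missed tracker step (such as the ETL job in the 1.14 pre-prod deployment) might be caught automatically, here is a minimal sketch. It assumes the deployment tracker can be exported as a list of steps and that executed steps are logged by name. The Step class, the missing_steps function, and the step names are hypothetical and not taken from any existing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One entry in a (hypothetical) deployment tracker export."""
    name: str
    required: bool = True


def missing_steps(tracker, executed):
    """Return the names of required tracker steps that were not executed."""
    done = set(executed)
    return [s.name for s in tracker if s.required and s.name not in done]


# Hypothetical tracker for a pre-prod deployment.
tracker = [
    Step("deploy services"),
    Step("run ETL job"),
    Step("smoke test", required=False),
]

# Hypothetical log of what was actually run.
executed = ["deploy services"]

gaps = missing_steps(tracker, executed)
if gaps:
    print("Deployment incomplete, missing steps:", ", ".join(gaps))
```

A check like this could be run as a gate at the end of each deployment, or as part of a post-deployment peer review, so that a skipped step blocks sign-off instead of being found later.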