This Root Cause Analysis (RCA) examines why the Whitebox POC, due on 9th September 2025, was not delivered to the project owner, Xavier, by full-time contributors Avinash and Milos. It outlines the problems encountered and the opportunities to strengthen our systems and processes so that future deliveries are on time.
Process Related
Our current workflow follows weekly sprints with task delegation, but a critical gap was identified in day-to-day accountability. While weekly check-ins helped track progress at a high level, they were insufficient to identify daily blockers or stalled tasks.
The reliance on developers to self-report progress without structured follow-up led to delays in surfacing issues. When contributors faced technical challenges or were stuck, the lack of an immediate escalation mechanism resulted in lost days of productivity. Similarly, when a developer could not make progress for one or more days because of personal reasons, health issues, or conflicting responsibilities, the situation often went unreported, further affecting sprint outcomes.
Preventive Action
To address this, we propose implementing structured daily check-ins, where contributors self-report progress either on the mailing list or directly on the relevant ticket. This approach ensures that blockers are surfaced quickly, whether technical or personal, and keeps daily tasks aligned with sprint goals.
Alongside these updates, any task that becomes blocked should carry a clear written note on the affected ticket, describing the issue, attempted remedies, and next steps. When personal issues affect work, they should also be reported in the same way, with a documented resolution such as reassigning the task to another contributor or revising the deadline. This creates a transparent trail of accountability and allows the team to respond proactively to delays.
People Related
There were instances of skill gaps, particularly in situations where a feature could be implemented in multiple ways or a blocker could be resolved with different approaches. In such cases, Xavier’s input was needed, but their unavailability caused delays and ambiguity in decision-making.
The combination of slow progress, frequent blockers, and unclear direction sometimes resulted in loss of motivation, feelings of being stuck, and eventual burnout. As deadlines approached, pressure increased further: contributors could neither rest adequately nor make meaningful progress, which exacerbated burnout and significantly reduced productivity.
Preventive Action
To prevent delays caused by dependency on a single decision-maker, contributors should not wait for Xavier’s availability to move forward. In cases where a decision is needed, developers are encouraged to make the best judgment call themselves and document both the reasoning and the chosen approach directly on the relevant ticket. Xavier can then review these decisions retrospectively, and adjustments can still be made later if necessary.
For truly critical decisions that cannot be deferred, contributors should escalate through Xavier’s urgent email. This balance ensures that work continues without unnecessary blocking, while still keeping the project owner informed and able to correct course when required.
In addition, we will continue encouraging open conversations about workload and wellbeing to surface early signs of fatigue. More WIPs should also be written for features, providing additional clarity and reducing confusion during implementation.
Research Related
Since this project involved extensive R&D, unexpected bugs frequently appeared, often requiring significant time to resolve. Features also took longer to implement than initially estimated due to their complexity.
Preventive Action
Because this project involves significant R&D, we need to explicitly account for uncertainty and learning curves in our planning and estimations. Estimates should factor in not only the implementation itself but also the time required for discovery, experimentation, and acquiring new skills with unfamiliar technologies. This will help set more realistic timelines and avoid repeated underestimation.
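As a rough illustration of the estimation idea above, the sketch below keeps implementation, discovery, experimentation, and learning time as separate fields so that none of them is silently left out of the total. The `TaskEstimate` class, its field names, and the sample hours are hypothetical and not part of any existing team tooling.

```python
from dataclasses import dataclass


@dataclass
class TaskEstimate:
    """Hypothetical estimate record; all values are in hours."""
    implementation_hours: float
    discovery_hours: float = 0.0
    experimentation_hours: float = 0.0
    learning_hours: float = 0.0

    def total_hours(self) -> float:
        # The estimate covers every component, not just implementation.
        return (
            self.implementation_hours
            + self.discovery_hours
            + self.experimentation_hours
            + self.learning_hours
        )


# Illustrative numbers only.
estimate = TaskEstimate(
    implementation_hours=8,
    discovery_hours=4,
    experimentation_hours=3,
    learning_hours=5,
)
print(estimate.total_hours())
```

Keeping the components separate also makes it easier to compare, after the sprint, which part of the estimate was furthest from the actual time spent.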
To manage unknowns more effectively, we will prioritize working on uncertain areas early in the sprint and plan dedicated discovery tasks (WIPs) where needed. These discovery tasks should aim to produce small working prototypes that can be reviewed quickly. Even if further polishing is required afterward, reaching agreement on the prototype significantly reduces ambiguity and clarifies the remaining work.
When entirely new tools, frameworks, or integrations are involved, discovery tasks should also explicitly include learning time, with the prototype still delivering something directly useful to the project. Discussions around potential approaches should start early on tickets and WIPs, ensuring that blockers and open questions are surfaced before implementation begins.
External Factors
External changes occasionally caused issues with builds, pipelines, or test failures. Resolving these consumed time that could otherwise have been dedicated to feature development, further impacting delivery timelines.
Preventive Action
To better handle unexpected issues such as build failures, pipeline disruptions, or external changes, we will adopt a “firefighting” approach. This involves explicitly accounting for firefighting time during sprint planning and treating it as an estimated task within the sprint.
Each sprint, a contributor will be assigned as the designated firefighter, responsible for handling urgent external issues as they arise. This person will reserve a portion of their time for firefighting, ensuring that such tasks do not derail feature development. The amount of time spent on firefighting will also be measured and tracked, so that future estimations can more accurately reflect the real cost of these external interruptions.
This structured approach both limits the impact of unforeseen problems on project delivery and provides clarity to the team about who to reach out to for help when firefighting tasks emerge.
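One possible way to turn tracked firefighting time into a planning figure is sketched below. It assumes the reserve for the next sprint is simply the average of the hours logged in past sprints, capped at the firefighter's capacity. The function name `firefighting_reserve`, the averaging rule, and the sample numbers are all assumptions for illustration, not an agreed team formula.

```python
from statistics import mean


def firefighting_reserve(past_hours, capacity_hours):
    """Suggest hours to reserve for firefighting in the next sprint.

    past_hours: firefighting hours logged in previous sprints (hypothetical data).
    capacity_hours: total hours the designated firefighter has in the sprint.
    """
    if not past_hours:
        # No history yet, so there is nothing to base a reserve on.
        return 0.0
    # Assumption: the average of past sprints is a reasonable first guess.
    return min(float(mean(past_hours)), float(capacity_hours))


# Illustrative numbers only.
logged = [6, 10, 8]
capacity = 40
reserve = firefighting_reserve(logged, capacity)
print(f"reserve: {reserve} h, left for features: {capacity - reserve} h")
```

A simple average is only a starting point; if firefighting time varies widely between sprints, the team may prefer a more conservative rule.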
Conclusion
The delay in delivering the Whitebox POC was the result of a combination of process gaps, people-related challenges, research complexities, and external disruptions. These factors compounded over time, reducing productivity and ultimately impacting the project deadline. By implementing the preventive measures outlined in each category, we aim to improve accountability, reduce delays, and ensure timely delivery of future milestones.