WildFly Transition to Kanban

A case study of a successful Kanban transformation.

Motivation

All too often the success of an Agile transformation is judged only by metrics such as increased throughput of user stories, increased velocity in story points, or a reduction in the average time needed to deliver a user story. Balancing the supply of features with the demand for them usually also counts as a success.

Each of those approaches accounts for some important facet of software development. However, all of them can be applied to manual work just as well! That is because they don't measure what is specific to software development - its essence, which is the acquisition and application of knowledge.

In what follows we take a look at WildFly's transition to Kanban. Originally called JBoss, the project went through several significant changes, including an acquisition by Red Hat and a subsequent rebranding to WildFly. It provides a unique case study across three distinct periods: startup, waterfall, and Kanban. Each period featured a different level of Knowledge Discovery Efficiency (KEDE). The highest KEDE occurred during startup days, dropped significantly during the waterfall period, and, while it increased with the introduction of the Kanban method, it never quite reached the initial startup levels.

Background

In 1999, Marc Fleury started a free software project named EJB-OSS, implementing the EJB API from J2EE. Sun Microsystems asked the project to stop using the trademarked EJB within its name. EJB-OSS was then renamed JBOSS, and later JBoss. JBoss came in two versions: JBoss AS (Application Server) and JBoss EAP (Enterprise Application Platform). JBoss AS was the open source community version of the application platform, which development teams could use free of charge. JBoss EAP was also an open source project.

On April 10, 2006, Red Hat announced that it had entered into a definitive agreement to acquire JBoss[14]. On June 5, 2006, Red Hat completed the acquisition of JBoss for US$420 million and integrated it as its own division[7].

However, while JBoss EAP was free to use during development, it required a fee when deployed to production. In late 2013, the JBoss AS code was rebranded and renamed WildFly[8]. On November 20, 2014, JBoss Application Server was renamed WildFly. This left JBoss AS 7.1.1.Final as the last supported JBoss AS version. All subsequent releases are named WildFly.

While the JBoss EAP source code is based on WildFly, the two are not identical and not completely interchangeable. They differ in many ways, including security patches and overall functionality. Still, JBoss EAP is essentially a commercial build of the WildFly project, and in many ways, especially from a source code perspective, JBoss and WildFly are the same thing[5]. Just as every new release of JBoss EAP was once built from JBoss AS, today every new JBoss EAP release is built from WildFly.

Initial state

Red Hat had a dual model, with Fedora as the free Linux distribution and Red Hat Enterprise Linux as the supported product. Hence they split the JBoss application server in two: WildFly, the community project, and JBoss Enterprise Application Platform (EAP), the supported product. They did that with a distributed team spread across 15 countries and 21 native languages.

The manager joined the open source development of JBoss in 2001 and became an employee in 2004[11]. According to him, before the Red Hat acquisition the team followed extreme programming principles. Red Hat had a different model - pretty much waterfall[3].

There was a planning phase in which engineering, product management and quality assurance came together and had to agree on what they were going to do in the next release. After that came a series of builds: every two to three weeks they cut a release. It could be either an alpha-quality developer build or a better-quality build to give to customers to try out. Then they did a release candidate, then possibly a final release. On top of that, they might still be developing a minor release stream, say EAP 6.3, while at the same time working on the next major version, say EAP 7. All those efforts could run in parallel[3].

What was wrong with this? Due to this dual minor/major cycle they might end up with very long cycles for the major version. It could be three years!

They had a horrible problem with the requirements. They had to find a way to collectively iterate over a requirement and make sure everyone understood what was to be done. In some cases they could go as far as implementing a proof-of-concept before they would be certain they fully understood the problem and the proposed solution.

With EAP 7 the planning phase was open for 14 months. Product management initially came with a set of specifications. Of course, within 14 months things changed, so they came back and asked for new features to be added and some already developed features to be dropped. For every such request the developers had to go back to the code, provide new estimates and change existing code. Quite often the requirements were very vague, which was a problem in itself. The product managers who came to the WildFly team came from the proprietary world, so they didn't really understand how open source works. With all that in mind, sticking to a release date was pretty much impossible. The manager had a heuristic - the golden rule of seven. It was: the moment you think you are done, you need seven more builds to finish[3].

Below we have KEDE presented for the period till the release of EAP 7 (WildFly 10) in May 2016. For this period they had an average KEDE=2.5. Over the whole period, however, there is a peak in 2011 and a significant low starting from 2014.

Below we have KEDE presented for the fourteen months before the release of EAP 7 (WildFly 10) in May 2016. The knowledge discovery efficiency dropped 36% compared with the period before the acquisition. For this period they had an average KEDE=1.6.

Kanban at WildFly

Sometime after EAP 7 was released in May 2016 it was decided to tackle the problems of:

  • Long release cycles leading to excessively long planning times
  • Requirements being misunderstood or miscommunicated, which was found out only late in the cycle
  • Tracking the release progress was hard
  • Sticking to release schedules was even harder

The goals the manager set out to achieve were:

  • Simplify planning and reduce planning time; ideally, eliminate planning altogether
  • Reduce context switching, avoid overloading
  • Be very specific about hand-off points & deliverables
  • Improve requirements analysis and prioritization
  • Provide enhanced visibility into the state of the features for the project

WildFly was a project very much advertised within Red Hat. Everyone had an opinion, and everyone knew what was best for the project except for the people who actually worked on it. Of course, at that point the pointy-haired boss comes with a solution - we have to go Agile![4] And when the boss says we have to go Agile, we know what that means, right? They had to use Scrum.

The manager consulted with the other teams at Red Hat that had already been using Scrum. It looked like he would need to organize six or seven Scrum teams for EAP, and every team would need a Product Owner and a Scrum Master. There was only one product manager and no way to clone him, so most developers would probably have to be the Product Owners themselves. They would have to train some engineers to rotate as Scrum Masters. Then they would probably have needed a Scrum of Scrums to run the whole thing, so maybe another ten people just to adopt Scrum. At that point the manager knew there was no way to get ten people just to be able to say "we go Agile". For him that was a show stopper.

The manager was also concerned about how Scrum would cope with the planning fallacy. People overestimate their ability to understand complexity and what's coming. His thesis is that for every truly complex system you just have no clue what's going to happen after a couple of sprints. His view is that for a serious system, when it comes to correctness versus time, correctness wins. So rather than spending your time estimating when you'd be done, maybe you just work on the problem and solve it. For the manager, planning for this type of problem is a waste of time.

The manager very much preferred an asynchronous style in which good people work on prioritized tasks and pull work themselves: a system that balances its capability against the work requested and thus achieves optimum throughput. The manager was neither an Agile coach nor a Kanban expert[3]. To him Kanban seemed particularly interesting because it followed a similar processing model and had some unique features that looked very attractive. One can bolt Kanban onto the existing process without having to change that process initially, thus respecting the team rules and the team dynamics that have developed over time. Kanban takes fewer resources to implement - for the whole project the manager had one person to be the process master. Kanban also helps visualize work, so you can tell those in the business: if you want to see the status, check the board.

The manager had two main concerns with Kanban - the number of states the work goes through and the sequential form of the existing Jira board. They found some 23 states, which were impossible to fit on a board using the Jira Agile plugin available at the time. It was impossible to visualize a very complex system with the tools that Jira offered. They thought they really needed some way to model parallel work, so they came up with the idea of parallel tasks, each with its own state. After they failed to find something that met their requirements, Kabir Khan, a WildFly/EAP Core developer who has also (very successfully) played the role of release coordinator, came up with the idea of developing a custom plugin for Jira that could solve those problems[12]. So the team developed Jirban - their own Kanban plugin for Jira[9].
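The idea of parallel tasks that each carry their own state can be illustrated with a minimal sketch. This is not Jirban's actual data model: the class names, state names, task names and the issue key below are all hypothetical, chosen only to show a card that has one main workflow state plus several independently tracked tasks.

```python
from dataclasses import dataclass, field
from enum import Enum


class TaskState(Enum):
    """Hypothetical states for a single parallel task."""
    TODO = "to do"
    IN_PROGRESS = "in progress"
    DONE = "done"


@dataclass
class Feature:
    """A board card: one main workflow state plus independent parallel tasks."""
    key: str
    main_state: str
    parallel_tasks: dict = field(default_factory=dict)  # task name -> TaskState

    def set_task(self, name: str, state: TaskState) -> None:
        """Create or update a parallel task without touching the main state."""
        self.parallel_tasks[name] = state

    def all_tasks_done(self) -> bool:
        """True when every parallel task on the card has reached DONE."""
        return all(s is TaskState.DONE for s in self.parallel_tasks.values())


# Hypothetical card: development is under way while tests are already being written.
feature = Feature(key="FEATURE-1", main_state="Development")
feature.set_task("Test development", TaskState.IN_PROGRESS)
feature.set_task("Community documentation", TaskState.TODO)
print(feature.main_state, {k: v.value for k, v in feature.parallel_tasks.items()})
```

Keeping the parallel tasks separate from the main state means the card stays in a single board column while test development or documentation progresses at its own pace.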

The first Agile release was JBoss EAP 7.1 (WildFly 11). For this release the manager wanted to involve Quality Assurance earlier in the process and not at the end. He wanted to agree on what would be delivered before implementing it, by iterating over a document. They didn't care about the form of the document; they just wanted to agree that they had a requirement that was useful enough and that the team was able to implement. He also wanted to encourage the developers to write more community documentation and, at the same time, to make the status of the release more visible to upper management or whoever wanted to check it.

When they finished this first application of Kanban, they saw that planning went more smoothly and that the analysis phase really helped. People could see the status of the project. Parallel tasks really helped to model reality - in many cases test development happened earlier. So in this regard it was a success. Still, there were some problems. Development time was small relative to the other phases, which they didn't like. Once you merge something you have to fix it - there's no easy way out. Testing still happened late for some of the features. Some people, especially in Quality Assurance, took the process too literally.

Below we have KEDE presented for the period from the release of EAP 7 (WildFly 10) in May 2016 till the release of EAP 7.1 (WildFly 11) in December 2017. The efficiency of knowledge discovery improved 25%. For this period they had an average KEDE=2. We have to keep in mind that the team also spent time developing Jirban. For that effort we have data too.
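The percentages quoted for these periods can be checked against the average KEDE values given in the text (2.5, 1.6 and 2). The short sketch below assumes that each percentage is a plain relative change measured against the earlier period's average; the function name is ours and is not part of any KEDE tooling.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


# Average KEDE values as reported in the text.
kede_early = 2.5       # period with average KEDE=2.5
kede_pre_eap7 = 1.6    # fourteen months before the EAP 7 release
kede_eap71 = 2.0       # EAP 7 release till EAP 7.1 release

drop = percent_change(kede_early, kede_pre_eap7)
gain = percent_change(kede_pre_eap7, kede_eap71)

print(f"change before EAP 7: {drop:.0f}%")
print(f"change towards EAP 7.1: {gain:.0f}%")
```

Under that assumption the drop from 2.5 to 1.6 is 36% and the rise from 1.6 to 2 is 25%, matching the figures in the text. Note that a 25% gain on the lower base does not undo a 36% drop, which is why the average stays below 2.5.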

For the next release, JBoss EAP 7.2 (WildFly 14), there was an interesting new requirement - to deliver in shorter cadences for the OpenShift cloud offering. The goal was to bring a process that often took something like 18 months down to three. A sixfold reduction - a huge difference! It would look something like this: a new delivery stream based on quarterly WildFly releases. Each WildFly release is time-boxed to three months. Every third or fourth release rolls into a traditional EAP bare-metal release with a long support cycle. To make this possible they switched to feature branching. Everything was developed on dedicated branches by teams that were assembled dynamically.

Teams were organized as follows: developers came together, called in the QAs, and then came together with the documentation person. The team worked in isolation and had to complete all requirements before they were allowed to merge back to the main line. The process was more parallelized and the work could move between states freely.
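The rule that a feature team must complete all requirements before merging back to the main line can be sketched as a simple gate. This is only an illustration under our own assumptions: the three checks (development, QA, documentation) mirror the roles mentioned above, while the class, field and branch names are hypothetical and do not describe the team's real tooling.

```python
from dataclasses import dataclass


@dataclass
class FeatureBranch:
    """Hypothetical record of what a feature team has finished on its branch."""
    name: str
    development_done: bool = False
    qa_done: bool = False
    docs_done: bool = False

    def missing(self) -> list:
        """Names of the requirements that are still open."""
        checks = {
            "development": self.development_done,
            "QA": self.qa_done,
            "documentation": self.docs_done,
        }
        return [label for label, done in checks.items() if not done]

    def can_merge(self) -> bool:
        """The branch may merge to the main line only when nothing is missing."""
        return not self.missing()


branch = FeatureBranch(name="feature/example", development_done=True, qa_done=True)
print(branch.can_merge(), branch.missing())

branch.docs_done = True
print(branch.can_merge(), branch.missing())
```

Because the gate is evaluated per branch, unfinished features stay isolated and cannot delay a time-boxed release of the main line.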

They also renamed Jirban to Overbård and added more features to it[10].

The WildFly team produced their first product release with this system in April 2018, based on WildFly 12 from February 28, 2018. So they proved that they could do it. Regressions were reduced a lot. The golden rule of seven was proven wrong for the first time: they needed only two extra builds to finish the release - a huge improvement. There was some overhead and some people complained about it, but when the manager told them "you know the alternative is to do Scrum", they got it.

The results according to the manager:

  • Kanban can handle complex projects with some modifications
  • Kanban helps focus on improving work processes
  • Proper tooling is necessary for success

The manager thinks planning is dead: when a team works in very short iterations, it just doesn't matter. It takes courage to do the right thing - to choose and evolve a process which is right for your team[12].

Below we have KEDE presented for the period from the release of EAP 7.1 (WildFly 11) in December 2017 till the release of EAP 7.2 (WildFly 14) in January 2019. The efficiency of knowledge discovery remained the same as for the previous release. For this period they had an average KEDE=2. It looks like they had a very slow start to 2018, but after that they got back to the previous level.

Conclusion

The diagram below shows all periods in the life of WildFly: the startup days, the waterfall period up to 2015, and then the Kanban period up to 2021.

The knowledge discovery efficiency (KEDE) is different in each of the periods. It was highest during the startup days, then dropped significantly. After that, with the help of the Kanban method, KEDE increased but never got back to the old startup levels. In all periods, what we measured using KEDE was in line with the actual events as reported by the manager.

We can conclude that KEDE correlates with what the coaches and the managers reported.

Physicians use a thermometer in order to get an understanding about what is going on inside the black box that is a human organism. You can use KEDEHub the same way - to get visibility into the black box that is an organization developing software.

The present-day attitude toward measuring knowledge work productivity very much resembles how temperature was understood 400 years ago[13]. It is argued that the manager's touch captures richer information than any tool. We don't agree. It is time to start measuring the productivity of knowledge work by its essence, which is the acquisition and application of knowledge. That is what knowledge discovery efficiency (KEDE), the one and only scientifically based metric, is used for!

By acquiring a deeper understanding of how efficient the knowledge discovery process is, we can tell whether there are constantly changing requirements, a mismatch between the skills applied and the work requested, broken communication with the business, etc.

Managers and decision-makers who want to unleash the untapped potential in their organization can use KEDEHub, with our scientific and non-intrusive method, to get visibility into the black box that is an organization developing software.

Appendix A: Methodology

For this release of the report we have included only repositories that are hosted on GitHub. Repositories that have been forked from other repositories have been excluded from our analysis, so some genuine development activities may not have been included. Some projects may be affected more than others. In this report we have included activity not only on the main branch but also on all other branches. Hence commits that have not yet reached the main branch, or are for any reason kept out of the main branch, have been included.

List of analyzed repositories:

References

1. Kanban @ Red Hat JBoss EAP: Supercharged Agility with Kanban

2. PDF Kanban @ Red Hat JBoss EAP: Supercharged Agility with Kanban

3. Going Agile with Kanban - the Red Hat JBoss EAP

4. PDF Going Agile with Kanban - the Red Hat JBoss EAP

5. WildFly vs. JBoss EAP: How these Red Hat application servers differ

7. Red Hat Completes Acquisition of JBoss

8. As part of the rename of the JBoss Application Server project to WildFly, the jboss-as git repo has been moved

9. Jirban

10. Overbård

11. Dimitris Andreadis

12. Using Kanban with Overbård to Manage Development of Red Hat JBoss EAP

13. Why Doctors Reject Tools That Make Their Jobs Easier

14. Red Hat Signs Definitive Agreement to Acquire JBoss

15. JBoss EAP 7 Maintenance Schedule
