What is Knowledge Discovery Efficiency (KEDE)?

A metric for measuring the capability of software development organizations.

The ultimate goal of science, as the French physicist Jean Perrin stated, should be "to substitute an invisible simplicity for a visible complexity".

The knowledge, skills, competencies and experience embodied in people are their capital, their human potential. If knowledge is acquired but not fully utilized, software developers leave potential on the table.

How can you know whether human potential is untapped? How do you know there is potential in the first place? You cannot look inside people's heads and see the potential there. The truth is that you know the potential was there only after you see knowledge applied. Knowledge workers are the ones who apply it.

Utilization of human capital can be measured by how efficiently knowledge workers acquire and apply knowledge.

Human capital is utilized when it is applied to create economic value. If knowledge is acquired but not applied, the human capital is not utilized. Increasing the capability of knowledge workers therefore means increasing the utilization of human capital in their organizations.

In the picture below, on the right, Margaret Hamilton, whose code saved the Apollo 11 Moon landing mission in 1969, is being awarded the Presidential Medal of Freedom by President Obama.

In the picture on the left, Margaret stands next to part of the computer source code she and her team developed for the Moon landing. What we witness is an instance of the manual part of knowledge work. The software developers had to type all of those pages manually. All the knowledge went from their brains, through their fingers, and ended up as symbols on sheets of paper. The symbols are the tangible output of the knowledge they already had and the new knowledge they acquired during the work. The symbols are not the knowledge itself, but the trace the knowledge left on the paper.

Let's approach the capability of software development organizations from first principles. What do software developers actually do? Assuming software developers spend all their time working, with no use of social media during business hours, software developers do two things:

  • acquiring the missing information about "What" tangible output to produce and "How" to produce it;
  • producing the tangible output.

“Missing information” is defined by the Information Theory introduced by Claude Shannon[1].

“Missing information” is the average number of “Yes/No” questions software developers need to ask in order to gain the knowledge needed to produce the output.
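As a rough illustration of this definition, the sketch below computes Shannon entropy in bits for a set of possible answers. When the alternatives are equally likely and their count is a power of two, the result is exactly the number of yes/no questions needed; for other distributions it is best read as a lower bound on the average. The function name and the example distributions are assumptions made for illustration and are not part of KEDE itself.

```python
import math


def missing_information_bits(probabilities):
    """Shannon entropy in bits for a discrete probability distribution.

    Hypothetical helper: read the result as the average number of
    yes/no questions needed to identify the actual outcome.
    """
    if any(p < 0 for p in probabilities):
        raise ValueError("probabilities must be non-negative")
    if not math.isclose(sum(probabilities), 1.0):
        raise ValueError("probabilities must sum to 1")
    # Outcomes with zero probability contribute nothing to the sum.
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)


# Eight equally likely alternatives: three halving questions are enough.
print(missing_information_bits([1 / 8] * 8))

# One certain outcome: nothing is missing, so no questions are needed.
print(missing_information_bits([1.0]))
```

The second call shows the limiting case the article relies on later: when the developer already knows the answer, the missing information is zero.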

A software developer needs to acquire the missing information from many sources. At a bare minimum, the business should answer questions about what features to develop, and the user manuals should answer questions about how to use the technology. This constant stream of questions asked and answers received is called Knowledge Discovery.

A Knowledge Discovery Process consists of asking questions in order to acquire the missing information about "What" tangible output to produce and "How" to produce it.

Capability for software developers means efficient acquisition and application of knowledge. Existing knowledge is the easiest and fastest to discover: it is already there, and one just applies it. In other words, knowledge discovery is most efficient when existing knowledge is applied. The more existing knowledge developers possess, and thus the less new knowledge they have to discover, the better.

The less information is missing when producing the output, the more efficient the knowledge discovery process is.

How could an outside observer know which knowledge in a person's head is new and which is old? How do you know there is knowledge acquired in the first place? You cannot look inside people's heads and see the knowledge there. The truth is that you know the knowledge was there only after you see it applied.

KEDE is the ratio between the amount of knowledge a software developer applied in a given time period and a predetermined constant representing the estimated maximum amount of knowledge that could possibly be applied in the same period.
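A minimal sketch of this ratio, scaled to the 0 to 100 range used in this article, might look like the following. It assumes that applied knowledge is approximated by the number of symbols typed. The function name and the ceiling of 12,500 symbols per hour are placeholders chosen for illustration; the text above does not state the actual constant.

```python
# Placeholder ceiling, assumed for illustration only. The article says the
# constant is an estimated maximum but does not give its value.
ASSUMED_MAX_SYMBOLS_PER_HOUR = 12_500


def kede(symbols_typed, hours_worked,
         max_symbols_per_hour=ASSUMED_MAX_SYMBOLS_PER_HOUR):
    """Return a KEDE-style score between 0 and 100 (hypothetical helper)."""
    if hours_worked <= 0:
        raise ValueError("hours_worked must be positive")
    if symbols_typed < 0:
        raise ValueError("symbols_typed must be non-negative")
    # The most that could have been typed in the period, under the assumption.
    ceiling = hours_worked * max_symbols_per_hour
    score = 100.0 * symbols_typed / ceiling
    # Cap at 100 in case the assumed ceiling was set too low.
    return min(score, 100.0)


# 20,000 symbols in an 8-hour day against a ceiling of 100,000 symbols.
print(kede(20_000, 8))
```

Keeping the ceiling as a parameter makes the assumption explicit: changing it rescales every score, so comparisons only make sense when the same constant is used throughout.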

KEDE is an acronym for KnowledgE Discovery Efficiency. It is pronounced [ke.dɚ].

KEDE has the following properties:
  • Minimum value of 0 and maximum value of 100.
  • KEDE approaches 0 when the missing information is infinite. That is the case of humans creating new knowledge. Examples are intellectuals such as Albert Einstein and startups developing new technologies such as PayPal.
  • KEDE approaches 100 when the missing information is zero. That is the case of an omniscient being...like God!
  • KEDE will be higher if you don't spend time discovering new knowledge and instead just apply existing knowledge.
  • KEDE is anchored to the natural constraint of maximum possible typing speed and thus supports comparisons across contexts, programming languages and applications.

We would expect a KEDE of 20 for an expert full-time software developer who primarily applies existing knowledge but also creates new knowledge when needed.

To increase KEDE in an organization, you have to know "what" you are doing and "how" to do the "what". The more knowledge you apply, the greater KEDE is. If you start a project in a business domain you know nothing about, with a technology you know nothing about, KEDE will approach zero.

KEDE is low when developers either did not possess the knowledge needed or possessed it but were prevented from applying it.

KEDE is high when developers possess the knowledge needed and apply it. The less missing information, the better. The fewer questions, the better. Fewer questions asked means a lower level of perplexity for the humans involved.

For that we must be able to count the number of questions asked. Unfortunately, in real settings the questions asked are invisible. They are asked by the knowledge workers either to themselves or to many other people. There is no way to get inside a human's head and count the questions asked while discovering knowledge. The process of discovering knowledge is a black box. We pragmatically adopt the positivist credo that science should be based on observable facts, and decide to infer the number of questions asked solely from the observable quantities we can measure. In reality the only thing we can measure is the tangible output - for software developers that's code.

For that we turn to psychology, which studies mental effort and cognitive load. Fortunately, there is scientific research that can help. It was done by Daniel Kahneman, who was awarded the 2002 Nobel Memorial Prize in Economic Sciences for his work on the psychology of judgment and decision-making. In the 1970s Kahneman discovered that mental effort is a volitional, intentional process, something that organisms apply, and as such it corresponds to what organisms are actively doing and not to what is passively happening to them[3]. Different mental activities impose different demands on the limited cognitive capacity. An easy task demands little mental effort, and a difficult task demands much. Because the total quantity of mental effort that can be exerted at any time is limited, concurrent activities that require attention tend to interfere with one another. Human capacity to perform concurrent perceptual tasks is limited.

As humans become skilled in a task, its demand for energy diminishes. The pattern of activity associated with an action changes as skill increases, with fewer brain regions involved. Talent has similar effects. Highly intelligent individuals need less effort to solve the same problems. The knowledge is stored in memory and accessed without intention and without mental effort[2].

Asking questions is an effortful task, and humans cannot type while doing it.

Since humans can either type or ask a question, but not both at once, we conclude: if a symbol was NOT typed, then a question was asked. Now we have a means to count the number of questions asked during a knowledge discovery process.

KEDE infers the number of questions asked using the number of symbols typed.
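The inference can be sketched as follows, assuming that every symbol that could have been typed but was not counts as exactly one question. Under that assumption, the ratio of symbols typed to the maximum can be rewritten in terms of the average number of questions per typed symbol, and the limiting properties listed earlier fall out directly. All names here are hypothetical, and the rewritten form is a derivation from the ratio definition rather than a formula quoted from the text.

```python
import math


def inferred_questions(symbols_typed, max_symbols):
    """Questions asked = symbols that could have been typed but were not."""
    if not 0 <= symbols_typed <= max_symbols:
        raise ValueError("need 0 <= symbols_typed <= max_symbols")
    return max_symbols - symbols_typed


def questions_per_symbol(symbols_typed, max_symbols):
    """Average number of inferred questions per symbol actually typed."""
    questions = inferred_questions(symbols_typed, max_symbols)
    if symbols_typed == 0:
        # Nothing was typed, so the missing information is unbounded.
        return math.inf
    return questions / symbols_typed


def kede_from_questions(avg_questions):
    """Typed / max rewritten as 1 / (1 + questions per symbol), scaled to 100."""
    return 100.0 / (1.0 + avg_questions)


# 20,000 symbols typed out of a possible 100,000 (illustrative numbers).
h = questions_per_symbol(20_000, 100_000)
print(h, kede_from_questions(h))

# Limiting cases: no questions at all, and nothing typed at all.
print(kede_from_questions(0.0), kede_from_questions(math.inf))
```

The two limiting cases match the properties stated earlier: zero missing information gives 100, and unbounded missing information gives 0.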

KEDE is calculated per software developer. However, no individual software developer ever works in isolation but always interacts with others in an organization. That organization might be a software development team, a project team, or an entire company. To calculate KEDE for an organization you average the individual KEDE of its members. Many people overestimate how much of the knowledge discovery efficiency is a function of the skills and abilities of the individual developers and underestimate how much is determined by the system they operate in.
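The averaging step for an organization can be sketched in a few lines. The developer names and scores below are made up for illustration, and the plain arithmetic mean is an assumption; the text does not say whether any weighting, for example by hours worked, is applied.

```python
from statistics import mean


def organization_kede(individual_kede):
    """Average the individual KEDE values of an organization's members.

    `individual_kede` maps a developer identifier to that developer's KEDE.
    Hypothetical helper: an unweighted mean is assumed.
    """
    values = list(individual_kede.values())
    if not values:
        raise ValueError("an organization needs at least one member")
    return mean(values)


# Made-up scores for a three-person team.
team = {"dev_a": 10.0, "dev_b": 20.0, "dev_c": 30.0}
print(organization_kede(team))
```

Because the result is a single average, it says nothing about how the system the team works in shapes each individual score, which is the caveat raised above.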

Since KEDE reflects the capability of software developers in terms of how efficiently they discover and apply knowledge, it could be used as an input to performance reviews and salary negotiations. Some people would object that, because the KEDE calculation takes the number of symbols typed per hour as its input, it would be trivially easy for software developers to game. Others might say that a whole company could engage in an effort to game its overall KEDE standing. With this in mind, we should recall the McNamara fallacy and Goodhart's Law, and take a system perspective rather than treating KEDE as a single number.

What is the value of knowing KEDE?

Knowing KEDE, you can compare capability within and among your company's functional areas and between your company and others. You are now able to unleash your company's untapped human capital by objectively and scientifically answering questions such as:

  • What is the capability of your company compared with the average level in the industry?
  • What is the capability of your team compared with the average level in the company?
  • Did the latest re-org increase the capability of your company?
  • Did the new recruits increase the capability of your team?
  • Did the Agile transformation increase the capability of your team?

KEDE says that if your company is more efficient in terms of knowledge discovery it should produce more working software.

When you start a software development project, it is best to establish an iterative process for managing the project constraint, which is the Knowledge Discovery Process. The goal is to deliberately reduce the perplexity developers face and to gain more knowledge per unit of time.

Works Cited

1. Shannon, C. E. (1948). A Mathematical Theory of Communication. Bell System Technical Journal, 27(3), 379-423. doi:10.1002/j.1538-7305.1948.tb01338.x

2. Kahneman, D. (2011). Thinking, fast and slow. Farrar, Straus and Giroux.

3. Kahneman, D. (1973). Attention and Effort. Englewood Cliffs, NJ: Prentice-Hall.

4. Deming, W. E. (2018). Out of the Crisis. Cambridge, MA: The MIT Press.