What is Knowledge Discovery Efficiency (KEDE)?

A metric for measuring the capability of software development organizations.

Abstract

Software development is all about delivering desired outcomes. Developers use their skills, experience, and knowledge to create software that meets the needs of their clients.

However, there's a piece of the puzzle that's often overlooked: the knowledge they didn't have before starting a task. This lack of knowledge impacts the time and effort required in software development.

When the task is completed, the developer has acquired new knowledge that they can apply to future projects. That's where Knowledge Discovery Efficiency (KEDE) comes in. KEDE reflects the knowledge that was discovered during the task.

KEDE definition

"The ultimate goal of science is to explain the complications of the visible in terms of the simplicity of the invisible." ~ Jean Perrin[5]

In 1999, Peter Drucker published “Knowledge-Worker Productivity: The Biggest Challenge”, in which he claimed:

“The most important, and indeed the truly unique, contribution of management in the 20th century was the fifty-fold increase in the productivity of the manual worker in manufacturing. The most important contribution management needs to make in the 21st century is similarly to increase the productivity of knowledge work and knowledge workers.”[2]

To achieve a fifty-fold increase in the productivity of knowledge workers, we need to know two things: first, who the knowledge workers are and, second, how to measure their productivity.

Knowledge workers are those who need to discover knowledge in order to produce tangible output.

According to the above definition, people who have bought furniture and need to assemble it are knowledge workers. They need to read the instructions and discover how to use the tools to assemble the parts into a tangible output in the form of a piece of furniture.

Software developers are knowledge workers, because they apply knowledge to produce tangible output in the form of source code. In the picture below to the right we have Margaret Hamilton, whose code saved the Moon landing space mission of Apollo 11 in 1969, being awarded the Presidential Medal of Freedom by President Obama.

In the picture to the left we have Margaret standing next to part of the computer source code she and her team developed for the Moon landing. What we witness is the tangible output of knowledge work. The software developers had to type all of those pages manually. All the knowledge went from their brains, through their fingers, and ended up as symbols on sheets of paper. The difficult and time-consuming activity in creating the Moon landing software was not typing their already-available knowledge into source code. It was effectively acquiring knowledge they did not already have, i.e. getting answers to the questions they already had. Even more specifically, it was discovering the knowledge necessary to make the system work that they did not know they were missing. We call this discovery and transformation of invisible knowledge into visible, tangible output a Knowledge Discovery Process.

A Knowledge Discovery Process consists of asking questions in order to acquire the missing information about "what" tangible output to produce and "how" to produce it.

“Missing information” is defined in information theory, introduced by Claude Shannon[1], as the average number of binary “Yes/No” questions knowledge workers need to ask in order to gain the knowledge needed to produce the output.
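
As a minimal sketch of this definition, the Python snippet below computes Shannon entropy in bits for a set of alternatives with known probabilities. The function name `missing_information_bits` and the example distributions are illustrative assumptions, not part of the KEDE definition. For equally likely alternatives whose count is a power of two, the entropy equals the exact number of yes/no questions; in general it is a lower bound on the average.

```python
import math


def missing_information_bits(probabilities):
    """Shannon entropy in bits for a discrete distribution (illustrative helper)."""
    if not math.isclose(sum(probabilities), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    # Alternatives with zero probability contribute nothing to the sum.
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


# Eight equally likely alternatives: three yes/no questions are needed.
print(missing_information_bits([1 / 8] * 8))  # 3.0

# One certain alternative: no information is missing.
print(missing_information_bits([1.0]))  # 0.0
```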

We introduce KEDE - a new measure for the efficiency of the knowledge discovery process. KEDE is an acronym for KnowledgE Discovery Efficiency. It is pronounced [ke.de].

Software developers apply their knowledge to deliver desired outcomes. KEDE measures the knowledge that they didn't have prior to starting a task, since it is this lack of knowledge that significantly impacts the time and effort required in software development.

If software developers don't have the knowledge needed, they have to discover it. The knowledge they have to discover is the total of what they don't know they don't know and what they know they don't know. Prior knowledge is the easiest and fastest to discover: it is already there, and one just applies it. In other words, knowledge discovery is most efficient when prior knowledge is applied. Conversely, when a lot of knowledge is missing, knowledge discovery is less efficient. By measuring the knowledge discovered during the process, we can understand how efficiently the developer was able to acquire and apply the knowledge necessary to complete the task.

A more efficient knowledge discovery process doesn't always mean a more productive one!

It's important to note that KEDE does not directly measure productivity in software development. Productivity in software development is determined by the outcome value produced by the code, and not simply by the amount of knowledge discovered during its creation. The ratio of outcome value produced to knowledge discovered, as captured by the KEDE metric, is what ultimately determines productivity.

More detailed information on how to measure productivity in software development can be found here along with a numerical example.

KEDE reflects the capability of software developers to efficiently discover and apply knowledge.
KEDE has the following properties:
  • Minimum value of 0 and maximum value of 100.
  • KEDE approaches 0 when the missing information is infinite. That is the case of humans creating new knowledge. Examples are intellectuals such as Albert Einstein and startups developing new technologies, such as PayPal.
  • KEDE approaches 100 when the missing information is zero. That is the case of an omniscient being...like God!
  • KEDE will be higher if you don't spend time discovering new knowledge, but only apply prior knowledge.
  • KEDE is anchored to the natural constraints of maximum possible typing speed and the capacity of the cognitive control of the human brain. Thus it supports comparisons across contexts, programming languages and applications.

We would expect a KEDE of 20 for an expert full-time software developer who mostly applies prior knowledge but also creates new knowledge when needed.
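
The limiting properties listed above can be illustrated with a small Python sketch. This section does not state a formula, so the form `100 / (1 + h)` used below is an assumption, chosen only because it yields 100 when the missing information `h` is zero and approaches 0 as `h` grows. The function name and the reading of `h` as questions per symbol are likewise hypothetical.

```python
def kede_from_missing_information(h):
    """Map missing information h to a score between 0 and 100.

    ASSUMPTION: 100 / (1 + h) is an illustrative form that matches the
    stated limits; it is not a formula quoted from the text.
    """
    if h < 0:
        raise ValueError("missing information cannot be negative")
    return 100.0 / (1.0 + h)


# The score falls from 100 towards 0 as the missing information grows.
for h in (0, 1, 4, 99, 1_000_000):
    print(h, round(kede_from_missing_information(h), 4))
```

Under this assumed form, a score of 20 would correspond to four questions asked per symbol contributed.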

Measuring software development

"...my teaching is a method to experience reality and not reality itself, just as a finger pointing at the moon is not the moon itself. A thinking person makes use of the finger to see the moon. A person who only looks at the finger and mistakes it for the moon will never see the real moon."~ Buddha via Thich Nhat Hanh

Knowledge Discovery Efficiency (KEDE) is a general form of a metric for all Knowledge Discovery processes. That means that for each specific context we have to define a specific way to calculate KEDE. For instance, for the knowledge work of a surgeon, both the knowledge discovered and the maximum amount of knowledge that could possibly be discovered need to be measured in a way specific to surgery.

To calculate KEDE for software developers we must be able to count the number of questions asked. Unfortunately, in real settings the questions asked are invisible. They are asked by the knowledge workers either to themselves or to many other people. There is no way to get inside a human's head and count the questions asked while discovering knowledge. The process of discovering knowledge is a black box. We pragmatically adopt the positivist credo that science should be based on observable facts, and decide to infer the number of questions asked solely from the observable quantities we can measure. In reality the only thing we can measure is the tangible output - for software developers that's code.

For that we turn to psychology, which studies mental effort and cognitive load. Fortunately, there is scientific research that can help!

It was done by Daniel Kahneman, who was awarded the 2002 Nobel Memorial Prize in Economic Sciences for his work on the psychology of judgment and decision-making. In the 1970s Kahneman discovered that mental effort is a volitional, intentional process, something that organisms apply, and as such, it corresponds to what organisms are actively doing and not to what is passively happening to them[3].

Different mental activities impose different demands on the limited cognitive capacity. An easy task demands little mental effort, and a difficult task demands much. Because the total quantity of mental effort which can be exerted at any time is limited, concurrent activities which require attention tend to interfere with one another. Human capacity to perform concurrent perceptual tasks is limited.

As humans become skilled in a task, its demand for energy diminishes. The pattern of activity associated with an action changes as skill increases, with fewer brain regions involved. Talent has similar effects. Highly intelligent individuals need less effort to solve the same problems. The knowledge is stored in memory and accessed without intention and without mental effort[4].

Asking questions is an effortful task, and humans cannot type while doing it.

Since humans can either type or ask a question, we conclude that if a symbol was NOT typed, then a question was asked. Now we have a means to count the number of questions asked during a knowledge discovery process.

KEDE infers the number of questions asked using the number of symbols of source code contributed.

If there was a way to get inside a human's head and count the questions asked we would have done that. Unfortunately there is no such way. Hence we have to use the number of symbols of source code contributed to infer the number of questions asked.
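
The inference described above can be sketched in a few lines of Python. The sketch assumes a fixed upper bound on how many symbols a person could type per hour. The parameter `max_symbols_per_hour`, its example value, and both function names are hypothetical and are not specified in this section.

```python
def inferred_questions(symbols_typed, hours_worked, max_symbols_per_hour):
    """Count one question for every symbol that could have been typed but was not."""
    if hours_worked <= 0 or max_symbols_per_hour <= 0:
        raise ValueError("hours_worked and max_symbols_per_hour must be positive")
    max_symbols = hours_worked * max_symbols_per_hour
    if not 0 <= symbols_typed <= max_symbols:
        raise ValueError("symbols_typed must lie between 0 and the maximum possible")
    return max_symbols - symbols_typed


def kede_percent(symbols_typed, hours_worked, max_symbols_per_hour):
    """Share of the maximum possible output actually typed, on a 0-100 scale.

    ASSUMPTION: this ratio is an illustrative form, not a quoted formula.
    """
    questions = inferred_questions(symbols_typed, hours_worked, max_symbols_per_hour)
    return 100.0 * symbols_typed / (symbols_typed + questions)


# Hypothetical numbers: 8 hours of work, an assumed ceiling of 10,000 symbols per hour.
print(inferred_questions(16_000, 8, 10_000))  # 64000
print(kede_percent(16_000, 8, 10_000))        # 20.0
```

Passing the ceiling in as a parameter keeps the assumption visible: a different typing limit changes every inferred question count.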

What is the value of knowing KEDE?

KEDE is the first metric that allows for quantifying the human capital of any organization, as well as the happiness, productivity, and collaboration of knowledge workers.

Knowing KEDE, you can compare capability among your company's functional areas and between your company and others. You are now able to objectively and scientifically answer questions such as:

  • What is the capability of your company compared with the average level in the industry?
  • What is the capability of your team compared with the average level in the company?
  • Did the latest re-org increase the capability of your company?
  • Did the new recruits increase the capability of your team?
  • Did the Agile transformation increase the capability of your team?

KEDE says that if your company is more efficient in terms of knowledge discovery it should produce more working software.

When you start a software development project, it is best to establish an iterative process for managing the project's constraint, which is the ability to discover knowledge. The goal is to deliberately reduce the perplexity developers face and to gain more knowledge per unit of time.

Improving organizational capability using KEDE

According to the behavioral sciences, capability describes what a person can do in a specific environment, while performance describes what a person actually does in a specific environment[6]. Capability for software developers means efficient acquisition and application of knowledge.

In order to increase KEDE in an organization you have to know "what" you are doing and "how" to do the "what". The more knowledge you apply the greater KEDE is.

KEDE is low when developers did not possess the knowledge needed or they possessed the knowledge but were prevented from applying it.

KEDE is high when developers do possess the knowledge needed and apply it. The less missing information, the better. The fewer questions, the better. Fewer questions asked means a lower level of perplexity for the humans involved.

KEDE is calculated per software developer. However, no individual software developer ever works in isolation but always interacts with others in an organization. That organization might be a software development team, a project team, or an entire company. To calculate KEDE for an organization you average the individual KEDE of its members. Many people overestimate how much of the knowledge discovery efficiency is a function of the skills and abilities of the individual developers and underestimate how much is determined by the system they operate in.
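
The averaging step can be sketched as follows. The developer names and scores are made up for illustration, and the plain arithmetic mean reflects the sentence above rather than any weighting scheme, which this section does not describe.

```python
from statistics import mean


def organization_kede(individual_kede):
    """Average the individual KEDE scores of an organization's members."""
    if not individual_kede:
        raise ValueError("at least one member is required")
    return mean(individual_kede.values())


# Hypothetical individual scores for a three-person team.
team = {"dev_a": 12.0, "dev_b": 20.0, "dev_c": 28.0}
print(organization_kede(team))  # 20.0
```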

KEDE could be used as an input to performance reviews and salary negotiations. Some would object that, because the KEDE calculation takes the number of symbols contributed per hour as its input, it would be trivially easy for software developers to game. Others might say that a whole company could engage in an effort to game its overall KEDE standing. With this in mind, we should recall the McNamara fallacy and Goodhart's Law, and take a system perspective rather than using KEDE as a single number.

Further reading

Works Cited

1. Shannon, C. E. (1948). A Mathematical Theory of Communication. Bell System Technical Journal, 27(3), 379-423. doi:10.1002/j.1538-7305.1948.tb01338.x

2. Drucker, P. F. (1999). Knowledge-Worker Productivity: The Biggest Challenge. California Management Review, 41(2), 79–94. doi:10.2307/41165987

3. Kahneman, D. (1973). Attention and Effort. Englewood Cliffs, NJ: Prentice-Hall.

4. Kahneman, D. (2011). Thinking, fast and slow. Farrar, Straus and Giroux.

5. Perrin, J. (1916). Atoms. (D. L. Hammick, Trans.) London: Constable & Company Ltd.

6. Holsbeeke, L., Ketelaar, M., Schoemaker, M. M., & Gorter, J. W. (2009). Capacity, Capability, and Performance: Different Constructs or Three of a Kind? Archives of Physical Medicine and Rehabilitation, 90(5), 849–855. https://doi.org/10.1016/j.apmr.2008.11.015