Knowledge Reuse

How does code reuse affect the KEDE calculation?

Knowledge reuse in software development

“The defining characteristic of good reuse is not the reuse of software per se, but the reuse of human problem-solving”[4].

During software development, developers deal with growing amounts of knowledge. That knowledge can be explicit, in the form of code and other artifacts. It can also be tacit.

This knowledge is a valuable asset and, like every asset, it can be reused to reduce time and cost.

Knowledge reuse is more than code reuse. When we reuse code, we can build a bigger system at the same level of effort, since we can borrow the knowledge without having to learn it for ourselves.

Code is the reusable asset considered most often, but other assets can be reused as well[6]:

  • Algorithms: Algorithmic reuse applies the same algorithm as the solution each time the same type of problem occurs.
  • Architecture: The architecture is an organizational structure of a system or component.
  • Data: Data here means experience recorded during previous projects. Reusing it in a project makes it easier to achieve continuous improvement of the development process.
  • Design: The key to reusing design is to use the models to capture design knowledge and facilitate the early analysis of system properties.
  • Documentation: A document may contain important information about a project and can be reused for similar projects or for the next version of the project. New documents often share features of the old ones.
  • Estimation Template: A template for estimating a new project in order to forecast what it takes to complete it successfully.
  • Human Interface: An interface enables information to be passed between a human user and the hardware or software components of a computer system.
  • Business Domain Knowledge: Knowledge of the business domain generated during the software development process can be a valuable asset for a software company.
  • Models: A model can depict critical solutions and insights to a problem and hence can be considered an asset for an organization. A pattern that explains a recurring problem and its solutions can be expressed as a model. A model is a type of asset which may or may not implement a pattern specification.
  • Modules: A module is a file that contains instructions. "Module" implies a single executable file that is only a part of the application, such as a Linux library.
  • Plans: Plans are project plans. The planner can reuse parts of old plans for new versions.
  • Requirements: A condition or capability that must be met or possessed by a system or system component to satisfy a contract, standard, specification, or other formally imposed documents.
  • Service Contracts: A service contract, also called a reuse contract, is an agreement between the developers or designers of a reusable asset and its users. It acts as an interface for the reusers of the asset and documents how and why the asset is being reused. This information can help predict where and how the system should be tested, what problems might occur, and how to rectify them after the system has evolved.
  • Test Case/Test Design: Test cases developed for previous projects can be reused for the next project and so on. They can be reused many times for different projects belonging to the same family.

What is code reuse?

Coding from scratch takes time and effort. Hence, programmers frequently reuse code[2]. Many times, programming of well-defined problems amounts to simple look-up [3], first in one's own, and then in others' code repositories, followed by judicious copying and pasting.

Code reuse is the use of existing software to build new software. It is one of the holy grails of software development.

In the past, it was common for a company to develop all of the code for any application they produced. Nowadays, with the proliferation of good commercial libraries and open source projects, it makes much more sense to simply reuse code that others have developed. Such projects have been refined, debugged and tested by many developers around the world and have been used in many other projects.

Software development has become modular, with the use of distinct components as the building blocks of an application. The benefit of this approach is that developers don't need to understand every detail of every software component. This can translate into faster development because developers can concentrate on the core business logic instead of spending time reinventing the wheel.

Templates

A “clone” is a snippet of code that is surprisingly similar to code elsewhere, and can arise from copy-pasting practices[5].
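To make the idea of "similar code" concrete, here is a minimal sketch that scores how alike two snippets are. It is only an illustration: the functions `normalize` and `similarity` are hypothetical names, and comparing whitespace-normalized lines with Python's `difflib` is an assumed, simplified stand-in for the clone detection techniques used in the cited research.

```python
import difflib


def normalize(code: str) -> list[str]:
    """Strip indentation and drop blank lines so formatting changes do not matter."""
    return [line.strip() for line in code.splitlines() if line.strip()]


def similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1] describing how much two snippets overlap line by line."""
    return difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()


original = """
def area(w, h):
    return w * h
"""

# The same code pasted elsewhere with different spacing.
pasted = """
def area(w, h):

        return w * h
"""

unrelated = """
print("hello")
"""

print(similarity(original, pasted))     # identical after normalization
print(similarity(original, unrelated))  # no lines in common
```

A snippet pair scoring above some chosen threshold could be treated as a clone candidate. The threshold itself would have to be tuned, and this sketch does not catch clones in which identifiers were renamed.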

In the past 5-8 years, GitHub has emerged as a universal platform for maintaining repositories, with minimal boundaries between projects. Cross-project cloning is prevalent in GitHub, ranging from cloning a few lines of code to whole project repositories. Some of the projects serve as popular sources of clones, and others seem to contain more clones than their fair share. A considerable proportion of all cross-project clones are utility clones, where entire files, directories or even projects are copied with little to no modification[5].

We will refer to such utility clones as "templates".

Templates are one or more text files, or pieces of text, that are copied from somewhere else and added to a project. In the context of office clerks, an example might be a Tax Return template. In the context of software development, a template could be a ready-to-use open-source framework. Developers often copy the entire source code of popular frameworks such as Spring and React. Once cloned into different projects, the template files are seldom changed, so templates are mostly static across projects.

KEDEHub automatically finds and excludes such templates before calculating KEDE.
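The source does not describe how KEDEHub detects templates. As a rough illustration of the general idea only, the sketch below flags files whose exact content appears in more than one project and filters them out. Everything here is an assumption made for the example: the function names, the in-memory `projects` mapping, the SHA-256 content hash, and the rule that two projects are enough to call a file a template.

```python
import hashlib
from collections import defaultdict


def content_hash(text: str) -> str:
    """Fingerprint a file by its exact content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_templates(projects: dict[str, dict[str, str]], min_projects: int = 2) -> set[str]:
    """Return hashes of files whose content appears in at least min_projects projects."""
    seen = defaultdict(set)  # content hash -> names of projects containing it
    for project, files in projects.items():
        for text in files.values():
            seen[content_hash(text)].add(project)
    return {h for h, names in seen.items() if len(names) >= min_projects}


def exclude_templates(files: dict[str, str], template_hashes: set[str]) -> dict[str, str]:
    """Keep only the files that are not known templates."""
    return {path: text for path, text in files.items()
            if content_hash(text) not in template_hashes}


# Hypothetical data: project name -> {file path: file content}.
projects = {
    "shop": {"vendor/lib.js": "function f(){}", "app.js": "shop code"},
    "blog": {"lib/lib.js": "function f(){}", "main.js": "blog code"},
}

templates = find_templates(projects)
print(exclude_templates(projects["shop"], templates))
```

Exact hashing only catches files copied without any modification. Files copied with small edits would need a similarity measure instead.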

References

1. Melo, G., Oliveira, T., Alencar, P., & Cowan, D. (2020). Knowledge reuse in software projects: Retrieving software development Q&A posts based on project task similarity. PLoS ONE, 15(12), e0243852. https://doi.org/10.1371/journal.pone.0243852

2. Juergens, E., Deissenboeck, F., Hummel, B., & Wagner, S. (2009). Do code clones matter? Proceedings of the 31st International Conference on Software Engineering, 485–495. https://doi.org/10.1109/ICSE.2009.5070547

3. Sim, S. E., Clarke, C. L., & Holt, R. C. (1998). Archetypal source code searches: A survey of software developers and maintainers. Proceedings of the 6th International Workshop on Program Comprehension (IWPC’98), 180–187. IEEE.

4. Barnes, B. H., & Bollinger, T. B. (1991). Making reuse cost-effective. IEEE Software, 8(1), 13–24. https://doi.org/10.1109/52.62928

5. Gharehyazie, M., Ray, B., & Filkov, V. (2017). Some from here, some from there: Cross-project code reuse in GitHub. Proceedings of the 14th International Conference on Mining Software Repositories, 291–301. https://doi.org/10.1109/MSR.2017.15

6. Konda, B. M., & Mandava, K. (2010). A Systematic Mapping Study on Software Reuse. https://www.semanticscholar.org/paper/A-Systematic-Mapping-Study-on-Software-Reuse-Konda-Mandava/cc46a8ba591880c791cb3425e94a2d6f614ba016
