22.01.2025
Leoni Tischer
Bitsea
The Challenges of Maintaining Legacy Software
A quick, easy-to-understand overview is something many people want in life, especially when dealing with historically grown software systems. Even the developers themselves need a comprehensive overview of the system from time to time, although the focus during the development phase differs considerably from the focus later in the maintenance phase.
Where have monster classes formed? Where are the dependencies getting tangled up? Where are too many classes linked together? Am I at my mental limit or is the code too complex? Such questions can be essential in order to keep older software alive – not as a coma patient, but as an agile, robust pensioner that you only see on Mondays for a coffee party.
Introducing Bisquat2: Simplifying Static Analysis
Our Bisquat2 manages software regardless of its retirement status. The best results with static quality analysis usually come from using it continuously: the status of the software is checked at regular intervals, so that problems can be dealt with in time. Companies often integrate tools such as SonarQube into the development process. However, our experience teaches us that many developers cannot interpret the results correctly, often misread them and gradually develop an antipathy towards the constant warnings, to the point where they eventually ignore the messages completely and switch them off. We know this pattern from real life, where people only drag themselves to the doctor once the body's warning signals have become unmistakable. So why not do the same with highly complex software systems and examine them before the symptoms become acute?
Bisquat2 is designed to avoid long tool chains and a huge number of consecutive work steps. Upload the code once, select the test tool, check the default values, press the button and off you go. It’s like a quick run-through of medical check-ups to locate and eliminate minor ailments.
Through its various analyzers, the static analysis provides very good insights into the complexity, dependencies, content and quality of the individual source files, and the antipattern and hotspot analyzers then make it possible to gain a system-wide overview. At the same time, images are generated from the data, which make the problems easy to understand even for the software layperson.
Analyzing Content: LOC, NOM, and CR Metrics
The Content Analyzer is responsible for the largest amount of data, the basic structure. It calculates the length of classes and methods (LOC, “Lines Of Code”), the number of methods (NOM, “Number Of Methods”), the number of remaining TODOs or FIXMEs, and the comment-to-code ratio (CR, “Comment Ratio”). Metrics such as LOC, NOM and CR are also calculated by other analysis tools, and there are generally recognized threshold values for assessing the quality of software and detecting anti-patterns.
If metrics such as LOC or NOM exceed the threshold values, it can be assumed that the code is hard to follow and is concentrated in a few oversized classes. If TODOs or FIXMEs are discovered in the code, this usually points to incomplete, suboptimal, superfluous or even non-functioning code.
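To make these numbers concrete, here is a small, made-up Java class with the content metrics a tool of this kind might report for it; the class and the values are purely illustrative, not Bisquat2 output:

```java
// Purely illustrative: a tiny class and its content metrics.
record Invoice(double total) {}

public class InvoicePrinter {                 // NOM = 2 (two methods)

    // TODO: support multiple currencies     (counted as one open TODO)
    public void print(Invoice invoice) {
        System.out.println(invoice.total());
    }

    /** Formats the total for a report; comment lines raise the CR. */
    public String format(Invoice invoice) {
        return "Total: " + invoice.total();
    }
}
// LOC = physical lines of the class; NOM = 2; 1 TODO, 0 FIXMEs;
// CR = comment lines / code lines.
```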
The comment ratio, recorded as CR, should be balanced, i.e. neither too high nor too low. Comments are important for subsequent developers to gain an understanding of the code, and complex passages naturally deserve more attention. With an overall view of the comment ratio, it is easy to assess whether too little or too much commenting has been done. Meaningful comments are important. Commented-out code, on the other hand, is a problem in itself, which the analyzer also looks for specifically.
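The difference between a meaningful comment and commented-out code might look like this (a made-up sketch):

```java
class Discounts {
    // Helpful comment: explains WHY, not just what the code does.
    // Orders at or above the threshold get the negotiated bulk discount.
    static final double BULK_THRESHOLD = 100.0;

    double apply(double amount) {
        // Commented-out code like the next line is flagged separately:
        // return amount * 0.9;
        return amount >= BULK_THRESHOLD ? amount * 0.95 : amount;
    }
}
```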
Dependency Analysis: Understanding DIT, NOC, and CBO
Dependencies are examined by our Dependency Analyzer, which also calculates metrics. For example, DIT (“Depth of Inheritance Tree”) measures the maximum path length from a class to the root of the inheritance tree. If this value is too high, the inheritance hierarchy is too convoluted; conversely, a value that is too low indicates that object-oriented programming is not being used appropriately.
NOC (“Number Of Children”) also helps in understanding the inheritance tree: it counts the classes that inherit directly from a class, i.e. those one level below the parent class.
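A small, hypothetical class hierarchy shows how both values are read; note that conventions differ between tools on whether the root itself, or java.lang.Object, counts as level zero:

```java
// Hypothetical hierarchy: reading DIT and NOC.
class Vehicle {}                    // DIT = 0 (counting Vehicle as the root)
class Car extends Vehicle {}        // DIT = 1
class Truck extends Vehicle {}      // DIT = 1
class SportsCar extends Car {}      // DIT = 2: Vehicle -> Car -> SportsCar

// NOC counts only direct subclasses:
// NOC(Vehicle) = 2 (Car, Truck); NOC(Car) = 1 (SportsCar); NOC(SportsCar) = 0.
```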
CBO (“Coupling Between Objects”), on the other hand, shows how many other classes a class is connected to, for example through the use of variables or methods. If the number of dependencies becomes too high, the maintenance and servicing of software is severely impeded. In large dependency networks, changes in one class have an impact on almost all classes that are linked to it. The effort involved becomes unmanageable.
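A sketch of what such coupling looks like in code; the classes are invented for this example:

```java
// Hypothetical example: OrderService depends on four other classes,
// so CBO(OrderService) = 4; each of them is a reason it may have to change.
class Customer {}
class Payment {}
class Warehouse { boolean inStock() { return true; } }
class Mailer { void send(String message) { System.out.println(message); } }

class OrderService {
    private final Warehouse warehouse = new Warehouse();
    private final Mailer mailer = new Mailer();

    void order(Customer customer, Payment payment) {  // uses Customer, Payment
        if (warehouse.inStock()) {                    // uses Warehouse
            mailer.send("order confirmed");           // uses Mailer
        }
    }
}
```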
More abstractly, RFC (“Response For a Class”) counts the methods of a class together with all the methods that can be invoked from it in other classes. Here too, a high number indicates a lack of clarity and poor maintainability.
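A hypothetical counting example; the exact counting rules vary slightly between tools:

```java
// Hypothetical example: counting RFC for ReportService.
class Database { String load() { return "data"; } }

class ReportService {
    private final Database db = new Database();

    String buildReport() {            // own method no. 1
        return header() + db.load();  // calls header() and Database.load()
    }

    String header() {                 // own method no. 2
        return "Report: ";
    }
}
// RFC(ReportService) = 2 own methods + 1 method reachable in another
// class (Database.load) = 3.
```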
Measuring Complexity: Cyclomatic Complexity and WMC
Another important pillar of the static analysis is the Complexity Analyzer. It calculates cyclomatic complexity, nesting and WMC (“Weighted Methods per Class”).
Cyclomatic complexity counts all decision points of the functions in a class, such as the branches of if-else constructs, switch-case statements, and for or while loops; the number of decision points plus one gives the number of linearly independent paths through the code. Nesting highlights the depth of these decision points. If several decision points and loops merge into one another, complexity accumulates: a loop inside an if condition, for example, sits at nesting level two.
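A small, invented method makes the counting concrete:

```java
// Illustrative method with its decision points and nesting levels noted.
class ComplexityExample {
    int countPositiveEven(int[] values) {
        int count = 0;
        for (int v : values) {         // decision point 1, nesting level 1
            if (v > 0) {               // decision point 2, nesting level 2
                if (v % 2 == 0) {      // decision point 3, nesting level 3
                    count++;
                }
            }
        }
        return count;
    }
    // Cyclomatic complexity = 3 decision points + 1 = 4 linearly
    // independent paths; maximum nesting depth = 3.
}
```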
The WMC metric sums the cyclomatic complexity (i.e. the number of linearly independent paths) of all methods of a class in order to determine the overall complexity of the class.
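Sketched on a made-up class, the sum looks like this:

```java
// Hypothetical class; cyclomatic complexity (CC) per method in comments.
class Validator {
    String name() {                    // CC = 1: no decision points
        return "validator";
    }

    boolean nonEmpty(String s) {       // CC = 2: one if
        if (s == null) return false;
        return !s.isEmpty();
    }

    int countValid(String[] items) {   // CC = 3: one loop plus one if
        int n = 0;
        for (String s : items) {
            if (nonEmpty(s)) n++;
        }
        return n;
    }
}
// WMC(Validator) = 1 + 2 + 3 = 6.
```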
Quality Analysis: Identifying Magic Numbers and LCOM
The fourth pillar is the Quality Analyzer. It examines the classes for “Magic Numbers”, “Empty Catch Blocks” and “Hardcoded Credentials”, and counts the number of test methods. The metric LCOM (“Lack of Cohesion in Methods”) is also calculated here.
“Magic numbers” are hard-coded numeric literals. If such hard-coded values occur frequently, the effort required to change them one by one is unnecessarily high; in addition, a stand-alone number is often harder to understand than a named constant. It is better to store such numbers in constants that can be reused without difficulty and changed in one place. “Hardcoded credentials”, on the other hand, are passwords or access data written directly into the source code, which represent a security risk.
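A before-and-after sketch with invented values; the environment variable name is an assumption for illustration only:

```java
class Pricing {
    // Before: a magic number. What does 1.19 mean, and where else is it used?
    double priceWithTax(double net) {
        return net * 1.19;
    }

    // After: the value has a name, is documented once and changed in one place.
    static final double VAT_RATE = 0.19;  // German VAT at the time of writing

    double priceWithTaxNamed(double net) {
        return net * (1 + VAT_RATE);
    }

    // Hardcoded credentials are flagged for the same reason, plus security:
    // String password = "s3cret!";                     // bad: in the source
    String password = System.getenv("DB_PASSWORD");     // better: injected
}
```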
“Empty catch blocks” uncover places where an exception is caught but silently swallowed. A catch block is supposed to describe what should be done if an exception was thrown in the preceding code; if it is empty, the program simply carries on as if nothing had happened. An entire sequence may be skipped, and not even an error message indicates what is going on.

The LCOM metric measures how many of a class's methods actually use the same instance variables, and thus how closely the methods of a class are related or belong together. If the cohesion is low, i.e. if LCOM is high, you should split the class into new classes whose methods handle related tasks.
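Two made-up classes illustrate both findings, an empty catch block and low cohesion:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class ConfigReader {
    // Flagged: the exception is caught and silently swallowed.
    void readBad(Path path) {
        try {
            Files.readString(path);
        } catch (IOException e) {
            // empty: the failure disappears without a trace
        }
    }

    // Better: handle the failure, or at least make it visible.
    void readBetter(Path path) {
        try {
            Files.readString(path);
        } catch (IOException e) {
            System.err.println("Could not read config: " + e.getMessage());
        }
    }
}

// Low cohesion (high LCOM): the two method groups share no instance
// variables, a strong hint that this is really two classes.
class UserReportService {
    private String userName;      // used only by the user methods
    private String reportTitle;   // used only by the report methods

    String greetUser()        { return "Hello " + userName; }
    void renameUser(String n) { userName = n; }

    String formatTitle()      { return reportTitle.toUpperCase(); }
    void retitle(String t)    { reportTitle = t; }
}
```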
Spotting Antipatterns: Brainclass, Godclass, and More
With these four analyzers, we cover the most important metrics that indicate inconsistencies in the code. Our Antipattern Analyzer then uses these metrics to identify antipatterns, i.e. conspicuous and problematic classes. For example, there is the “Brainclass”, which combines too many functions, or the “Godclass”, a huge class that has become too large and confusing. “Butterfly”, “Breakable” and “Hub” are antipatterns that deal with dependencies between classes and expose concentrated, tangled nodes. In a figurative sense, these are serious “diseases” that require special care so that the ageing “body” runs smoothly again.
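How such an antipattern might be derived from the metrics above can be sketched with a simplified, hypothetical rule; the thresholds and their combination are illustrative only and not Bisquat2's actual detection logic:

```java
// Simplified, hypothetical rule in the spirit of a "Godclass" check:
// a class that is huge, complex and does many unrelated things at once.
record ClassMetrics(String name, int loc, int wmc, double lcom, int cbo) {}

class AntipatternRules {
    // Thresholds are invented; real tools calibrate them per project.
    static boolean looksLikeGodclass(ClassMetrics m) {
        return m.loc() > 1000     // oversized
            && m.wmc() > 50       // very complex overall
            && m.lcom() > 0.8     // methods barely belong together
            && m.cbo() > 10;      // entangled with many other classes
    }
}
```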
The necessary care for software is refactoring, i.e. restructuring, so that it becomes clear and easy to maintain again. And so the software heads, still agile, into the next round of the ageing process.
We use many visualizations to present all these issues clearly and simply, just as doctors use X-ray or ultrasound images to look inside the human body. Unlike doctors, however, we do not gain insights from images; we create images from the data already generated and the knowledge gained from it. With our numerous visual aids, data can be explained efficiently and clearly, and you can see immediately which classes have software engineering problems.
A selection of these graphics will be presented in this blog series.
Read part one of the blog series here.