FreeCAD and undefined behavior in C++ code: meditation for developers
Anastasiia Vorobeva
Posted on October 20, 2023
Examining the project code with the help of a static analyzer, sometimes we wonder how the error appeared and why no one noticed it. Would you like to see such an example? If so, welcome to read the article!
The unnatural
You're reading a classic article where I check an open-source project with the help of the PVS-Studio static analyzer. We regularly post such articles to popularize the static code analysis methodology. Moreover, we strive to improve the quality of checked projects since eventually they contain less bugs.
But I should warn you: this is an unnatural way of using the code analyzer! The analyzer is useful when used regularly and helps detect errors at the early stage. Fixing an error during the development is much cheaper than fixing it at the testing stage or when the user reports about it. This is described in detail in the article: "Introduce static analysis in the process, don't just search for bugs with it".
When writing an article about project check, the authors have another task. They should look through the list of warnings and select some fascinating bugs that show the benefits of static analysis. Such type of articles is useful for learning purposes but have nothing to do with enhancing the development process.
To make it clear, I'd make an analogy between analyzers and compilers.
The analyzer and the compiler warnings are basically the same thing. However, specialized analyzers perform more code checks and use different technologies to detect complex error patterns.
It seems unwise to develop a project with compiler warnings disabled. If you enable warnings every couple month only, this won't be effective:
- You will quickly drown in many warnings and get tired of reading them. As a result, you may mistakenly assume that some useful warnings are false positives.
- You have already fixed some errors when debugging. However, you could have fixed them at the compile time if you had enabled warnings.
- You won't understand what to do with some warnings. For example, they point to redundant code instead of errors. It's unwise to do a large-scale refactoring to fix all such errors. It's also wrong to leave everything as it is and review these code snippets during the next code analysis. So, it's better not to get to this case and gradually refactor the code instead.
The same thing applies to warnings of static code analyzers. It's better not to run such tools from time to time but embed them in the development process.
I guess our regular readers are surprised by so much theory — sorry for that. A new generation of programmers grows up, and they are ready to step on the same rake. Therefore, it's useful to retell such stories over and over again.
The FreeCAD project
FreeCAD is a general-purpose parametric 3D computer-aided design (CAD) modeler. This open-source software application is based on C, C++, and Python. PVS-Studio doesn't support Python at the moment, so I will only check the C and C++ parts. Here is the link to the project code.
It's not the first time we've written an article about the FreeCAD project. We wrote the previous one in 2015. It's been 8 years — quite a long time ago. Seems like a lot might have changed in the project by now, and there's no need to read the previous article, but just in case, here it is.
Let's take a look at the very special bug I mentioned earlier.
Typo and undefined behavior
QGVPage* QGIView::getQGVPage(TechDraw::DrawView* dView)
{
ViewProviderPage* vpp = getViewProviderPage(dView);
if (!vpp) {
return vpp->getQGVPage();
}
return nullptr;
}
The PVS-Studio warning: V522 [CWE-476, CERT-EXP34-C] Dereferencing of the null pointer 'vpp' might take place. QGIView.cpp 592
The author of the code made a typo while writing conditions:
- If the pointer is non-null, the function will do nothing and will return nullptr.
- If the pointer is null, dereferencing will occur, and after that, undefined behavior will appear.
- It's plain and clear. However, I've found this case intriguing for several reasons.
The first reason for meditation. In my previous article, I described an almost identical error! Here is this article: "Simple, yet easy-to-miss errors in code". And here is the code:
if (!SomeGlobalPointer)
{
delete[] SomeGlobalPointer;
}
Well, we've just wondered how developers make such errors, and here we go again. Starting to read the bug report for FreeCAD, I suddenly stumble across an almost identical bug. Guess I catch some kind of vibe :).
P.S. Actually, there's nothing unusual here. What impressed me is that this case shows the Baader–Meinhof phenomenon at work. It occurs when something you know or just learned suddenly seems to appear everywhere.
The second reason for meditation. I wonder how such an error even appeared! The thing is, there's an identical function nearby. And it contains a correct condition! See it for yourself:
QGVPage* QGIView::getQGVPage(TechDraw::DrawView* dView)
{
ViewProviderPage* vpp = getViewProviderPage(dView);
if (!vpp) {
return vpp->getQGVPage();
}
return nullptr;
}
QGSPage* QGIView::getQGSPage(TechDraw::DrawView* dView)
{
ViewProviderPage* vpp = getViewProviderPage(dView);
if (vpp) {
return vpp->getQGSPage();
}
return nullptr;
}
I don't know how that happened.
It's very likely that a developer copy-pastes the second function. But something doesn't add up... It's strange that a developer has fixed the error in the second function, but left it in the first one. I don't believe in that scenario.
After thinking about it, I came up with the following hypothesis. Maybe both functions had an error. Then, someone noticed that the getQGSPage function wasn't working correctly, and then fixed it.
However, this idea was not proven. First, the change history didn't contain any place where both functions were incorrect. The incorrect and correct function appear in the commit at the same time. A developer refactored the code to reduce the number of duplicate code.
Second, these functions are often used equally and occasionally combined. It's strange to notice a bug in the behavior of one function and not notice it in another.
Mystery!
The third reason for meditation. We don't know how an error like this will manifest itself. If the pointer is null, dereferencing it will cause undefined behavior. The condition helps the compiler in recognizing that the pointer is null. I wonder what it will do and what code it will generate.
I know it makes no sense to predict how the compiler will behave and how undefined behavior will manifest itself. Any assumptions will be wrong. By the way, here's a good article on this topic: "Falsehoods programmers believe about undefined behavior".
Nevertheless, I wondered if I could write similar code that would run as correct code due to undefined behavior. Or at least the code that won't crash. This could partly explain why the bug hasn't been noticed yet.
To be honest, I wasn't very good at it. The code was very unstable. The slightest change to the code greatly affected its behavior.
However, it may also happen that the code won't crush. Here's an example of such code.
If the code is compiled in unoptimized mode (-O0), it will be terminated:
Program terminated with signal: SIGSEGV
If we build it with optimization (-O2), it outputs garbage:
1033569176
But it's not crashing. Anyway, I enjoyed experimenting with it. If you wish, you can continue to experiment and share your observations.
Another typos
A classic typo: calling the hasText function for the second time
void SheetTableView::contextMenuEvent(QContextMenuEvent*)
{
const QMimeData* mimeData = QApplication::clipboard()->mimeData();
....
actionPaste->setEnabled(
mimeData && (mimeData->hasText() || mimeData->hasText()));
....
}
The PVS-Studio warning: V501 [CWE-570] There are identical sub-expressions 'mimeData->hasText()' to the left and to the right of the '||' operator. SheetTableView.cpp 1115
The code double-checks that the object can return plain text. I can't say for sure how to write the code correctly. I would guess that the hasHtml function should be called instead of one of the hasText functions.
A typo in the last line
void Document::handleChildren3D(ViewProvider* viewProvider, bool deleting)
{
....
std::vector<App::DocumentObject*> children = viewProvider->claimChildren3D();
SoGroup* childGroup = viewProvider->getChildRoot();
SoGroup* frontGroup = viewProvider->getFrontRoot();
SoGroup* backGroup = viewProvider->getFrontRoot();
if (deleting ||
childGroup->getNumChildren() != static_cast<int>(children.size())) {
....
}
The analyzer indicates an anomaly in the code with two warnings at the same time:
- V525 [CWE-682] The code contains the collection of similar blocks. Check items 'getChildRoot', 'getFrontRoot', 'getFrontRoot' in lines 2404, 2405, 2406. Document.cpp 2404
- V656 [CWE-665] Variables 'frontGroup', 'backGroup' are initialized through the call to the same function. It's probably an error or un-optimized code. Consider inspecting the 'viewProvider->getFrontRoot()' expression. Check lines: 2405, 2406. Document.cpp 2406
I think we are facing a demonstration of the last line effect: the error is most likely to be made at the end of code lines/blocks of the same type.
The code contains three lines that are very similar to each other. Most likely, a developer copy-pasted this code fragment and accidentally duplicated the second line:
SoGroup* frontGroup = viewProvider->getFrontRoot();
The developers need to replace "front" with "back" in the copied line. There should have been two such replacements in the line, but only one was made.
SoGroup* backGroup = viewProvider->getFrontRoot(); // incorrect
SoGroup* backGroup = viewProvider->getBackRoot(); // correct
These typos are hard to find during the code reviews, and static analysis can become a very good assistant.
Freddy's time
What's Freddy got to do with it? This is a reference to the article: "Zero, one, two, Freddy's coming for you". Typos are attracted to code when variable names contain 0, 1, or 2. That's exactly such a case.
TopoDS_Edge GeometryUtils::asCircle(TopoDS_Edge occEdge, bool& arc) { .... }
bool GeometryMatcher::compareBSplines(TopoDS_Edge &edge1, TopoDS_Edge &edge2)
{
....
BRepAdaptor_Curve adapt1(edge1);
BRepAdaptor_Curve adapt2(edge2);
bool isArc1(false);
bool isArc2(false);
TopoDS_Edge circleEdge1;
TopoDS_Edge circleEdge2;
try {
circleEdge1 = GeometryUtils::asCircle(edge1, isArc1);
circleEdge2 = GeometryUtils::asCircle(edge2, isArc1);
}
catch (Base::RuntimeError&) {
Base::Console().Error(....);
return false;
}
if (!isArc1 && !isArc2) {
return compareCircles(circleEdge1, circleEdge2);
}
if (isArc1 && isArc2) {
return compareCircleArcs(circleEdge1, circleEdge2);
}
return false;
}
The PVS-Studio warning: V537 [CWE-682] Consider reviewing the correctness of 'isArc1' item's usage. GeometryMatcher.cpp 242
It's a large piece of code. But that's OK, we'll discuss it bit by bit. Let's start with the asCircle function.
TopoDS_Edge GeometryUtils::asCircle(TopoDS_Edge occEdge, bool& arc) { .... }
Note that the function receives the second argument by reference to return the edge type information through it.
Now look at the following code:
bool isArc1(false);
bool isArc2(false);
....
circleEdge1 = GeometryUtils::asCircle(edge1, isArc1);
circleEdge2 = GeometryUtils::asCircle(edge2, isArc1);
A call to the asCircle function should have initialized 4 variables:
- circleEdge1
- isArc1
- circleEdge2
- isArc2.
A typo causes reusing of the isArc1 variable in the second line. As a result, the isArc1 variable is initialized with the wrong value, and the isArc2 variable remains always equal to false.
So, further checks are pointless (they don't work as intended):
if (!isArc1 && !isArc2) {
....
if (isArc1 && isArc2) {
The full article is here
Posted on October 20, 2023
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.