Software Engineering as Engineering (a series)

djmitche

Posted on September 17, 2021

Software engineering is an engineering discipline, having joined the ranks of civil engineering, aeronautical engineering, and nuclear engineering sometime in the 1970s. Yet in many ways it stands apart from those fields, even fifty years later.

This will be the first post in a series looking at how software engineering differs and what lessons we can draw from those differences.

I'm not sure I'm a "software engineer", although people who pay me have given me that title. And I am sure I'm not any other kind of engineer. So, I will probably get a great many things wrong in this series, and I look forward to learning from your comments about those mistakes.

Planning for Maintenance

Let's start by thinking about maintenance. For a bridge or a chemical plant, maintenance involves ongoing monitoring, periodic inspections, and repairs -- both routine and to fix a problem. Long-lived structures are also modified during their lifetime to suit changing requirements; for example, a chemical synthesis line might be retrofitted to use a different, more efficient process.

In most fields, regular maintenance is a part of the design: grease these points daily, and inspect and replace these belts annually. Affordances for maintenance are built into the work: inspection ports, sample collection points, and replaceable parts.

This planning is generally done with an eye toward someone else doing the maintenance: a car mechanic, a facilities technician, or a plant operator. So, designs are made for safe, methodical maintenance without a sophisticated understanding of the technology. A power plant, for example, has numerous platforms, shields, interlocks, and safety processes to keep operators safe.

In some cases, the maintenance affordances are not considered worthwhile. This is how we end up with throw-away plastic doo-dads (the term "consumable" is a way of greenwashing this trend), or white goods that need to be replaced every 5 years. Maybe those are truly not worthwhile from a technical perspective, but I fear in most cases they're not worthwhile from a business perspective -- particularly since environmental externalities are not reflected in business priorities.

Let's get back on track.

Software Maintenance

The situation is similar for software. Organizations monitor software performance (using tools like Datadog, the folks I mentioned who pay me and call me an engineer). Doing so comprehensively is a recent development. I'm old enough to remember when a "web server" occupied 2U in your rack and, if you wanted to know what it was doing, you watched its logfiles.

Computing's scale has grown since then, and we must think carefully about what to measure and how to gather and statistically analyze those measurements to produce meaningful information. We're re-solving some of the problems that other disciplines solved long ago, including human-centered issues like alert fatigue.
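To make the alert-fatigue point concrete, here is a minimal sketch of one common idea: only alert when a measurement stays over its threshold for several samples in a row, so that a single spike does not page anyone. The function name `should_alert`, its parameters, and the numbers are all hypothetical, chosen just for illustration; real monitoring tools offer much richer rules.

```python
def should_alert(samples, threshold, sustained=3):
    """Return True only if the last `sustained` samples all exceed `threshold`.

    `samples` is a list of measurements, oldest first (for example, request
    latency in milliseconds). All names here are illustrative assumptions.
    """
    if sustained < 1:
        raise ValueError("sustained must be at least 1")
    if len(samples) < sustained:
        # Not enough data yet to call the problem "sustained".
        return False
    return all(value > threshold for value in samples[-sustained:])
```

With a threshold of 500, a lone spike such as `[100, 900, 120]` stays quiet, while `[600, 700, 800]` raises the alert.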

When something does go wrong in software, what happens?

More often than not, the software is old and brittle -- "unmaintained". Maybe an out-of-date library has a security vulnerability or can no longer communicate with an external service, and suddenly years of not upgrading dependencies make it nearly impossible to upgrade. We should have been performing regular maintenance.

But maintained software grows and changes, too. Priorities change, features are added and removed, and programmers come and go.

A well-engineered system plans ahead for all of this. Carefully designed abstraction boundaries allow parts of the system to be changed independently. This allows engineers working on a specific component to focus on that component, ignoring the rest of the system -- a substantial reduction in cognitive load.
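As a rough sketch of what such a boundary can look like, the Python below defines a small interface and one implementation of it. The names `Storage`, `InMemoryStorage`, and `record_visit` are hypothetical, invented for this example. The point is only that `record_visit` depends on the interface, so whoever maintains the storage side can change it without touching the caller.

```python
from typing import Protocol


class Storage(Protocol):
    """The boundary: callers depend only on these two methods."""

    def get(self, key: str) -> int: ...

    def put(self, key: str, value: int) -> None: ...


class InMemoryStorage:
    """One implementation; a database-backed one could replace it."""

    def __init__(self) -> None:
        self._data: dict[str, int] = {}

    def get(self, key: str) -> int:
        # Missing keys count as zero visits.
        return self._data.get(key, 0)

    def put(self, key: str, value: int) -> None:
        self._data[key] = value


def record_visit(storage: Storage, page: str) -> int:
    """Increment and return the visit count for `page`."""
    count = storage.get(page) + 1
    storage.put(page, count)
    return count
```

An engineer working on `record_visit` only needs to understand `get` and `put`, not how or where the counts are stored.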

High-quality developer documentation acknowledges that "someone else" will probably be modifying a piece of code, and that person will not have the same knowledge as the person initially writing the code. This is the idea behind "writing code for the reader" and not merely writing code that gets the job done.

Just as industrial systems are designed to be maintained safely, productive software environments spend a substantial amount of engineering effort on safety mechanisms. At their most basic, continuous integration systems tell engineers when they've broken a build or caused a test to fail. But a truly well-designed CI system also enforces the less obvious invariants of a system, for example by checking that all configuration parameters are documented. These provide "guiderails" for maintainers that make the correct path the easiest path.
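A check like the configuration one can be quite small. The sketch below assumes the configuration defaults are available as a dictionary and the documentation as a single string; the function name `undocumented_parameters` and the parameter names in the usage note are hypothetical. A CI job could run something like this and fail the build whenever the returned list is non-empty.

```python
import re


def undocumented_parameters(defaults, docs_text):
    """Return the config keys that never appear as a whole word in the docs.

    `defaults` maps parameter names to default values; `docs_text` is the
    documentation as one string. Both shapes are assumptions of this sketch.
    """
    missing = []
    for name in sorted(defaults):
        # \b keeps "retry_limit" from matching inside "retry_limit_max".
        if not re.search(rf"\b{re.escape(name)}\b", docs_text):
            missing.append(name)
    return missing
```

For example, with defaults `{"timeout_seconds": 30, "retry_limit": 3}` and docs that only mention `timeout_seconds`, the function returns `["retry_limit"]`, which tells the maintainer exactly what to fix.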

All of this takes extra effort. In non-software fields, the effort is often considered a cost of doing business, or in the case of "consumables" at least considered carefully in the product design. The software industry tends to take a different approach, captured in phrases like "move fast and break things" or "minimum viable product". This approach sometimes has value -- a mobile app for COVID-19 tracking, for example, needs to be developed quickly and won't be maintained for very long (we hope).

But these phrases are often used to excuse massive under-investment in software maintenance. How many projects have released an "MVP", then scaled back development resources before any of the deferred work (tests, documentation, CI, refactoring) could be done? And several years later, after massive efforts at maintenance, how well do those projects fare?

This is where I see a substantial break with other engineering disciplines. There's no such thing as an "MVP" hydro-power dam or "moving fast and breaking things" in the nuclear industry. That's because the life-safety consequences of software have historically been smaller than those for dams or reactors -- but this is increasingly not the case.

I am glad to see recent developments in software that focus on powerful tools for safety and maintainability and that have minimal costs. The rise of testing and CI as standard practice over the last decade has certainly made software more robust and reliable. And new tools like Rust are providing safety guarantees that even the best engineer, rushed to get a product out on the launch date, can't match.

Next Time

I hope this was thought-provoking. I'd love to hear your ideas for topics to explore. I've been thinking a bit about tooling marks in furniture building ("furniture engineering"?) and what the parallels might be in software. Maybe that's the next post!
