The Software Engineer’s Never-Ending Nightmare
By Scott Hamilton
Senior Expert in Emerging Technology
I have been in the software and hardware engineering business for more than four decades, starting as a hobby when I was eight years old. I first learned programming from a book on FORTRAN IV and later wrote many games and other software for the Commodore 64. I can tell you when the nightmare began for me. In the early years of computing, every computer was slightly different, because several companies were each building computers and processors in their own way, and there were only a handful of common languages. There were FORTRAN, COBOL and each vendor’s assembly language.
To understand the nightmare, you need to understand a little about programming languages. There are three levels of programming languages, at least in my opinion (officially there are only two). The two official levels are often called low-level and high-level languages. Low-level languages, like a vendor’s assembly language, map directly onto the instruction set of the processor and work directly with the hardware, utilizing all the individual components. High-level languages create a more human-readable abstraction of the low-level languages. FORTRAN and COBOL are both considered high-level languages. The easy way to tell the difference is that high-level languages require a compiler or interpreter to translate their human-readable commands into instructions the processor understands, while assembly language needs only an assembler that converts each mnemonic almost one-for-one into a machine instruction. I like to add a third level to programming languages, and for lack of an established term I will call it extreme-level languages. These extreme-level languages are the primary languages in use by software developers today. The main things that set them apart from other high-level languages are object-oriented programming and extensive pre-developed libraries.
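To see the gap between the levels for yourself, here is a minimal Python sketch. It uses the standard dis module to list the lower-level instructions Python generates for one readable line of code. Note the assumption: these are instructions for Python's own virtual machine, not a processor's assembly language, so treat this as an analogy for what a compiler does rather than the real thing. The function names add and lower_level_steps are made up for this illustration.

```python
import dis


def add(a, b):
    # One readable, high-level line of code.
    return a + b


def lower_level_steps(func):
    """Return the names of the bytecode instructions Python generated for func."""
    return [instruction.opname for instruction in dis.get_instructions(func)]


if __name__ == "__main__":
    # Print one instruction name per line. The exact names vary between
    # Python versions, so no particular output is promised here.
    for step in lower_level_steps(add):
        print(step)
```

Even this one-line function expands into several smaller steps, which is the same kind of translation a FORTRAN or COBOL compiler performs on its way down to the hardware.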
FORTRAN IV, an early version of one of the first high-level languages, was completely described, with all the included functions and libraries, in a 158-page manual. Python, the most popular programming language today, has a core reference manual of nearly the same size at 170 pages, but this covers only the basic functions and does not include any of the additional libraries. Once you begin adding the manuals for the full set of supported libraries, the documentation grows to the point that it is only available in electronic format, as it would take several books to cover the entire set. Microsoft Visual C++, a popular implementation of the C++ language, offers a printed reference guide that consists of a five-volume set of more than 3,000 pages in total, which also does not include additional supported programming libraries.
This is the beginning of the software engineer’s nightmare. I like to call it software bloat. When a reference manual for a programming language is longer than the complete source code that ran the Apollo space program, you might begin to see the problem. Computers have grown exponentially more powerful since FORTRAN IV, but most of the processing power and newly available memory space is being used to run massive sets of programming libraries. You also run into issues in deciding which programming language best suits your project, what extra libraries to utilize, and how to ensure continued support in the future. The last one does not seem so important until you realize that much of the banking industry still relies on databases and software written for mainframes decades ago in COBOL, a language that dates back to 1959. There are over 75 programming languages available to choose from and hundreds of thousands of libraries. It is easy to get lost in the weeds and try to make your software do everything.
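You can get a feel for how much code a single library drags along with a minimal Python sketch like the one below. It starts a fresh Python interpreter, runs one import statement, and counts how many modules end up loaded. The helper name modules_loaded is hypothetical, and the choice of the standard json library is only an example. The counts differ from one Python version and installation to the next, so I make no promise about specific numbers.

```python
import subprocess
import sys


def modules_loaded(import_statement=""):
    """Count the modules loaded in a fresh interpreter after running import_statement."""
    # Run the statement in a separate process so that modules already
    # loaded by this program do not hide what the import pulls in.
    code = f"{import_statement}\nimport sys\nprint(len(sys.modules))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    baseline = modules_loaded()
    with_json = modules_loaded("import json")
    print(f"Modules loaded at startup: {baseline}")
    print(f"Modules loaded after 'import json': {with_json}")
    print(f"Extra modules pulled in: {with_json - baseline}")
```

One line of your code quietly becomes several modules of someone else's, and larger third-party libraries pull in far more.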
A great example is modern speech recognition. Python has a speech recognition library that lets you add speech recognition to your software, but it brings along the ability to recognize 15 languages, even if your project is meant only for English-speaking countries. All these languages take memory and processing power from the computer and make your little program much larger than necessary, about 15 times larger. This software bloat creates the nightmare, because the engineer is relying not only on code he or she created, but also on thousands of lines of code written by someone else in the form of a library, and library developers sometimes abandon a project. Now you have thousands of lines of code in your project that you must either learn well enough to debug yourself or abandon for a new support library. Older engineers truly miss the days when our biggest problem was fitting our code and data into the now tiny four thousand bytes of memory. The early computers could not have contained even the base libraries of C++, and the modern software engineer keeps expecting the hardware engineers to provide more and more memory space to work in, with little regard for the size of the software project. Video games in 1995 required 64 thousand bytes of memory and were usually delivered on 640 thousand byte disks. The latest video games require Blu-ray discs for delivery, at up to 50 billion bytes each, and some require more than one, which is a great example of software bloat.
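To put rough numbers on that growth, here is a small Python sketch of the arithmetic. The 64 thousand and 640 thousand byte figures come from the paragraph above. The disc capacity is an assumption: a dual-layer Blu-ray disc at its nominal 50 gigabytes, or 50 billion bytes. The name growth_factor is made up for this illustration.

```python
# Sizes in bytes. The first two are the figures quoted in the column.
GAME_MEMORY_THEN = 64_000
DISK_THEN = 640_000
# Assumption: a dual-layer Blu-ray disc at its nominal 50 gigabytes.
BLU_RAY_DISC = 50_000_000_000


def growth_factor(new_size, old_size):
    """Return how many times larger new_size is than old_size."""
    return new_size / old_size


if __name__ == "__main__":
    disks = growth_factor(BLU_RAY_DISC, DISK_THEN)
    memory = growth_factor(BLU_RAY_DISC, GAME_MEMORY_THEN)
    print(f"One disc holds as much as {disks:,.0f} of the old disks.")
    print(f"One disc is {memory:,.0f} times the old memory requirement.")
```

Under those assumptions, a single disc holds what once would have filled 78,125 of the old disks.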
Until next week, stay safe and learn something new.
Scott Hamilton is a Senior Expert in Emerging Technologies at ATOS and can be reached with questions and comments via email to sh*******@te**********.org or through his website at https://www.techshepherd.org.