Tuesday, January 31, 2017

The Nearly Lost Art of Assembly Language

Looking around at all the modern programming languages and methods, one would think we know more about writing good code now than ever before. One would think that we have risen to the point of writing code in ways that preclude the necessity of even understanding how the CPU operates and the operating system functions.

Sadly, we are at that point to a large extent. Ask newly graduated programmers (I dare not list the degrees specifically, given the plethora of degrees that produce programmers) how their lines of code actually work under the covers and you will get stammers or blank stares. Ask them how function calls are handled in context switches, and remember that even if you are an object-oriented enthusiast, methods are still functions. You may get a rudimentary explanation, but ask them for specific CPU register names and you've exceeded their wealth of knowledge.

Once upon a time, universities taught programmers how to write assembly language code. That is no longer the case. You will find it in master's programs as an elective, maybe even as a short segment of a required class, but you don't find undergrads being taught assembly language.

In the past, one could argue that this was because there were far too many processor architectures with vastly different instruction sets. That meant there was no predominant assembly language, with the exception of IBM mainframe assembly language, courtesy of the System/370, which a lot of guys my age had to learn.

Some of us also picked up VAX assembly language and maybe even SPARC. Some of us were fortunate enough to have learned assembly language programming for the x86 architecture back in the day.

But the diversity excuse for not teaching assembly language doesn't float anymore. For all practical and commercial purposes, there are two assembly languages that could fill the role: x86 and ARM.

One could add Arduino to that list now, as its popularity among hobbyists is huge.

With the small device world embracing ARM, its assembly language would be a strong contender. With the thriving popularity of Arduino, that is also a strong contender.

However, the reality is that every programmer out there has access to some kind of x86-based PC or laptop. They also have a number of excellent hypervisors that can provide x86 virtual machines in which to safely hammer away in assembly language, even in kernel space, without the worry of damaging their actual machine.

So that raises the question: why isn't it being taught? With the ubiquitous availability of computers and the gcc suite, which assembles x86 assembly language, there really is no barrier to learning assembly language or teaching it in colleges and universities.

There is something satisfying, even elegant, in writing assembly language code. Seeing the machine code representation output from the assembler is like seeing the actual Rosetta Stone (not the software, the actual artifact). You see this indecipherable sequence of numbers (in hex) and its simplest explanation in a readable format.

I like that.

So the next time you are crafting your amazing class structure or using some new programming pattern for handling linked lists or some new multi-way linked trees, take a moment to pause and remember there is machine code under it. And think about that machine code as if it were on a Rosetta Stone that you can read.

Also remember that every line of high-level code, from the "p = p->next" to the "object.next" iterator, can be written in assembly language. Yes, it would take longer. But it not only can be done, it has been done.
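
To make that concrete, here is a minimal sketch in C of the kind of list walk I have in mind. The node type and the list_sum function are made up for this illustration, and the instruction named in the comment is only what a typical optimizing compiler tends to emit on 64-bit x86. The exact register and offset depend on the compiler and the ABI, so treat it as a sketch and not a guaranteed listing.

```c
#include <stddef.h>

/* Hypothetical node type, invented for this illustration. */
struct node {
    int value;
    struct node *next;
};

/* Sum every value in the list by following the next pointers. */
int list_sum(const struct node *p)
{
    int total = 0;
    while (p != NULL) {
        total += p->value;
        /* Assumption: on x86-64 with a 4-byte int and 8-byte pointers,
           `next` sits at offset 8. If p lives in rax, this one line
           typically becomes a single load, roughly (Intel syntax):
               mov rax, [rax+8]                                      */
        p = p->next;
    }
    return total;
}
```

If you want to see what your own compiler really produces, compile a file like this with "gcc -S" and read the generated assembly next to the source. That is the Rosetta Stone moment.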