Assembly Language
Assembly language is a low-level programming language, meaning it works very close to the hardware. Specifically, it is the language spoken by the brain of your computer, which is the processor or CPU. Programs written in high-level languages like Java can run on different operating systems such as Windows or macOS. Assembly language, by contrast, is tied to the hardware and is specific to each processor architecture.
Hierarchy of Programming Languages
High-level programming languages are designed for the programmer: the code is human-readable and has a much easier syntax. The most common ones are Python, C++, C#, Java, and JavaScript. Computers can only understand 0’s and 1’s, so a high-level language has to go through an interpreter or compiler that translates it into machine code.
Assembly language is the closest language to machine code. It uses names and codes to simplify the 0’s and 1’s. Assembly language communicates with the computer by being translated into machine code. A program called an assembler converts the assembly code into machine code.
Lastly, there is machine language, the language that the computer understands. It is made up of 0’s and 1’s and is extremely hard for people to read. Debugging at this level would take forever, since every word is a number. Remembering sets of numbers instead of words like “int” for integer or “var” for variable is also far too difficult.
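To make the translation step concrete, here is a minimal Python sketch of what an assembler does for a single line. The bit patterns for the opcodes and registers are invented for illustration and do not correspond to any real processor, and `assemble_line` is a hypothetical helper name.

```python
# Hypothetical 4-bit encodings, made up for this example only.
OPCODES = {"MOVE": "0001", "ADD": "0010", "SUB": "0011", "JUMP": "0100"}
REGISTERS = {"RA": "0000", "RB": "0001", "RC": "0010", "RD": "0011"}


def assemble_line(line):
    """Translate a line such as 'ADD RC, RA' into a string of 0s and 1s."""
    mnemonic, operands = line.split(maxsplit=1)
    bits = [OPCODES[mnemonic]]              # look up the opcode name
    for operand in operands.split(","):
        bits.append(REGISTERS[operand.strip()])  # look up each register name
    return " ".join(bits)


print(assemble_line("ADD RC, RA"))  # 0010 0010 0000
```

The names on the left are what the programmer writes; the 0’s and 1’s on the right are what the processor would actually receive.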
Assemblers
The previous section explained that an assembler translates assembly language into machine code.
There are two types of assemblers:
Single-pass assembler:
A single-pass assembler, also called a one-pass assembler, scans the source program once and translates it as it goes. For each statement it separates the label, mnemonic, and operand fields. It validates the mnemonic by looking it up in the mnemonic code table. If the label field contains a symbol, the assembler enters that symbol into the symbol table together with the address of the next available machine word.
Multi-pass assembler:
A multi-pass assembler goes through the assembly source several times before it produces the object code. The last pass is called the synthesis pass, and the assembler needs some form of intermediate code to carry information from one pass to the next. It is slower than a single-pass assembler, but because it reads the program more than once it can repeat certain steps, for example to fill in the address of a label that is used before it is defined.
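As a rough illustration of why more than one pass can help, the Python sketch below assembles a tiny program in two passes. The `label:` syntax and the function name `two_pass_assemble` are assumptions made for this example, not part of any particular assembler. The first pass only fills the symbol table, and the second pass uses it to replace label names with addresses.

```python
def two_pass_assemble(lines):
    """Return (symbol_table, resolved_instructions) for a made-up syntax."""
    symbols = {}
    instructions = []

    # Pass 1: record the address of every label, keep the real instructions.
    for line in lines:
        line = line.strip()
        if line.endswith(":"):
            symbols[line[:-1]] = len(instructions)
        elif line:
            instructions.append(line)

    # Pass 2: replace label names in the operands with their addresses.
    resolved = []
    for text in instructions:
        mnemonic, _, rest = text.partition(" ")
        operands = [op.strip() for op in rest.split(",")] if rest else []
        operands = [str(symbols.get(op, op)) for op in operands]
        resolved.append((mnemonic, operands))
    return symbols, resolved


source = [
    "JUMP NZ, end",   # forward reference: 'end' is only defined further down
    "loop:",
    "ADD RC, RA",
    "SUB RB, 1",
    "JUMP NZ, loop",
    "end:",
    "MOVE [5], RC",
]
symbols, program = two_pass_assemble(source)
print(symbols)     # {'loop': 1, 'end': 4}
print(program[0])  # ('JUMP', ['NZ', '4'])
```

A single-pass assembler sees `end` in the first line before it knows where `end` is, so it has to leave a gap and patch it later. With two passes the address is already in the symbol table when the instruction is translated.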
The History
The person considered to be the creator of the first assembly language is David J. Wheeler. He was a computer scientist and a professor at the University of Cambridge. He worked on the team that developed a machine called the Electronic Delay Storage Automatic Computer (EDSAC). This machine became operational in 1949 and was used to carry out calculations.
Explanation
To understand assembly language, it is best to know a computer’s structure:
Each computer contains a Central Processing Unit (CPU) and Random Access Memory (RAM). The two work together like this: the CPU puts out an address, and the data at that location in memory is then either read by the CPU or overwritten with a new value.
RAM is basically your computer’s short-term memory.
Let’s say you wanted to find a book. RAM is a drawer right next to you. It would take you 10 seconds to find that book in that drawer. Pretty fast, right? Now, let’s take that drawer away. Your hard drive is a file cabinet. If you wanted to find a book in that file cabinet, it would take you 11 days.
So now you see how important RAM is. The CPU handles data pretty efficiently, but without RAM your computer would be really slow because your hard drive cannot keep up.
The CPU controls data and has several internal parts. The first thing we’re going to talk about in this drawing is registers. The CPU has its own small internal memory called registers, and there are only a few of them. You can see that this CPU has only 4 registers: RA, RB, RC, and RD. Registers store numerical values, memory addresses, or commands.
Then we have the Arithmetic Logic Unit (ALU). The ALU is the core of the CPU. It handles addition, subtraction, and logic operations like AND and OR.
Next is the Status Flag register. The Status Flag register is a collection of bits that tells us the state of the CPU and ALU, for example whether the last result was zero.
To the right is the Program Counter. The purpose of the counter is to store the address of the instruction our program is currently executing. As the program is executed, the counter goes up.
Lastly, there is the input and output unit, which gets data into and out of the CPU.
These are just the basics of a CPU. Extensions can be added depending on the goal of the processor, such as more registers or a more complex ALU.
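The parts described above can be pictured as a small data structure. The following Python sketch is only a model for illustration: the class name `CPU` and its field names are assumptions, and the single `zero_flag` stands in for the whole Status Flag register.

```python
from dataclasses import dataclass, field


@dataclass
class CPU:
    # The four registers from the description: RA, RB, RC, and RD.
    registers: dict = field(
        default_factory=lambda: {"RA": 0, "RB": 0, "RC": 0, "RD": 0}
    )
    # One status flag for simplicity: was the last ALU result zero?
    zero_flag: bool = False
    # Address of the instruction currently being executed.
    program_counter: int = 0


cpu = CPU()
cpu.registers["RA"] = 10   # a register holds a numerical value
cpu.program_counter += 1   # the counter goes up as the program runs
print(cpu)
```

A real processor has more state than this, but these three pieces are enough to follow the example program later in this article.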
In programming, there is a block of instructions called a function. A function creates an output from its input. Think about it like this: a cow eats grass (input) and then digests it (function), creating manure (output). In assembly language, the operation to perform is called the OPCODE, and the arguments that follow it (two in our examples) are called OPERANDs.
OPERANDs can be registers from the CPU picture like RA and RB, memory locations, or numeric values.
Here we have 4 instructions for the CPU. The first is the MOVE command, which takes 2 OPERANDs called destination and source. The purpose of the MOVE command is to move information from the source to the destination. The second and third instructions are the OPCODEs ADD and SUB. Their first OPERAND is a register and their second OPERAND is a value or a register. The fourth instruction is JUMP. Its first OPERAND is the condition under which JUMP executes, and its second OPERAND is the location JUMP will go to.
Currently, the processor can only MOVE, ADD/SUB, and JUMP.
Now, what if we try a problem like multiplying 3 x 10? Since our processor can only add and subtract, we would have to compute 10 + 10 + 10 = 30.
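The same idea, written as a short Python sketch before we turn to assembly. The function name `multiply_by_adding` is made up for this example, and it assumes `times` is a non-negative whole number.

```python
def multiply_by_adding(value, times):
    """Multiply using only addition and subtraction, as our processor must."""
    total = 0
    while times != 0:           # keep going until the counter reaches zero
        total = total + value   # add the value once more
        times = times - 1       # one fewer addition left to do
    return total


print(multiply_by_adding(10, 3))  # 30
```

The loop uses one running total and one counter, which is the same structure the assembly program builds out of ADD, SUB, and JUMP.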
This is an example program written in assembly language. Each row has an OPCODE such as MOVE with two OPERANDs to its right. In row 10 the first OPERAND is RA, or register A, and the second OPERAND is the source. [3] means memory address 3. So row 10 is saying: move the data in memory address 3 into register A. In row 12 the second OPERAND is a zero with no square brackets. That means: move the constant value 0 into register C.
In this example, assume that the numbers to multiply, 10 and 3, are already in the RAM.
Now going back to the assembly code:
Let’s try to work this out.
Row 10 is saying move the data from memory address 3 to register A, so now RA = 10. If you look at the picture of the RAM with the values 10 and 3, you can see that address 3 holds the number 10 and address 4 holds 3. The next row reads from the RAM at address 4, so RB = 3. In row 12 there are no square brackets in the second OPERAND, so the processor does not need to look into memory, and RC = 0.
The ADD OPCODE takes RA and adds it to whatever is in RC. The layout is RC = RC + RA = 0 + 10 = 10. SUB subtracts the constant value 1 from RB without looking into memory. So RB = RB - 1 = 3 - 1 = 2.
This is what we have so far from rows 10–14.
Going back:
The JUMP instruction in row 15 is next. Before we get there, though, we need to learn a bit more about the ALU, or Arithmetic Logic Unit.
The ALU takes 2 values, which can be registers or constant values, and performs an operation on them such as addition or subtraction. Once the ALU has done the calculation, the result updates the status flags. For example, if the result is zero the zero flag is set, and otherwise it is cleared.
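A minimal Python sketch of this behaviour is shown below. The function name `alu` and the choice to return the zero flag alongside the result are assumptions made for illustration. A real ALU supports more operations and sets several flags.

```python
def alu(operation, a, b):
    """Perform one operation and report whether the result was zero."""
    if operation == "ADD":
        result = a + b
    elif operation == "SUB":
        result = a - b
    else:
        raise ValueError("unsupported operation: " + operation)
    zero_flag = (result == 0)   # the status flag the JUMP condition looks at
    return result, zero_flag


print(alu("SUB", 3, 1))  # (2, False)
print(alu("SUB", 1, 1))  # (0, True)
```

A condition such as NZ (not zero) is simply a check that the zero flag is not set.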
Look back at the JUMP instruction. It checks the condition NZ, or not zero. The result of the previous calculation was 2, so the condition is true and the program jumps back to row 13.
Now the program is back at row 13 and performs the instruction again. RC = RC + RA = 10 + 10 = 20.
The next instruction is SUB, so RB = 2 - 1 = 1.
The condition stays true since the result is 1, not zero, so the program loops back.
RC = 20 + 10 = 30
RB = 1 - 1 = 0
This is how a loop is written in assembly language. Higher-level programming languages offer other variations, such as for and while loops.
Now the JUMP instruction does not jump because the last result is zero. So the program moves on to the final step.
Move whatever is in register C to memory address 5. Memory address 5 now holds 30, the result of 10 x 3.
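To check the walkthrough, the whole program can be run through a small simulator. The Python sketch below is an illustration under several assumptions: the tuple encoding of the instructions, the function name `run`, and the row number 16 for the final MOVE are all made up here, and only ADD and SUB are assumed to update the zero flag.

```python
def run(program, memory, start=10):
    """Execute the toy instruction set: MOVE, ADD, SUB, and JUMP."""
    registers = {"RA": 0, "RB": 0, "RC": 0, "RD": 0}
    zero_flag = False
    pc = start  # program counter: the row being executed

    def read(operand):
        if isinstance(operand, int):          # constant value, e.g. 0
            return operand
        if operand.startswith("["):           # memory address, e.g. [3]
            return memory[int(operand[1:-1])]
        return registers[operand]             # register, e.g. RA

    while pc in program:
        opcode, first, second = program[pc]
        pc += 1
        if opcode == "MOVE":
            value = read(second)
            if first.startswith("["):
                memory[int(first[1:-1])] = value
            else:
                registers[first] = value
        elif opcode in ("ADD", "SUB"):
            value = read(second)
            if opcode == "ADD":
                result = registers[first] + value
            else:
                result = registers[first] - value
            registers[first] = result
            zero_flag = (result == 0)         # ALU updates the status flag
        elif opcode == "JUMP":
            if first == "NZ" and not zero_flag:
                pc = second                   # condition true: go to that row
        else:
            raise ValueError("unknown opcode: " + opcode)
    return registers, memory


program = {
    10: ("MOVE", "RA", "[3]"),
    11: ("MOVE", "RB", "[4]"),
    12: ("MOVE", "RC", 0),
    13: ("ADD", "RC", "RA"),
    14: ("SUB", "RB", 1),
    15: ("JUMP", "NZ", 13),
    16: ("MOVE", "[5]", "RC"),   # row number assumed for the final step
}
memory = {3: 10, 4: 3, 5: 0}
registers, memory = run(program, memory)
print(registers["RC"], memory[5])  # 30 30
```

The loop in rows 13 to 15 runs three times, matching the hand calculation: RC goes 10, 20, 30 while RB counts down 2, 1, 0.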
Why Learn Assembly?
Knowing your computer down to the byte level can be useful in some jobs. In cybersecurity, if a hacker attacks your computer at the lowest level, a lot of damage can be done. For example, the attacker can install malware on the machine and steal information.
Assembly can teach you how to write smaller and more efficient programs. Even though most people nowadays code in a higher-level language, knowing assembly can reveal solutions you might never think of otherwise.
Next, code efficiency is almost always important in an embedded system. Generally, modern compilers do a pretty fine job of optimizing code. It is still important, though, to be able to understand what the compiler has done. Otherwise, debugging can be full of uncertainty.
Advantages
It allows complex tasks to be carried out in a simpler way.
It is memory efficient, as it needs less memory.
It is fast, as its execution time is lower.
It is largely hardware-oriented.
It takes fewer instructions to get the result.
It is used for critical jobs.
Keeping track of memory locations is not needed.
It is suited to low-level embedded systems.
Disadvantages
It takes a lot of time and effort to write code for something a high-level language can express in a few lines.
It is really confusing and difficult to grasp.
It is tough to remember the syntax.
There is a lack of software portability across various device architectures.
Long programs written in assembly language require a larger machine or more memory to run.