What is a computer

This page introduces the computer from a programmer's perspective, describing how its components work without going into insignificant details. Readers who need more depth should look for more detailed information elsewhere.

Bits and bytes

The smallest unit of information is called a bit. Bits may be either set (1) or cleared (0).

A byte is a sequence of bits of a fixed width (length). The most common width of a byte is 8 bits. Because a byte might as well be defined as having 5 or 10 bits, an 8-bit byte is called an octet when a precise term is needed.

For example, the octet $01101110_2$ stores bits which are, in order: cleared, set, set, cleared, set, set, set, cleared. This sequence of bits does not mean anything useful on its own. One needs to define how to interpret such sequences of bits.

A common interpretation is an unsigned binary number. Each bit is assigned the value $2^n$ when set and 0 when cleared, where $n$ is the bit position. (Positions are counted from the last bit, starting at 0.) If we read $01101110_2$ as an unsigned binary number, listing the terms from position 0 upward, the byte reads:
$$0 + 2^1 + 2^2 + 2^3 + 0 + 2^5 + 2^6 + 0 = 2 + 4 + 8 + 32 + 64 = 110.$$
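
The interpretation above can be checked with a short Python sketch. The function name `unsigned_value` is made up for this illustration, and the byte is written as a string of '0' and '1' characters only for readability.

```python
def unsigned_value(bits: str) -> int:
    """Interpret a string of '0'/'1' characters as an unsigned binary number."""
    total = 0
    # Position 0 is the last (rightmost) bit, so walk the string in reverse.
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 2 ** position  # a set bit contributes 2**position
    return total


octet = "01101110"
print(unsigned_value(octet))  # 110
print(int(octet, 2))          # Python's built-in base-2 conversion agrees: 110
```

The same eight bits could just as well be read under another interpretation (for instance, as a character code), which is why the interpretation has to be stated explicitly.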

Devices and memory

A computer is a set of interconnected computing devices. All of these devices store and manipulate information expressed in bytes. These bytes are stored in a part of the device called its memory.

The most important device in a computer (computing system) is the central processing unit (CPU). CPUs have access to two kinds of memory: registers and system memory.

Registers are small units of memory within a CPU. The width of a register depends on the kind of operation it takes part in.

For example, let us define an arithmetic addition operation that operates on 64-bit registers. Their values are interpreted as unsigned binary integers. The operation takes two register arguments, A and B, adds their values, and stores the resulting sum in A.
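
As a rough illustration, the addition described above might be modelled as follows. The `registers` dictionary and the `add` function are hypothetical, and the wrap-around behaviour for sums that do not fit in 64 bits is an assumption of this sketch; the text does not say what happens on overflow.

```python
MASK_64 = (1 << 64) - 1  # largest value a 64-bit register can hold

# Hypothetical register file with two 64-bit registers, A and B.
registers = {"A": 0, "B": 0}


def add(dst: str, src: str) -> None:
    """Add the value of register src to register dst and store the sum in dst."""
    # Assumption: bits that do not fit in 64 bits are dropped (wrap-around).
    registers[dst] = (registers[dst] + registers[src]) & MASK_64


registers["A"] = 40
registers["B"] = 2
add("A", "B")
print(registers["A"])  # 42
```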

Such operations are defined by a CPU's instruction set. The instruction set, together with other requirements and definitions, composes an instruction set architecture, which is then implemented as a physical device (a specific CPU model).

Processor (CPU) instructions can embed values directly within the instruction, refer to values stored in registers, or instruct the processor to load a value from a location in system memory.

Physical memory is a memory space divided into bytes, where each byte has an associated address. Devices are connected to each other through a message bus. Each device is configured to occupy a certain range of memory addresses. The bus is then used to pass messages from one device to another, most notably to load (fetch) and save (store) bytes within their memory, referring to that memory through these physical memory addresses.

For example, if random access memory (RAM) is configured to reside at addresses from 1024 to 10240 and the CPU sends a message to the bus for loading a byte at address 2048, the RAM will respond to that message with the byte that has been mapped to that address. Mapping is the translation of an address from one memory space to another, in this case from the physical memory address to the RAM's internal address.
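
The RAM example can be sketched in Python. The `Ram` and `Bus` classes are invented for this illustration, and the sketch assumes that the range runs from 1024 up to (but not including) 10240 and that the RAM maps a physical address to an internal one by subtracting its base address.

```python
class Ram:
    """A hypothetical RAM device occupying a range of physical addresses."""

    def __init__(self, base: int, size: int) -> None:
        self.base = base              # first physical address of the device
        self.cells = bytearray(size)  # the device's own memory, indexed from 0

    def contains(self, address: int) -> bool:
        return self.base <= address < self.base + len(self.cells)

    def load(self, address: int) -> int:
        # Mapping: physical address -> internal address.
        return self.cells[address - self.base]

    def store(self, address: int, value: int) -> None:
        self.cells[address - self.base] = value


class Bus:
    """Passes load and store messages to the device that owns the address."""

    def __init__(self, devices) -> None:
        self.devices = devices

    def _find(self, address: int):
        for device in self.devices:
            if device.contains(address):
                return device
        raise LookupError(f"no device is mapped at address {address}")

    def load(self, address: int) -> int:
        return self._find(address).load(address)

    def store(self, address: int, value: int) -> None:
        self._find(address).store(address, value)


ram = Ram(base=1024, size=10240 - 1024)
bus = Bus([ram])
bus.store(2048, 0b01101110)
print(bus.load(2048))   # 110
print(2048 - ram.base)  # 1024, the RAM's internal address for that byte
```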

Programs

Instructions that a CPU executes are stored in the system memory. When a computer is powered on, the CPU begins to execute instructions stored at some pre-defined address, which usually maps to a platform-specific memory where an initialization program is stored.

Every CPU has a register called the instruction pointer, which contains the address of the next instruction. When an instruction is read, this address is incremented by the length of the instruction, setting the pointer at the address right after it.

Instructions that change the value of the instruction pointer are called jump instructions. Every CPU also implements instructions for checking conditions (such as a test whether a value in a register is equal to 0) and conditional jump instructions which change the value of the instruction pointer only if the condition has been met.
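
A toy model may help to see how the instruction pointer and jump instructions interact. The instruction names below (`inc`, `dec`, `jump`, `jump_if_zero`, `halt`) are made up for this sketch and do not belong to any real instruction set; it also assumes that every instruction has a length of 1, so addresses are simply positions in a list.

```python
def run(program, registers):
    """Execute a list of toy instructions and return the final registers."""
    ip = 0  # instruction pointer: address of the next instruction
    while ip < len(program):
        op, *args = program[ip]  # read the instruction at the pointer
        ip += 1                  # advance by the instruction's length (1 here)
        if op == "inc":
            registers[args[0]] += 1
        elif op == "dec":
            registers[args[0]] -= 1
        elif op == "jump":
            ip = args[0]                 # unconditional jump
        elif op == "jump_if_zero":
            if registers[args[0]] == 0:  # jump only if the condition is met
                ip = args[1]
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown instruction: {op}")
    return registers


# Moves the value of A into B, one unit per loop iteration.
program = [
    ("jump_if_zero", "A", 4),  # address 0: leave the loop once A reaches 0
    ("dec", "A"),              # address 1
    ("inc", "B"),              # address 2
    ("jump", 0),               # address 3: back to the condition check
    ("halt",),                 # address 4
]
print(run(program, {"A": 3, "B": 0}))  # {'A': 0, 'B': 3}
```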

Instructions are grouped into sections called routines. Routines are written so that the CPU jumps from one routine to another. A collection of one or more routines is called a program.

A function is a specific kind of routine: one that jumps back to the routine from which it was called. If a function is defined to have parameters, it means that the routine expects a specific state of registers at the time its execution begins. The same applies to the return value: before jumping back, a function stores a value in a specific register.

Which registers are used, and how and when, is defined by a set of rules called a calling convention. Each program chooses a convention and complies with it, so that other programs (written by someone else) may call its functions.
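
To make the idea concrete, here is a sketch of a made-up calling convention on a toy machine: the two parameters are passed in registers `r0` and `r1`, the return value is left in `r0`, and the address to jump back to is kept in a register named `ra`. All of these names, and the instructions `set`, `add`, `call`, `ret` and `halt`, are assumptions of this example rather than features of a real CPU.

```python
def run(program):
    """Execute toy instructions; each instruction has a length of 1."""
    registers = {"r0": 0, "r1": 0, "ra": 0}
    ip = 0
    while ip < len(program):
        op, *args = program[ip]
        ip += 1
        if op == "set":
            registers[args[0]] = args[1]
        elif op == "add":
            registers[args[0]] += registers[args[1]]
        elif op == "call":
            registers["ra"] = ip  # remember where to jump back to
            ip = args[0]          # jump to the function
        elif op == "ret":
            ip = registers["ra"]  # jump back to the caller
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown instruction: {op}")
    return registers


program = [
    ("set", "r0", 40),    # address 0: first parameter
    ("set", "r1", 2),     # address 1: second parameter
    ("call", 4),          # address 2: call the function at address 4
    ("halt",),            # address 3: the result is now in r0
    ("add", "r0", "r1"),  # address 4: function body, r0 = r0 + r1
    ("ret",),             # address 5: return to the caller
]
print(run(program)["r0"])  # 42
```

The caller and the function only work together because both follow the same convention; a caller that placed its parameters in other registers would get a meaningless result.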

Operating systems

An operating system is a collection of programs and a set of rules that programs written for the system should all follow.

Among these rules is the application binary interface (ABI). Part of this interface is the calling convention used by the system. The ABI also defines the layout of data structures used by the system.

The most crucial program in an operating system is its kernel. The kernel has full control over the computer, and its main purpose is the execution and management of other programs, which are its tasks. It manages their memory and separates their execution contexts in order to protect the system against malicious or erroneous programs.

In a multi-tasking operating system, tasks have a time limit. When a task reaches its limit, the kernel modifies the CPU state so that another task may continue from the state it was previously in. For task switching to occur, the computer needs a mechanism for receiving interrupt requests, which instruct the processor to jump to some pre-defined routine that is not part of the current program. Such requests usually come from a countdown timer or some other clock.
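
Task switching can be imitated, very loosely, in Python. In this sketch each task is a generator whose every `yield` stands for one executed instruction, the inner `for` loop plays the role of the countdown timer, and the generator object itself holds the saved state of a suspended task. The names `task` and `run_kernel` are hypothetical, and a real kernel relies on hardware interrupts rather than on counting steps.

```python
from collections import deque


def task(name, steps):
    """A hypothetical task; each yield stands for one executed instruction."""
    for i in range(steps):
        yield f"{name}{i}"


def run_kernel(tasks, time_slice):
    """Run the tasks in turns, giving each at most time_slice steps per turn."""
    ready = deque(tasks)  # tasks waiting for their turn
    trace = []
    while ready:
        current = ready.popleft()
        for _ in range(time_slice):  # the countdown timer
            try:
                trace.append(next(current))
            except StopIteration:
                break  # the task has finished and is not queued again
        else:
            # The timer ran out: suspend the task so it can continue later.
            ready.append(current)
    return trace


print(run_kernel([task("a", 3), task("b", 2)], time_slice=2))
# ['a0', 'a1', 'b0', 'b1', 'a2']
```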