We will explain the enduring concepts underlying all computer systems, and show the concrete ways that these ideas affect the correctness, performance, and utility of any application program.
Learning what a system is supposed to do is a good first step towards learning how to build one, so this course serves as a valuable introduction for students who go on to implement systems hardware and software. But this course also pushes students towards becoming the rare programmers who know how things work and how to fix them when they break.
This course will cover most of the key interfaces between user programs and the bare hardware, including:
The representation and manipulation of information
We cover computer arithmetic, emphasising the properties of unsigned and two's complement number representations that affect programmers. A solid understanding of computer arithmetic is critical to writing reliable programs: for instance, arithmetic overflow is a common source of programming errors and security vulnerabilities.
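As a minimal sketch of the overflow issue mentioned above (the function names `wrapping_add_u32` and `checked_add_int` are illustrative and not part of the course material): unsigned addition in C wraps around modulo a power of two, while signed overflow is undefined behaviour and has to be ruled out before the addition is performed.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Unsigned arithmetic wraps modulo 2^32: well defined, but easy to overlook. */
uint32_t wrapping_add_u32(uint32_t a, uint32_t b) {
    return a + b;
}

/* Signed overflow is undefined behaviour in C, so we test the operands
   against the limits of int before adding. Returns false if a + b would
   not fit, and leaves *sum untouched in that case. */
bool checked_add_int(int a, int b, int *sum) {
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *sum = a + b;
    return true;
}
```

For instance, `wrapping_add_u32(UINT32_MAX, 1)` yields 0, and `checked_add_int(INT_MAX, 1, &s)` reports failure instead of overflowing.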
Machine-level representation of programs
We learn how to read the x86-64 machine code generated by a C compiler. We cover the basic instruction set, and the implementation of procedures, including stack allocation, register usage conventions and parameter passing. We cover how different data structures are allocated and accessed. We also use the machine-level view of programs as a way to understand common code security vulnerabilities, including buffer overflow.
Processor architecture
We cover basic combinational and sequential logic elements, and then show how these elements can be combined in a datapath that executes a simplified subset of the x86-64 instruction set. We begin with the design of a single-cycle datapath, which is very simple but not very fast, and then introduce pipelining, where the different steps required to process an instruction are implemented as separate stages that can operate in parallel. The final five-stage processor pipeline will be closer to modern architectures, and we will show how programmers can speed up their code by increasing the instruction-level parallelism hidden in their programs.
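One common way to expose more instruction-level parallelism is to split a single chain of dependent operations into several independent ones. The sketch below is only an illustration under that assumption (the names `sum_serial` and `sum_two_way` are hypothetical); whether it actually runs faster depends on the compiler, the optimisation level, and the processor.

```c
#include <stddef.h>

/* One accumulator: every addition depends on the result of the previous one. */
long sum_serial(const long *a, size_t n) {
    long acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += a[i];
    }
    return acc;
}

/* Two accumulators: the two dependency chains are independent of each
   other, so a pipelined processor may overlap their additions. */
long sum_two_way(const long *a, size_t n) {
    long acc0 = 0, acc1 = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        acc0 += a[i];
        acc1 += a[i + 1];
    }
    if (i < n) {            /* leftover element when n is odd */
        acc0 += a[i];
    }
    return acc0 + acc1;
}
```

Both functions return the same result for integer data; the second only changes the order in which the additions are carried out.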
The memory hierarchy
The memory system is not a linear array with uniform access times. In practice, a memory system is a hierarchy of storage devices with different capacities, costs, and access times. We cover the different types of RAM and ROM memories and the geometry and organisation of magnetic-disk and solid-state drives. We describe how these storage devices are arranged in a hierarchy, and how this hierarchy is made possible by locality of reference. We will show you how to improve the performance of application programs by improving their temporal and spatial locality.
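A small sketch of what spatial locality means in practice, assuming a matrix stored in row-major order in a flat array (the function names are illustrative): both functions compute the same sum, but the first visits memory sequentially while the second jumps `cols` elements between consecutive accesses, which tends to use caches less effectively on large matrices.

```c
#include <stddef.h>

/* Row-major traversal: consecutive iterations touch adjacent elements
   (stride 1), which gives good spatial locality. */
long sum_row_major(const int *m, size_t rows, size_t cols) {
    long sum = 0;
    for (size_t i = 0; i < rows; i++) {
        for (size_t j = 0; j < cols; j++) {
            sum += m[i * cols + j];
        }
    }
    return sum;
}

/* Column-major traversal of the same row-major data: consecutive
   iterations are cols elements apart, which gives poor spatial locality. */
long sum_col_major(const int *m, size_t rows, size_t cols) {
    long sum = 0;
    for (size_t j = 0; j < cols; j++) {
        for (size_t i = 0; i < rows; i++) {
            sum += m[i * cols + j];
        }
    }
    return sum;
}
```

The actual difference in running time depends on the matrix size relative to the cache sizes of the machine, so it is best measured rather than assumed.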
Exceptional control flow
Here we step beyond the single-program model by introducing the general concept of exceptional control flow. We cover examples of exceptional control flow that exist at all levels of the system, from low-level hardware exceptions and interrupts, to context switches between concurrent processes, to abrupt changes in control flow caused by the receipt of kernel signals, to the nonlocal jumps in C that break the stack discipline.
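As an illustration of the nonlocal jumps mentioned above, the following sketch uses the standard `setjmp` and `longjmp` functions to return from a nested call directly to a recovery point, skipping the normal return path of the intermediate function. The function names and the `fail` flag are hypothetical, chosen only for this example.

```c
#include <setjmp.h>

static jmp_buf recovery_point;

/* Deepest call: on failure, jump straight back to the recovery point. */
static void deepest(int fail) {
    if (fail) {
        longjmp(recovery_point, 1);
    }
}

/* Intermediate call: its normal return is bypassed when longjmp fires. */
static void middle(int fail) {
    deepest(fail);
}

/* Returns 0 if the calls returned normally, -1 if control came back
   through longjmp. */
int run_with_recovery(int fail) {
    if (setjmp(recovery_point) != 0) {
        return -1;   /* second return of setjmp, triggered by longjmp */
    }
    middle(fail);
    return 0;
}
```

Calling `run_with_recovery(1)` returns -1 without `middle` ever returning to its caller in the usual way, which is exactly the break in stack discipline described above.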
Virtual memory
To a program, the virtual memory space is just an array of bytes that it can subdivide into different storage units. However, we will show how different simultaneous processes can each use an identical range of addresses, sharing some pages but having individual copies of others. This helps the programmer to understand the effects of programs containing memory referencing errors such as storage leaks and invalid pointer references.
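The idea that two processes can use the same addresses while holding different data can be observed with a short experiment. The sketch below assumes a POSIX system (it relies on `fork`, `pipe`, and `waitpid`); the function name `fork_keeps_private_copy` and the variable `counter` are illustrative only. The child modifies a global variable and reports its address to the parent, which then checks that the address is the same but its own copy is unchanged.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 1;

/* Returns 1 if the child saw counter at the same virtual address as the
   parent while the child's write left the parent's copy unchanged,
   0 if not, and -1 on a system call error. */
int fork_keeps_private_copy(void) {
    int fds[2];
    if (pipe(fds) == -1) {
        return -1;
    }
    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: write to its private copy and send the address back. */
        close(fds[0]);
        counter = 42;
        uintptr_t addr = (uintptr_t)&counter;
        ssize_t w = write(fds[1], &addr, sizeof addr);
        _exit(w == (ssize_t)sizeof addr ? 0 : 1);
    }
    /* Parent: read the address observed by the child. */
    close(fds[1]);
    uintptr_t child_addr = 0;
    ssize_t r = read(fds[0], &child_addr, sizeof child_addr);
    close(fds[0]);
    int status = 0;
    if (waitpid(pid, &status, 0) == -1 || r != (ssize_t)sizeof child_addr) {
        return -1;
    }
    return child_addr == (uintptr_t)&counter && counter == 1;
}
```

On a typical system the function returns 1: the address is identical in both processes, yet each process has its own copy of the page once the child writes to it.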
The lab sessions (TDs) will illustrate how to put the above to work in everyday programming practice. Among other things, we will reverse-engineer a binary program, implement a buffer overflow attack, optimise a processor design, and implement our own memory allocator.
Student's background: We assume that the student is familiar with programming. Experience with C or C++ is a plus, but if your only prior experience is with Java, we will help. We do not assume any prior experience with hardware, machine language, or assembly-language programming.
Language: Lectures will be in French or English, as requested by the students.
Exam and grading: Graded labs and a final written exam.
ECTS credits: 4
Professor: Francesco Zappa Nardelli