OS & CPU Architecture

Oct.06.2017 | 23m Read | ^DevOps

Computers (math machines) use a system of processors (math engines) and memory (info or data storage) to run software (instructions). The Operating System or OS (foundation software) acts as the middleman between the hardware (machinery) and other programs (software units).

Software features include remote or distributed deploy, custom embedded devices, and multiprocessing by juggling hardware resources. We shall start at the beginning from machine off and cover all the way to remote cloud computing and Artificial Intelligence (AI).

Learning anything is like painting a fence: it takes multiple coats or layers. However, a good primer saves a lot of effort and acts as paint until those other coats go down. Clocking in at a 22ish-minute read, this Computer Science primer will save you hours (if not years). Time to dive in!

Pre-Boot:

▼ BIOS (Basic Input/Output System):

    • ☑⁡⁡ Instructions communicate with each other using the settings inside the CMOS ("see-moss" or Complementary Metal Oxide Semiconductor).
      ☑⁡ On the motherboard (or mainboard which other boards and devices connect to).
      ☑⁡⁡ Uses a flash chip ROM (Read-Only Memory is slow, but does not require constant electricity).
        ◆ Flash ROMs typically last 10 years.
        ◆⁡⁡ Wears out due to data overwrites (flashes).
        ◆⁡⁡ Interrupting a flash (data write) can permanently destroy the device, bricking it. The unresponsive hardware then becomes 'as useful as a brick'.
  • ▼ CMOS (Complementary Metal Oxide Semiconductor):

    • ☑ A board chip that stores BIOS settings.
      ☑ Uses Static Random Access Memory (SRAM) which is faster than ROM, but requires electricity (the CMOS often has 128 bytes of memory on-chip, kept alive by a wristwatch-like battery).
      ☑⁡⁡ This CMOS battery lasts 3-5 years, then resets the BIOS and your settings (including date and time) to default every restart (until replacement).
      ☑⁡⁡ Many boards have a backup CMOS for alternate settings and in the case of data corruption.
        ◆ Set by jumper pins (removing or 'jumping' plastic covers on board pins).
  • ▼ DRAM vs SRAM:

    • ☑⁡⁡ DRAM (Dynamic Random Access Memory): memory like a tiny grid of jail cells for electricity (capacitors). The jail doors are transistors. By opening and closing these transistor doors, the capacitor cells record a result in binary or binary digit numbers (empty '0' or full '1').
        DRAM capacitors leak and must be dynamically refreshed (or refilled).
        ◆ A cheap form of RAM, it can be a board chip, but often sold as a 'stick' of RAM (a card that plugs into your computer by slots).
        ◆ Varieties are many, including: SDRAM (Synchronous DRAM) or DDR (Double Data Rate SDRAM). Graphics processors often use VRAM or Video RAM (like GDDR or Graphics DDR SDRAM).
      ☑⁡⁡ SRAM (Static Random Access Memory): like DRAM, but uses 4+ transistors (doors) per cell to form a flip-flop. This allows for faster changes to the cell during operations (like reading or writing data).
      ☑ Memory sizes: a single cell stores a bit (b). 8 bits form a byte (B) or o for octet (like 10011010 in binary is 154). 1000 bytes (1024 when counted in binary) is a kilobyte (KB and is k^1 in relative storage). This KB pattern continues on below (NOTE: k formally means 1000, but memory sizes are usually counted in steps of 1024):
        ◆ 1 bit = a unibit (b or u is k/k).
        ◆ 2 bits = a dibit (d is k^0.1).
        ◆ 4 bits = a nibble or quadbit (n or q is k^0.2).
        ◆ 8 bits = a byte or an octet (B or o is k^0.3).
        ◆ 1000 (1024) o or B = a kilobyte (KB is k^1).
        ◆ 1000 (1024) KB = a megabyte (MB is k^2).
        ◆ 1000 (1024) MB = a gigabyte (GB is k^3).
        ◆ 1000 (1024) GB = a terabyte (TB is k^4).
        ◆ 1000 (1024) TB = a petabyte (PB is k^5).
        ◆ 1000 (1024) PB = an exabyte (EB is k^6).
        ◆ 1000 (1024) EB = a zettabyte (ZB is k^7).
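
The unit ladder above can be sketched in a few lines of Python. This is a minimal sketch, not a standard library feature: the helper names `bits_to_int` and `human_size` are made up for illustration, and the `k` parameter lets you compare the 1000 and 1024 readings of "kilo".

```python
# Unit labels taken from the list above (hypothetical helper, for illustration).
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]


def bits_to_int(bits: str) -> int:
    """Read a string of binary digits as an unsigned whole number."""
    return int(bits, 2)


def human_size(num_bytes: int, k: int = 1024) -> str:
    """Format a byte count using the largest unit that keeps the value below k."""
    size = float(num_bytes)
    for unit in UNITS:
        if size < k or unit == UNITS[-1]:
            return f"{size:.2f} {unit}"
        size /= k


print(bits_to_int("10011010"))        # 154
print(human_size(1024))               # 1.00 KB
print(human_size(5_000_000, k=1000))  # 5.00 MB
print(human_size(5_000_000))          # 4.77 MB
```

The last two lines show why the same drive can look "smaller" depending on which meaning of k the software uses.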
  • ▼ Registers (the fastest RAM):

    • ☑⁡⁡ Registers inside the processors, load from caches of SRAM to process data (this leads to loading programs into memory and running them).
        Call stack or The Stack (memory): a software stack (data structure or data storage) that pops-off and pushes (or piles on top) instruction units (frames). The Stack itself both manages and sends data to other registers, often using software threads (pulling from the Operating System's heap allocator).
        heap (dynamic memory allocator): a large and slow resizable-storage that can randomly cut chunks of memory blocks. These chunks leave gaps, causing fragmentation (slowing from searching for another cut of memory). Used heap chunks must also be set free for reuse (garbage collection). The programmer either manually codes it (in fast programming languages like C/C++) or relies on auto GC software (in slower languages like JavaScript, Java, and Python). NOTE: this has nothing to do with a tree-like, heap data structure.
      ☑⁡⁡ Power management starts (electricity is sent to various parts of the system from the power supply).
  • ▼ POST (Power-On Self-Test):

    • ☑⁡⁡ Tests hardware components to determine the system's boot fitness.
      ☑⁡⁡ Feedback from beep code sounds (like a single short for pass) or LED light blinks and colors.
  • Boot (Bootstrap Load):

    ▼ Boot Loader (Bootstrap Loader program):

    • ☑⁡⁡ On successful POST, the system 'bootstraps' or pulls itself together.
        ◆ The ROM (Read-Only Memory) device and the Operating System loading data are accessed (like a pointer to the boot sector on a hard drive).
        pointer: a type of memory address storage. Like reaching through a portal, it allows movement and manipulation of the data there. When the language lacks pointer support, it can be simulated using indexed arrays or lists (where only the index is accessed and used). NOTE: delete (free) the data and clear the pointer when finished to prevent errors.
        ◆ The core of the OS (the kernel) is loaded and starts itself.
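
The pointer note above mentions simulating pointers with indexed lists. A minimal sketch of that idea follows, assuming a made-up three-item "memory" where each index plays the role of an address (the names `values`, `next_index`, and `walk` are hypothetical).

```python
# Parallel lists act as "memory"; an index into them acts as a pointer.
NULL = -1                      # stands in for a null pointer
values = ["boot sector", "kernel", "drivers"]
next_index = [1, 2, NULL]      # each slot "points" at the next item


def walk(head: int) -> list[str]:
    """Follow the index 'pointers' from head until the null marker."""
    found = []
    current = head
    while current != NULL:
        found.append(values[current])
        current = next_index[current]
    return found


print(walk(0))  # ['boot sector', 'kernel', 'drivers']

# Re-pointing: the first item now skips straight to the last one.
next_index[0] = 2
print(walk(0))  # ['boot sector', 'drivers']
```

Only the index is ever stored or changed, which is the same trick a real pointer performs with a memory address.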
  • Operating System (OS):

    ▼ Device drivers (hardware instructions):

    • ☑⁡⁡ Various drivers are loaded into a combo of kernel mode (kernel space) and user mode (user space).
      ☑⁡⁡ Kernel mode: better performing, higher-access, and privileged processes (located on ring 0 or the deepest center in the protection ring).
      ☑ User mode: protected and less privileged access processes, trading the kernel's speed for stability (drivers are typically located at ring 1 or 2 and apps are on ring 3).
        ◆ NOTE: moving data between modes is very slow and hardware intensive.
      ☑ Virtual Device Drivers: emulated hardware (like DOS and VMware) have access to device systems such as IRQs (Interrupt Requests allow hardware like a keyboard to interrupt and be processed so there is no delay).
  • ▼ Process management (running instructions):

    • ☑⁡⁡ Process creation (program executes or instances into memory):
        ◆ Process batches (groups of instructions).
        ◆ Process node parent to child spawning (parent processes create or spawn child processes, like a file manager parent opening a web browser). Uses a tree-like data structure for display and management.
      ☑⁡⁡ Process termination:
        ◆ Batch halt instructions (stop a group of instructions).
        ◆ Hardware errors (resolve and display as needed).
        ◆ Software errors (resolve and display as needed).
        ◆ User instructions (with scripts or an input device).
        ◆ Process complete.
      ☑⁡⁡ Two-state process model:
        ◆ RUNNING.
        ◆ NOT RUNNING.
      ☑⁡⁡ Three-state process:
        ◆ RUNNING.
        ◆ READY (queued or in a line).
        ◆ BLOCKED (requires event handling or processing to change state).
      ☑⁡⁡ Five-state process (swapping or 'suspending' states into buffers for efficiency):
        ◆ RUNNING.
        ◆ READY (queued).
        ◆ BLOCKED (requires event handling to change state).
        ◆ READY SUSPEND (READY process loaded into a swap buffer).
        ◆ BLOCKED SUSPEND (BLOCKED process loaded into a swap buffer).
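
The state lists above can be read as a small state machine. The sketch below uses the five states exactly as named here; the table of allowed moves is an assumption for illustration (real schedulers differ), and the `Process` class is hypothetical.

```python
from enum import Enum, auto


class State(Enum):
    RUNNING = auto()
    READY = auto()
    BLOCKED = auto()
    READY_SUSPEND = auto()
    BLOCKED_SUSPEND = auto()


# Assumed transition table: which states each state may move to next.
TRANSITIONS = {
    State.READY: {State.RUNNING, State.READY_SUSPEND},
    State.RUNNING: {State.READY, State.BLOCKED},
    State.BLOCKED: {State.READY, State.BLOCKED_SUSPEND},
    State.READY_SUSPEND: {State.READY},
    State.BLOCKED_SUSPEND: {State.BLOCKED, State.READY_SUSPEND},
}


class Process:
    def __init__(self, name: str):
        self.name = name
        self.state = State.READY   # new processes start queued

    def move_to(self, new_state: State) -> None:
        """Change state, refusing moves the table does not allow."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"{self.state.name} -> {new_state.name} is not allowed")
        self.state = new_state


browser = Process("browser")
browser.move_to(State.RUNNING)          # scheduler picks it from the queue
browser.move_to(State.BLOCKED)          # waits on an event (like disk I/O)
browser.move_to(State.BLOCKED_SUSPEND)  # swapped out while still waiting
print(browser.state.name)               # BLOCKED_SUSPEND
```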
  • ▼ Multitasking (multiprogramming):

    • ☑⁡⁡ Parallel processing (multiple, side-by-side processes):
        ◆ Multi-core processors (each core can assign tasks to process).
        ◆ Bridging processors (data bridges act as parallel pipes for processing).
        ◆ Distributed system or nodes (multiple machines divide up the task processing).
        ◆ Specialized bus architecture (like cable and card slots that transfer or bus data).
        ◆ Pipes optimized for parallel data flow (tasks).
      ☑ Pipes or units optimized for parallel micro-instructions.
        ◆ Interleaving processes (rapid swapping).
  • ▼ Interrupt Requests (IRQs or 'Interrupts'):

    • ☑ The system that manages and interleaves device processes with system processes.
      IRQ lines send messages from hardware to the CPU (Central Processing Unit) to 'interrupt' or take priority.
        Interrupt handler: the program that loads the interrupt and processes it.
        ◆ Used for cabled devices (like USB keyboard, mouse, or hard drive) and internal cards (like video, sound, or networking).
        IRQs are numbered and can be set in the BIOS to avoid conflicts between devices.
        ◆ Newer devices may have dedicated interrupt controllers or support IRQ sharing.
        Plug and play: the system of auto-configuring IRQs when devices connect (devices may need to be reconnected, removed, or the system restarted to trigger).
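
As a rough sketch of the interrupt flow above, the Python below queues numbered IRQs and hands each one to its handler. The handler table and the rule "lower number is served first" are simplifying assumptions for illustration; real interrupt controllers use their own priority schemes.

```python
import heapq

# Hypothetical handler table keyed by IRQ number.
handlers = {
    1: lambda: "keyboard key read",
    12: lambda: "mouse moved",
    14: lambda: "disk block ready",
}

pending: list[int] = []   # min-heap of IRQ numbers waiting for the CPU


def raise_irq(irq: int) -> None:
    """A device signals the CPU by queueing its IRQ number."""
    heapq.heappush(pending, irq)


def service_all() -> list[tuple[int, str]]:
    """Run the interrupt handler for each pending IRQ, lowest number first."""
    results = []
    while pending:
        irq = heapq.heappop(pending)
        results.append((irq, handlers[irq]()))
    return results


raise_irq(14)
raise_irq(1)
raise_irq(12)
print(service_all())
# [(1, 'keyboard key read'), (12, 'mouse moved'), (14, 'disk block ready')]
```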
  • ▼ Filesystem (file system):

    • ☑⁡⁡ Files: units of data in storage, often displaying a name and icon (image) to the user.
        ◆ File format metadata (extra data to read the file) is stored in the header (top of the file).
      ☑⁡⁡ Provides a directory (folder) structure:
        ◆ Uses tables and nodes for look-up.
        ◆ Example filesystems include NTFS and FAT (Windows), HFS+ and APFS (Apple), EXT, XFS (Linux).
        Metadata: extra data for tagging and managing files (like attributes on Windows).
        Sorts files (like by name, size, and date): introsort (a quicksort that, as it slows, switches to a heapsort) is often the fastest sort, but it is not stable (destroying the order of cumulative sorts). This is why a stable sort such as Timsort (a natural-order pre-sort and a merge sort, plus an insertion sort for short runs) is the usual choice for files.
        Scripting engine: a code compiler that translates instructions either as they load, like on web page load (Just-In-Time or JIT), or beforehand (Ahead-of-Time or AOT), trading startup time against running performance. Like Chrome's JIT JavaScript engine, called V8, whose Array.prototype.sort() uses C++'s std::sort (introsort) for numbers and Timsort for Strings.
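
To see why stability matters for cumulative file sorts, here is a minimal sketch using Python's built-in `sorted()`, which is documented as stable (CPython implements it with a Timsort-style algorithm). The file names and sizes are made up for illustration.

```python
# Made-up directory listing for illustration.
files = [
    {"name": "notes.txt", "size": 20},
    {"name": "app.log", "size": 20},
    {"name": "boot.img", "size": 900},
    {"name": "a.cfg", "size": 5},
]

# First sort by name, then by size. A stable sort keeps the earlier
# name order among files that tie on size.
by_name = sorted(files, key=lambda f: f["name"])
by_size_then_name = sorted(by_name, key=lambda f: f["size"])

print([f["name"] for f in by_size_then_name])
# ['a.cfg', 'app.log', 'notes.txt', 'boot.img']
```

With an unstable sort, `app.log` and `notes.txt` (both size 20) could come out in either order, undoing the first sort.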
      ☑⁡⁡ User Interface (UI) provides tools for manipulating files:
        ◆ Text-based: uses typed commands with typing devices (like keyboards or touchscreen keyboards). Includes Command Line Interfaces (CLIs) used to manipulate the OS and other software (like PowerShell or Batch commands on Windows or Bash commands on Linux and Mac OS).
        Graphical User Interface (GUI "goo-ee"): uses a pointer device (like a mouse, stylus, or touchscreen).
  • ▼ Memory Management (moving & recycling)

    • ☑ Virtual Memory: page swapping between storage (like a hard drive) and RAM to increase the available resources for processing.
      ☑⁡⁡ OS memory: is pre-cached and makes the OS and UI more responsive.
      ☑ Application memory: garbage collecting exited processes (used-memory recycling and deleting).
        ◆ Improves application performance if the OS has plenty.
  • ▼ Input/Output devices:

    • ☑ Memory-mapped I/O (MMIO): device registers share the same address space as memory, providing interfaces and abstractions for the bus and IRQs:
        ◆ Channel I/O: dedicated channel processors that handle transfers for the CPU, like DMA (Direct Memory Access to RAM) with more options.
        Port-mapped I/O (PMIO): devices get their own separate I/O address space (port numbers) reached with special CPU instructions. NOTE: these hardware ports are not the same as network ports.
  • ▼ Networking (linking devices for data communication):

    • ☑⁡⁡ Supports connections that are cable (like Cat 5) and wireless (like WiFi).
      ☑⁡⁡ Providing a visual interface for networking hardware.
      ☑⁡⁡ Software-based networking hardware:
        ◆ Topology: network shapes such as star which has central servers (specialized, data-sending computers) like in cloud computing. Or a mesh shape (woven fabric) in which many connected devices also act as servers (like cryptocurrency mining and trading virtual coins or torrent data sharing).
        ◆ Routing: data trafficking (directing).
        ◆ Packet management: packing and inspection of data packets (data containers for transfer).
        Port mapping (port forwarding): opening or closing data ports (doors).
        Protocol configuration: set the system for data communication (like TCP/IP, UDP, or HTTPS).
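
The packet management idea above (packing data into containers, then inspecting them) can be sketched with Python's `struct` module. The header layout here is invented for illustration and is not a real protocol: two port numbers plus a payload length, each an unsigned 16-bit number in network byte order.

```python
import struct

# Hypothetical header: source port, destination port, payload length.
HEADER = struct.Struct("!HHH")   # "!" = network (big-endian) byte order


def pack_packet(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Pack: put a header in front of the payload bytes."""
    return HEADER.pack(src_port, dst_port, len(payload)) + payload


def unpack_packet(packet: bytes) -> tuple[int, int, bytes]:
    """Inspect: read the header back, then slice out the payload."""
    src_port, dst_port, length = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:HEADER.size + length]
    return src_port, dst_port, payload


packet = pack_packet(50000, 443, b"hello")
print(len(packet))            # 11 (6 header bytes + 5 payload bytes)
print(unpack_packet(packet))  # (50000, 443, b'hello')
```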
  • ▼ Security:

    • ☑⁡⁡ Integrity (data stability):
        Vulnerability patches (vulnerability fixes): updates that fix known exploits (security failures).
        Backups (data backups): using multiple storage devices to hold multiple copies. Making a copy can be done as you go with RAID (Redundant Array of Independent Disks) connected storage or by software imaging data (fully copying).
        Sandboxing: creating a 'sandbox' or limited space for software to safely run and test (like Sandboxie on Windows). Resource-heavy Virtual Machines can simulate an entire OS and hardware (like VMware Player for OS X on Windows). Lighter options with fewer features include containers, such as those from Docker (often managed at scale with an orchestrator like Kubernetes).
      ☑⁡⁡ Integrated tools:
        Malware (harmful software) removal.
        Firewall (port blocker) and net traffic monitoring and control.
        ◆ Network encryption: masking data for transfer.
      ☑⁡⁡ Filesystem:
        ◆ Local encryption: masking data files to prevent unauthorized use.
        ◆ User or privilege metadata: extra data on files that prevent access and use.
        ◆ Typical user classes ranked by privilege are: Root (the original user), Admin (Administrator), Mod (Moderator), Worker, User (Power user), Guest, and Anon (Anonymous user).
      ☑⁡⁡ UI for software tools:
      ☑⁡⁡ UI for hardware tools:
        ◆ Like Fortinet's hardware-based firewall.
  • ▼ Object-oriented Operating Systems (OO OSes):

    • ☑⁡⁡ Uses Object-Oriented Programming (OOP) and The 4 Principles of OOP (I APE 4-OOP) for system architecture:
        1. Inheritance: 'child' objects 'inherit' (import) specific data from 'parent' classes (templates they can expand or extend). Multiple Inheritance (often unsupported) occurs when the 'child' object 'inherits' from more than one 'parent' class (in a single 'generation').
        2. Abstraction: objects implement (import) any data from class-like interfaces (abstract classes). This is powerful because it does not alter inheritance imports from classes. NOTE: interfaces can not make objects unlike normal classes.
        3. Polymorphism ('many forms'): when an object implements (imports) data from an interface of different types (like an Entity Component System or ECS pattern used in UIs and games to separate processing from data).
        4. Encapsulation: hiding access to data within an object by labeling it private (only for the object owner), protected (only for the same class or sub-classed objects), or public (no limits). These labels are known as access modifiers.
        OOP concepts are to increase human readability, not performance (often introducing more overhead). It follows a noun to sub-noun and sub-verb pattern (like thing.stuff or thing.move()).
        A. Like a Manager class extends (or imports) from a broader 'parent' Worker class, giving it access to private resources data. Later we want to give our Manager the ability to do math (without changing the classes) so we implement (import) an add function from a Math interface (abstract class).
        B. When the program is run in memory the Manager class creates (instances) a new manager object (like 'manager = new Manager();'). Then the manager object uses the add() function from Math, adding 5 to their resources and saving it (like 'manager.add(manager.resources, 5);').
        C. What happens if another new manager object (manager2) attempts to access the resources of the older manager object and add them to their own (like 'manager2.add(manager2.resources, manager.resources);')? It fails, causing an error because resources has a private access modifier (data label).
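
The Manager walkthrough in A to C can be approximated in Python. This is a loose sketch under stated assumptions: Python has no enforced access modifiers, so a double-underscore name (which Python name-mangles) stands in for `private`, and the `resources` data lives on Manager itself rather than on Worker. The `Math` abstract class plays the role of the interface.

```python
from abc import ABC, abstractmethod


class Math(ABC):
    """Interface-like abstract class: declares add() but cannot make objects."""

    @abstractmethod
    def add(self, amount: int) -> int: ...


class Worker:
    def __init__(self, name: str):
        self.name = name


class Manager(Worker, Math):   # inherits from Worker, implements Math
    def __init__(self, name: str, resources: int = 0):
        super().__init__(name)
        self.__resources = resources   # name-mangled: stands in for 'private'

    def add(self, amount: int) -> int:
        self.__resources += amount
        return self.__resources


manager = Manager("first")
print(manager.add(5))            # 5

manager2 = Manager("second", 1)
try:
    # Outside the class, the 'private' name is not reachable as written.
    manager2.add(manager.__resources)
except AttributeError:
    print("access to resources was blocked")
```

Languages with true access modifiers (like Java or C++) reject this kind of access when the code is compiled rather than at run time.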
      ☑ Started with the Lisp and Smalltalk OSes, which inspired NeXTSTEP. Its maker NeXT was bought by Apple (yet only the object-oriented GUI layer made it into Mac OS X).
      ☑⁡⁡ Microsoft Windows (from the original NT, to the Phone OS, and Xbox gaming consoles): offers object-based features (no inheritance). This includes an Object Manager, access control lists for security, and various system objects accessible by programs and drivers.
        ◆ Microsoft hired Dave Cutler and a team of engineers from DEC (Digital Equipment Corporation) in 1988 to build Windows NT.
        OLE (Object Linking and Embedding): using objects to link files and their embedded sub-files. OLE 2 is built on COM.
        COM (Component Object Model): a binary interface for implementing objects across the Microsoft platform. Features RPC (Remote Procedure Calls which is a standard for software run on networks), automation (exposing to scripting and apps), registry database (object storage table), and activation (creating system objects which can start apps).
        CFBF (Compound File Binary Format): store many file formats and streaming data in a single file with COM Structured Storage (OLE 2 compound documents). Uses a variety of sectors for FAT, FAT32, DIFAT (chaining other FATs), MiniFAT (mini-streams), Streams (lines of bytes), directory (folders), and byte range-locking (data access security).
  • CPU (Central Processing Units):

    ▼ CPU concurrency (synching or working in time):

    • ☑⁡⁡ Single Core and non-distributed processing:
        ◆⁡⁡ Programmed threads and bus architecture divide up the work.
        ◆⁡⁡ bridges (bus bridges): hardware connecting multiple buses.
        ◆⁡⁡ Absent such hardware, the CPU can interleave a schedule of instructions to share and manage resources.
        ◆⁡⁡ Thread states & lifecycle:
          #1 new thread -> runnable (new to ready).
          #2+ runnable -> <- blocked (back and forth).
          #2+ runnable -> <- waiting (back and forth).
          #2+ runnable -> <- timed waiting (back and forth).
          #2+ runnable -> terminated (thread ends).
      ☑⁡⁡ Multi-core and distributed systems:
        ◆⁡⁡ Threads and servers divide up the work per node or core.
        ◆⁡⁡ Race condition: when multiple threads 'race' to access the same data (which can cause unexpected values or behavior).
        ◆ Mutexes ("mew-texes" or mutual exclusions): for locking thread data. A thread must call lock() before it edits (or mutates) the data, then unlock() when finished, to exclude other threads from causing a race condition.
        ◆⁡⁡ Semaphores (an alert mutex): tools for synchronizing or timing thread use. Uses functions like wait() (thread is locked and in use), post() (thread is released for use), or init()/open() (thread is setup or open for use). From waving flags to direct traffic or a device that changes color to do the same (Greek for 'sign bearer').
        ◆⁡⁡ Deadlock (endless wait loop): when threads are waiting in an endless loop because they are blocking data from each other. This data can not be locked() or unlocked() for use, so it is considered 'deadlocked'.
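
A minimal sketch of a mutex guarding shared data, using Python's standard `threading.Lock`. The `deposit` function and the counts are made up for illustration. Each thread takes the lock, edits the shared counter, and then releases the lock so the next thread can proceed.

```python
import threading

counter = 0                  # shared data that every thread edits
lock = threading.Lock()      # the mutex protecting it


def deposit(times: int) -> None:
    global counter
    for _ in range(times):
        with lock:           # acquire (lock), edit, then release (unlock)
            counter += 1


threads = [threading.Thread(target=deposit, args=(10_000,)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()            # wait for every thread to finish

print(counter)  # 40000
```

The `with lock:` form releases the mutex even if the edit raises an error, which helps avoid leaving other threads waiting forever.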
      ☑⁡⁡ Overclocking Processors (CPUs and GPUs) by setting the clock speed higher:
        ◆⁡⁡ Can cause thread concurrency (accurate timing) to drop, leading to more errors and worse performance.
        ◆⁡⁡ Also causes OS freezes or crashes (like a Windows Blue Screen of Death or BSoD, which requires system restart).
        ◆ ⁡Other issues include heat damage, screen flickers, 'random' behavior, and errors.
        ◆ Processor overclocking settings and performance depend on the specific model and hardware modding abilities (like increasing cooling).
  • ▼ Assembly Language & Machine Code:

    • ☑⁡⁡ Machine Code (MC or Machine Language): a low level language consisting of binary (so 154 as 10011010) or hex (154 as 9A).
        ◆ Higher level languages compile down into this.
        ◆ Generally this is the level just above vendor-based hardware instructions or microcode.
      ☑ Assembly language (asm): assembles into machine code.
        ◆ Uses mnemonics for opcodes (like ADD for addition) to manage and interface with hardware instructions.
        ◆ Ideal for specialized hardware tasks, as it removes much of the cruft and safeguards.
        ◆ Generally the lowest level a human programs outside of hacks and analog methods.
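
To make the step from assembly to machine code concrete, here is a toy assembler in Python. The three-instruction set and its opcode numbers are invented for illustration and do not match any real CPU. It turns readable instruction names into bytes, then prints those bytes as hex and binary.

```python
# Hypothetical 8-bit instruction set, invented for illustration only.
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "HALT": 0xFF}


def assemble(source: str) -> bytes:
    """Translate each 'NAME operand' line into opcode and operand bytes."""
    machine_code = bytearray()
    for line in source.strip().splitlines():
        name, *operands = line.split()
        machine_code.append(OPCODES[name])
        machine_code.extend(int(operand) for operand in operands)
    return bytes(machine_code)


program = """
LOAD 154
ADD 1
HALT
"""

code = assemble(program)
print(code.hex(" ").upper())               # 01 9A 02 01 FF
print(" ".join(f"{b:08b}" for b in code))
# 00000001 10011010 00000010 00000001 11111111
```

The operand 154 shows up as 9A in hex and 10011010 in binary, matching the Machine Code example above.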
  • GPU (Graphics Processing Units):

    ▼ GPUs & AI:

    • ☑⁡⁡ Offloading intensive processing required in Machine Learning (solution training for AI) to more specialized hardware (GPUs).
      ☑ APIs (Application Programming Interfaces like Vulkan, Direct3D, CUDA) efficiently and cost-effectively link together GPUs.
      ☑⁡⁡ GPUs typically have faster RAM (known as VRAM or Video RAM).
        ◆ Nvidia in particular has Tensor Cores that operate as AI hardware first and foremost.
        ◆ Nvidia DGX model workstations and servers comprise multiple cards that specialize in AI processing.
        ◆ Nvidia Jetson is a specialized and relatively cheap palm-sized computer for AI training.
        ◆ Nvidia NGC now has cloud-based servers for GPU AI training.
        ◆ Nvidia NGX features hybrid cloud technology.
  • Cloud Computing:

    ▼ Origin:

    • ☑⁡⁡ IBM offered their Remote Job Entry system in the 1960's, using the OS/360 (Operating System) on a mainframe (large computer) and terminals (smaller computers that use mainframes).
    • ☑⁡⁡ Cloud symbols have been in use since 1977 to refer to networks.
    • ☑ VPNs (virtual private networks): provide multiple layers of protocols (tunneling) and encryption (data masking) to send data through a point-to-point topology (2-way network). They became popular for businesses to serve private or internal data in the 1990's and a regular service for later cloud providers.
    • ☑⁡⁡ The first commercial cloud computing platform came from Apple's General Magic using web tablets with mini-touchscreens and a stylus (like the Motorola Envoy in 1995). They provided streaming video, plus online games and shopping on the Magic Cap Operating System.

    ▼ PaaS (Platform as a Service):

    • ☑ In 2006 Amazon launched Amazon Web Services (AWS) and released Elastic Compute Cloud (EC2), allowing customers to host a variety of online apps (SaaS or Software as a Service). This now includes offloading processing for Machine Learning (software solution training) and Artificial Intelligence.
        AWS ecosystem.
      ☑⁡⁡ Google's App Engine (GAE) entered beta in 2008 offering similar services.
        GCP (Google Cloud Platform) ecosystem.
      ☑ Other PaaS options now include Amazon's Lightsail (EC2), Heroku (EC2), Netlify (EC2), and GitHub (Microsoft Azure).
  • ▼ IaaS (Infrastructure as a Service):

    • ☑⁡ Goes deeper and provides more, allowing users to build their own PaaS.
      ☑⁡⁡ Amazon Web Services (AWS):
        ◆ Developed to run Amazon (store), Prime Video (streaming), Alexa (smart appliance), and more.
        ◆ The most popular and expanding business.
        ◆ The most services in and out of development.
      ☑⁡⁡ Google Cloud Platform (GCP):
        ◆ Developed to run YouTube (streaming), Google (search), Gmail (email), and more.
        ◆ Overall slightly faster than AWS, but with fewer services.
        ◆ Simple and more engineer-like terminology.
      ☑ Microsoft's Azure has mature on-premises and cloud hybrid technology (physical servers on your property that connect to Azure):
        ◆ Developed to run Office 365 (productivity), Outlook.com (email), Xbox Live (games), and more.
        ◆ More market share than GCP.
      ☑⁡⁡ Other IaaS options now include Alibaba Cloud, Oracle Cloud, and IBM Cloud.
