Unboxing VirtualBox: Virtually Everything on Hypervisors
Llambduh's Newsletter Issue #10 - 03/25/2025

Today, we're unboxing VirtualBox, a powerful piece of software that unlocks the potential of virtualization. This article will delve into the world of hypervisors, using VirtualBox as our prime example to explore their core concepts and practical applications.
We will begin with the fundamentals of virtualization, then explore the role of hypervisors, focusing on the hosted hypervisor category. Finally, we'll unpack VirtualBox itself, examining its features, architecture, and real world applications.
The World of Virtualization and VMs
Virtualization is a sophisticated abstraction technique that decouples software execution environments from the physical hardware. It’s used to create virtual versions of physical resources, like servers, operating systems, storage devices, or network resources. It involves creating a logical layer of abstraction that enables the execution of multiple, isolated execution contexts over a single physical platform. At the lowest level, virtualization leverages hardware assistance (e.g., Intel VT-x, AMD-V) to map sensitive instructions and state transitions into controlled environments, managing privilege levels and secure resource partitioning. The abstraction operates at various levels to ensure that each virtual instance perceives access to a fully operational hardware stack, while, in reality, the hypervisor multiplexes underlying physical resources.
Virtual Machines (VMs) are concrete instantiations of this abstraction. They emulate an entire hardware platform, from processor state and memory hierarchy to peripheral devices, thus delivering a complete environment for running guest operating systems. At the architectural level, a VM encapsulates:
Virtual CPU (vCPU): A carefully scheduled subset of physical CPU resources, often with support for hardware assisted isolation through constructs such as the Virtual Machine Control Structure (VMCS) on Intel and the Virtual Machine Control Block (VMCB) on AMD.
Virtual Memory: A layered mapping mechanism that translates guest virtual addresses to guest physical addresses and subsequently to host physical addresses via techniques like shadow paging or extended page tables (EPT/NPT), ensuring both isolation and performance.
Virtualized I/O: An emulated or paravirtualized interface that provides guest operating systems with device drivers, replicating functionalities of standard peripherals while abstracting direct physical device access.
VMs provide strong isolation by maintaining stringent boundaries between the guest's software state and the hypervisor’s control environment. Furthermore, they require intricate mechanisms to manage state transitions, interrupts, and exception handling in a manner that is both transparent to the guest OS and efficient for the hypervisor.
Hypervisors: The Backbone of Virtualization
At the heart of virtualization lies the hypervisor, often referred to as a virtual machine monitor (VMM), a software layer that manages and allocates resources to virtual machines (VMs). Hypervisors act as the intermediary between the VMs and the underlying physical hardware.
Two Types of Hypervisors
Type 1 (Bare Metal Hypervisors)
These hypervisors run directly on the host's hardware, providing direct access to system resources. These hypervisors maintain a microkernel like architecture, relying on minimal trusted computing bases to reduce attack surfaces and improve performance. Examples include VMware ESXi and Microsoft Hyper-V. They offer superior performance and are typically used in enterprise level virtualization deployments.
Type 2 (Hosted Hypervisors)
These hypervisors run on top of an existing operating system, like Windows, macOS, or Linux. They abstract physical hardware resources via OS provided APIs, which typically introduces additional layers and overhead but gives them greater ease of use and flexibility, making them ideal for individual users and development environments. Examples include VirtualBox, VMware Workstation, and Parallels Desktop.
Hypervisor Virtualization Methods
Full Virtualization
The hypervisor provides a complete simulation of the underlying hardware, so unmodified guest operating systems can run on it. Without hardware support, techniques such as binary translation rewrite sensitive instructions that would not trap at the guest’s reduced privilege level into safe sequences, maintaining the illusion of full hardware control.
Paravirtualization
Instead of intercepting sensitive instructions, the guest OS is modified to use hypercalls (akin to system calls) to interact with the hypervisor directly. This approach reduces overhead as it avoids binary translation and improves performance, particularly with I/O operations.
Hardware Assisted Virtualization
Modern processors include extensions such as Intel VT-x and AMD-V that introduce new CPU modes (typically root and non root modes) to reduce or eliminate the need for binary translation. Hardware support allows direct execution of most guest instructions, triggering a VM exit on sensitive operations. It introduces constructs like the Virtual Machine Control Structure (VMCS) on Intel and the Virtual Machine Control Block (VMCB) on AMD, which streamline state save/restore operations during transitions.
CPU Virtualization: Privilege Levels & Transitions
Central to hypervisor operation is the management of the CPU’s privileged instructions. The classic x86 architecture was not strictly virtualizable in the Popek and Goldberg sense: certain sensitive instructions do not trap when executed outside the privileged ring, so a trap and emulate hypervisor never gets the chance to intercept them. Traditional software only hypervisors employed binary translation to detect and manage such instructions. In contrast, hardware assist via VT-x/AMD-V provides a dedicated root mode for the hypervisor. Core technical aspects include:
VM Exits: They occur when a guest’s execution reaches a sensitive instruction, forcing the CPU to switch context into the hypervisor’s root mode. The latency and frequency of VM exits are critical parameters for performance.
VMCS and VMCB: The Virtual Machine Control Structure (Intel) and Virtual Machine Control Block (AMD) are data structures that store the guest’s architectural state, control the behavior of certain instructions, and define exit conditions.
Nested Virtualization: Running hypervisors within hypervisors necessitates careful management of multiple layers of privilege and state encapsulation to ensure that inner guests are unaware of the outer hypervisor’s interventions.
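To make the VM exit idea concrete, here is a toy Python model rather than real VT-x or AMD-V code. The instruction names, the `SENSITIVE` set, and the `run_guest` function are all invented for illustration; the only point is that ordinary instructions run without the hypervisor, while sensitive ones hand control to it.

```python
# Toy model of hardware assisted execution. All names are hypothetical.
SENSITIVE = {"out", "hlt", "mov_cr3"}  # instructions that force a VM exit


def run_guest(program):
    """Run guest instructions and record a VM exit for each sensitive one."""
    exits = []    # (index, instruction) pairs handled by the hypervisor
    executed = 0  # instructions that ran directly, with no hypervisor involvement
    for index, instr in enumerate(program):
        if instr in SENSITIVE:
            # "VM exit": the hypervisor emulates the instruction, then
            # resumes the guest ("VM entry") unless the guest halted.
            exits.append((index, instr))
            if instr == "hlt":
                break
        else:
            executed += 1
    return executed, exits


executed, exits = run_guest(["add", "mov", "out", "add", "mov_cr3", "hlt", "add"])
print(executed, exits)
```

The fewer entries end up in `exits` for a given workload, the less time the CPU spends switching between guest and hypervisor, which is why exit frequency matters so much for performance.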
Memory Virtualization
Memory virtualization is a complex technical challenge involving mapping guest physical addresses to host physical addresses. Two principal approaches have emerged:
Shadow Page Tables
The hypervisor maintains shadow copies of the guest’s page tables in the host address space. Every memory access by the guest is translated using these tables, which incurs overhead because guest page table updates require hypervisor intervention to synchronize the shadow copy.
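A minimal sketch of the shadow paging idea, assuming single level tables that map page numbers only. The `ShadowMMU` class and its methods are hypothetical names, not part of any hypervisor; real implementations track multi level tables and write protect guest page table pages to catch updates.

```python
# Toy shadow paging. Page numbers only; every structure here is hypothetical.
class ShadowMMU:
    def __init__(self, guest_phys_to_host):
        self.gpa_to_hpa = dict(guest_phys_to_host)  # hypervisor owned mapping
        self.guest_pt = {}    # guest virtual page -> guest physical page
        self.shadow_pt = {}   # guest virtual page -> host physical page
        self.sync_count = 0   # how often the hypervisor had to intervene

    def guest_map(self, gva, gpa):
        """The guest updates its page table; the write traps to the hypervisor."""
        self.guest_pt[gva] = gpa
        # The hypervisor resynchronizes only the entry that changed.
        self.shadow_pt[gva] = self.gpa_to_hpa[gpa]
        self.sync_count += 1

    def translate(self, gva):
        """The hardware walks the shadow table, so a lookup is a single step."""
        return self.shadow_pt[gva]


mmu = ShadowMMU({0: 100, 1: 101, 2: 205})
mmu.guest_map(gva=7, gpa=2)
mmu.guest_map(gva=8, gpa=0)
mmu.guest_map(gva=7, gpa=1)  # remapping a page costs another synchronization
print(mmu.translate(7), mmu.translate(8), mmu.sync_count)
```

Lookups are cheap, but `sync_count` grows with every guest page table write, which is the overhead the paragraph above describes.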
Nested Paging
Modern hardware provides a second level of address translation, called Extended Page Tables (EPT) on Intel and Nested Page Tables (NPT) on AMD. The guest OS manages its own page tables, while the hypervisor maintains a separate level of indirection that maps guest physical memory to host memory. This hardware assisted approach avoids the costly VM exits that shadow paging needs on guest page table updates and reduces the need for full shadow page table emulation. The trade off is a longer page walk on a TLB (Translation Lookaside Buffer) miss, so the cost of TLB misses and flushes and the granularity of page table updates remain key design concerns, demanding careful trade offs between memory isolation and performance.
Additionally, techniques like ballooning and memory overcommitment provide mechanisms to reallocate physical memory dynamically among guests, making memory virtualization a critical component in multi tenant environments.
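The two stage translation behind nested paging can be sketched in a few lines of Python. This is a simplification under stated assumptions: single level tables keyed by page number and 4 KiB pages, with `translate` and `nested_translate` as hypothetical helper names. Real hardware walks multi level tables and also has to translate each guest page table access through the second stage.

```python
PAGE_SIZE = 4096  # 4 KiB pages, the common x86 base page size


def translate(address, page_table):
    """Translate an address through a single level page number mapping."""
    page, offset = divmod(address, PAGE_SIZE)
    return page_table[page] * PAGE_SIZE + offset


def nested_translate(gva, guest_pt, nested_pt):
    """Two stage walk: guest virtual -> guest physical -> host physical."""
    gpa = translate(gva, guest_pt)    # stage 1: guest managed page table
    return translate(gpa, nested_pt)  # stage 2: hypervisor managed EPT/NPT


guest_pt = {0: 3, 1: 5}      # guest virtual page -> guest physical page
nested_pt = {3: 40, 5: 12}   # guest physical page -> host physical page

hpa = nested_translate(0x1234, guest_pt, nested_pt)
print(hex(hpa))
```

The guest can rewrite `guest_pt` freely without involving the hypervisor, because isolation is enforced entirely by `nested_pt`.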
I/O Virtualization
I/O intensive operations in virtualized environments involve simulating a range of peripheral devices and techniques including:
Emulated Devices
The hypervisor intercepts the guest’s I/O instructions by trapping them and emulates device responses in software. Although flexible, this method introduces significant latency compared to native hardware access.
Paravirtualized Drivers
Modified drivers within guest OSes communicate directly with the hypervisor to offload device related tasks. VirtIO is a widely adopted standard in Linux based hypervisors that minimizes overhead by reducing the number of context switches and data copies.
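The batching benefit can be illustrated with a toy shared queue. This is not the VirtIO ring layout; `SharedQueue`, `submit`, and `notify` are hypothetical names, and the sketch only models the idea that many requests can be queued in shared memory and handed over with a single guest to host notification.

```python
from collections import deque


class SharedQueue:
    """Hypothetical stand-in for a ring shared by a guest driver and the host."""

    def __init__(self):
        self.pending = deque()   # requests written by the guest driver
        self.completed = []      # responses produced by the host backend
        self.notifications = 0   # each one models a guest to host transition

    def submit(self, request):
        # Queuing a request is a plain memory write: no transition needed.
        self.pending.append(request)

    def notify(self):
        # One notification lets the host drain every queued request.
        self.notifications += 1
        while self.pending:
            request = self.pending.popleft()
            self.completed.append(f"done:{request}")


queue = SharedQueue()
for block in ("read-0", "read-1", "read-2"):
    queue.submit(block)
queue.notify()
print(queue.notifications, queue.completed)
```

An emulated device would typically trap on each register access instead, so the same three requests would cost several transitions rather than one.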
Direct Device Assignment & SR-IOV
Techniques such as PCI passthrough enable a guest to access physical hardware devices directly, bypassing the hypervisor’s emulation layer. Single Root I/O Virtualization (SR-IOV) allows a physical device to expose multiple virtual functions, each of which can be directly assigned to separate guests. Managing DMA remapping and ensuring interrupt isolation are non trivial challenges in this domain.
CPU Scheduling & Context Switching
Hypervisors must intelligently schedule vCPUs (virtual CPUs) on physical CPUs. They achieve this with a combination of:
Scheduling Policies: The hypervisor must orchestrate the mapping between potentially oversubscribed vCPUs and a finite number of physical CPUs, balancing fairness, real time constraints, and performance isolation.
Context Switching: Both VM exits/entries due to sensitive instructions and preemptive scheduling require efficient state switching.
Core Pinning: In high performance environments, hypervisors may support CPU pinning to reduce cache misses and lower context switch overhead, ensuring predictable scheduling behavior for latency sensitive applications.
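The interplay of oversubscription and pinning can be sketched with a small round robin scheduler. This is an illustrative model only: the `schedule` function, its parameters, and the vCPU names are assumptions, and production schedulers weigh priorities, credits, and cache topology that are ignored here.

```python
def schedule(vcpus, num_pcpus, time_slices, pinning=None):
    """Round robin vCPUs over physical CPUs; pinned vCPUs only run on their core.

    Returns one dict per time slice mapping pCPU index -> vCPU name.
    """
    pinning = pinning or {}
    queue = list(vcpus)
    timeline = []
    for _ in range(time_slices):
        assignment = {}          # pCPU index -> vCPU name for this slice
        ran, waited = [], []
        for vcpu in queue:
            if vcpu in pinning:
                candidates = [pinning[vcpu]]    # pinned: exactly one allowed core
            else:
                candidates = range(num_pcpus)   # unpinned: any core
            free = [p for p in candidates if p not in assignment]
            if free:
                assignment[free[0]] = vcpu
                ran.append(vcpu)
            else:
                waited.append(vcpu)
        queue = waited + ran     # vCPUs that waited go first in the next slice
        timeline.append(assignment)
    return timeline


timeline = schedule(
    ["vm1-vcpu0", "vm2-vcpu0", "vm3-vcpu0"],
    num_pcpus=2,
    time_slices=3,
    pinning={"vm1-vcpu0": 0},
)
for slice_number, assignment in enumerate(timeline):
    print(slice_number, assignment)
```

With three vCPUs on two cores, one vCPU waits in every slice. The pinned vCPU never migrates, which keeps its cache warm, but it can also lose its turn when another vCPU already occupies its core.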
Additional Reading
Hypervisors stand at the intersection of hardware design, operating systems theory, and security engineering. This is only a newsletter article, mind you, so I can’t go into every integral detail of hypervisors. We didn’t cover security features like Secure Boot and TPM integration or security concerns like VM escapes or side channel attacks. We didn’t cover concepts like cache management or speculative buffers. We skipped lock free data structures and algorithms. We also missed out on NUMA (Non-Uniform Memory Access) architectures, memory balancing, cache locality strategies, and more.
You can write an entire book on hypervisors, like this issue’s sponsor, Hardware and Software Support for Virtualization!
Hardware and Software Support for Virtualization is an indispensable resource for anyone interested in exploring the transformative power of virtualization in modern computing. It’s a compelling exploration that bridges the rich history of virtualization theory with today's practical innovations in hypervisor design. This book illuminates how modern architectures integrate fundamental principles originally outlined by Popek and Goldberg four decades ago with real world advancements to create powerful, efficient environments on x86-64 and ARM systems. Seamlessly blending historical context and state of the art techniques, it dives deep into CPU, memory, and I/O virtualization and presents fascinating case studies on Linux/KVM, VMware, and Xen hypervisors. Whether you are an academic, systems engineer, or tech enthusiast, prepare to challenge your understanding of virtualization and emerge with a comprehensive perspective on its transformative impact in computing. Get your copy today!
Unboxing VirtualBox
Oracle VM VirtualBox is a powerful open source, cross platform virtualization hypervisor. It allows users to run multiple guest operating systems (OS) simultaneously on a single host machine. This final section delves into the technical intricacies of VirtualBox, exploring its architecture, key components, and how it interacts with the underlying hardware.
Install VirtualBox
Architectural Overview
VirtualBox’s modular architecture plays a crucial role in delivering its diverse functionality. Understanding how each component functions is key to grasping the inner workings of the system. The architecture is broadly divided into the following elements:
Main API
The Main Application Programming Interface (API) is the foundation for user interaction. It serves as a unified endpoint for commands issued via the graphical user interface (GUI), command line interface (CLI), or web services. The API abstracts the complexity behind VM operations and interactions, ensuring that higher level applications and script routines can communicate with the core engine predictably.
VirtualBox Manager
Acting as the central control entity, the VirtualBox Manager facilitates tasks such as the creation, deletion, configuration, monitoring, and execution of virtual machines (VMs). Its management capabilities include snapshot handling, device assignments, and dynamic adjustments to VM configurations. This component simplifies the administrative overhead for end-users and administrators.
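The same create, configure, and run tasks can be scripted through VBoxManage, VirtualBox's command line front end. The sketch below only builds the argument lists and prints them, so it runs without VirtualBox installed. The helper name `create_vm_commands`, the VM name `demo-vm`, and the default sizes are assumptions chosen for illustration; option spellings can differ between releases, so check `VBoxManage help` on your version before executing anything.

```python
import shlex


def create_vm_commands(name, ostype="Ubuntu_64", memory_mb=2048, cpus=2):
    """Return VBoxManage argv lists that create, size, and boot a VM headless."""
    return [
        # Create the VM definition and register it with VirtualBox.
        ["VBoxManage", "createvm", "--name", name, "--ostype", ostype, "--register"],
        # Assign memory (in MB) and a vCPU count.
        ["VBoxManage", "modifyvm", name, "--memory", str(memory_mb), "--cpus", str(cpus)],
        # Start the VM without opening a GUI window.
        ["VBoxManage", "startvm", name, "--type", "headless"],
    ]


commands = create_vm_commands("demo-vm")
for argv in commands:
    print(shlex.join(argv))
```

To actually run a command, pass one of the lists to `subprocess.run(argv, check=True)` on a host where VBoxManage is on the PATH.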
Virtual Machine Monitor (VMM)
The VMM is the engine that allows guest operating systems to run. By operating in ring 0 (i.e., kernel mode) on the host machine, the VMM directly communicates with the CPU and emulates the required hardware for the guest OS. Its core responsibilities include task scheduling, interrupt handling, and resource allocation. Within this context, the VMM ensures that sensitive instructions from guest code are correctly intercepted and managed, maintaining the delicate balance between performance and system isolation.
Device Emulation
To create an illusion of physical hardware, VirtualBox implements comprehensive device emulation. This component mimics numerous hardware devices, such as virtual hard disks, network adapters, audio controllers, USB devices, and more. The guest operating system interacts with these emulated devices as if they were physical hardware, thus enabling compatibility with a wide range of software without requiring modifications.
Host Operating System Interface
This interface is responsible for bridging the gap between VirtualBox’s internal operations and the actual physical hardware managed by the host OS. It transparently handles resource allocation for the CPU, memory, and I/O operations. By managing low level interactions and ensuring proper utilization of physical resources, the host OS interface underpins the performance and reliability of the virtualization stack.
Guest Additions
To boost performance and provide integrations between the host and guest environments, VirtualBox offers Guest Additions, a set of drivers and utilities installed within the guest OS. These components enhance the guest’s video performance, provide shared folder support, enable seamless mouse integration, and simplify dynamic screen resizing. The utilization of Guest Additions often means improved responsiveness and a more native operating experience within the virtualized environment.
Virtualization Techniques
VirtualBox employs a rich mix of virtualization techniques to provide speed, isolation, and flexibility including:
Hardware Virtualization (HV)
Modern CPUs from Intel (VT-x) and AMD (AMD-V) include dedicated virtualization extensions that allow guest operating systems to run in isolated hardware environments. VirtualBox leverages these features for near native performance. Essentially, when hardware support is available, guest code can execute directly on the CPU while the hypervisor (the VMM) oversees critical operations to ensure isolation and safe context switching.
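On a Linux host, one quick way to see whether the CPU advertises these extensions is to look for the `vmx` (Intel VT-x) or `svm` (AMD-V) flag in `/proc/cpuinfo`. The function below is a small sketch that parses that text; `detect_hw_virtualization` is a hypothetical helper name and the sample string is made up. The flag only shows that the CPU supports the feature, and firmware settings may still leave it disabled.

```python
def detect_hw_virtualization(cpuinfo_text):
    """Return 'VT-x', 'AMD-V', or None based on Linux /proc/cpuinfo flags."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # Everything after the first colon is a space separated flag list.
            flags = line.partition(":")[2].split()
            if "vmx" in flags:
                return "VT-x"
            if "svm" in flags:
                return "AMD-V"
    return None


# Made-up sample in the shape of /proc/cpuinfo output.
sample = "processor\t: 0\nflags\t\t: fpu vme de pse msr vmx sse2\n"
print(detect_hw_virtualization(sample))

# On a real Linux host (assumption: the file exists and is readable):
# with open("/proc/cpuinfo") as handle:
#     print(detect_hw_virtualization(handle.read()))
```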
Paravirtualization
Even with hardware assisted virtualization, there remains overhead involved in certain guest host communications. Paravirtualization addresses this by offering specialized drivers often installed as part of the Guest Additions that replace inefficient emulated device handling. These paravirtualized drivers optimize kernel to kernel interactions between the guest and host, delivering improved I/O performance and reducing latency in operations.
Software Virtualization
In the absence of hardware virtualization support, VirtualBox releases before 6.1 resorted to software virtualization. In this mode, the VMM intercepts sensitive instructions issued by the guest OS and emulates their execution in software. While this emulation provides extensive compatibility across a multitude of systems and applications, it naturally results in lower performance compared to hardware-assisted approaches. Version 6.1 dropped this mode, so current releases require a CPU with VT-x or AMD-V.
Memory Management
VirtualBox’s memory management strategies ensure that each VM receives its required memory while maintaining overall system stability. Some of the key techniques include:
Shadow Page Tables
A fundamental challenge in virtualization is providing the guest OS with the illusion of having full control over physical memory. Shadow page tables map the guest’s physical memory requests to real host memory locations, allowing the guest OS to work as if it were interacting with actual hardware. By maintaining these mappings, VirtualBox ensures efficient translation and isolation between guest and host memory spaces.
Page Table Synchronization
Since the guest OS can frequently alter its internal page tables, efficient synchronization mechanisms are crucial. VirtualBox optimizes performance by synchronizing only the modified parts of the shadow page tables, thereby reducing overhead and helping maintain high execution speed even in dynamic memory usage scenarios.
Nested Paging
Nested paging (known as Extended Page Tables on Intel and Rapid Virtualization Indexing on AMD) moves the translation from guest physical to host physical addresses into hardware. The guest keeps control of its own page tables, and the CPU applies a second, hypervisor managed set of tables on top of them, which removes most of the shadow page table bookkeeping. It should not be confused with nested virtualization, whereby a VM running on VirtualBox itself hosts a hypervisor to run further guest VMs; that is a separate feature, although it benefits from the same hardware assistance.
Memory Ballooning
Resource allocation between host and guest can be challenging, especially when the host operates under memory pressure. Through a process known as ballooning, VirtualBox can “inflate” a virtual balloon driver within the guest. This driver artificially consumes memory within the guest, marking it as available for the host to reclaim when needed. Ballooning helps balance memory allocation dynamically, ensuring that multiple VMs can coexist efficiently even on resource constrained hosts.
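The accounting behind ballooning can be sketched with two small classes. This is a toy model, not VirtualBox's implementation: `Guest`, `Host`, `inflate`, and `deflate` are hypothetical names, and memory is tracked as plain megabyte counters.

```python
class Guest:
    def __init__(self, name, memory_mb):
        self.name = name
        self.memory_mb = memory_mb  # memory assigned to the VM
        self.balloon_mb = 0         # memory currently held by the balloon driver

    @property
    def usable_mb(self):
        return self.memory_mb - self.balloon_mb


class Host:
    def __init__(self, free_mb):
        self.free_mb = free_mb

    def inflate(self, guest, mb):
        """Grow the guest's balloon; the host reclaims what the balloon holds."""
        granted = min(mb, guest.usable_mb)
        guest.balloon_mb += granted
        self.free_mb += granted
        return granted

    def deflate(self, guest, mb):
        """Shrink the balloon, handing memory back to the guest."""
        returned = min(mb, guest.balloon_mb, self.free_mb)
        guest.balloon_mb -= returned
        self.free_mb -= returned
        return returned


host = Host(free_mb=256)
guest = Guest("demo-vm", memory_mb=2048)
host.inflate(guest, 512)
print(guest.usable_mb, host.free_mb)
```

In VirtualBox itself the balloon size of a running VM is set with `VBoxManage controlvm <vm> guestmemoryballoon <MB>`, which requires the Guest Additions to be installed in the guest.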
Storage Management
VirtualBox supports a wide range of disk formats to provide flexible and efficient storage management including:
VDI (Virtual Disk Image)
As VirtualBox’s native disk format, VDI offers advanced features like snapshot management and dynamic resizing of storage. This flexibility makes it one of the preferred formats for achieving efficient disk space usage and smooth rollback operations during development or testing.
VMDK (Virtual Machine Disk)
Originally designed for VMware, the VMDK format is widely adopted and allows for cross-compatibility among different virtualization platforms. This is especially advantageous in heterogeneous infrastructures where VMs may need to migrate between different hypervisors.
VHD (Virtual Hard Disk)
Developed by Microsoft, the VHD format is common in environments utilizing Hyper-V or other Microsoft virtualization technologies. VirtualBox’s support for VHD ensures interoperability in mixed technology environments.
HDD (Parallels Hard Disk)
Used primarily in Parallels Desktop, the HDD format is another example of VirtualBox’s versatility. Through support of multiple disk image types, VirtualBox allows users to import, export, and share virtual environments with minimal friction.
Raw Disk Access
For advanced users and performance critical operations, raw disk access allows VirtualBox to interact directly with physical disk partitions or entire drives. Bypassing the virtual disk layer can improve performance, but it also removes a layer of protection: a misconfigured guest can overwrite data on the host’s disks, so access must be managed carefully.
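Creating and converting disk images is typically done with VBoxManage's `createmedium` and `clonemedium` subcommands. The sketch below only assembles the argument lists, so nothing touches the disk; the helper names and file names are assumptions for illustration, and the format check is limited to the three formats VirtualBox can create directly.

```python
CREATABLE_FORMATS = {"VDI", "VMDK", "VHD"}


def _check_format(disk_format):
    if disk_format not in CREATABLE_FORMATS:
        raise ValueError(f"unsupported disk format: {disk_format}")


def create_disk_command(path, size_mb, disk_format="VDI"):
    """Build the argv that creates a new virtual disk of size_mb megabytes."""
    _check_format(disk_format)
    return ["VBoxManage", "createmedium", "disk", "--filename", path,
            "--size", str(size_mb), "--format", disk_format]


def convert_disk_command(source, target, disk_format="VDI"):
    """Build the argv that clones a disk image into another format."""
    _check_format(disk_format)
    return ["VBoxManage", "clonemedium", "disk", source, target,
            "--format", disk_format]


print(create_disk_command("demo.vdi", 10240))
print(convert_disk_command("imported.vmdk", "converted.vdi"))
```

Cloning into VDI is a common way to bring a VMDK or VHD image from another hypervisor into VirtualBox's native format.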
Networking
Networking capabilities in VirtualBox are equally robust, offering a variety of modes including:
NAT (Network Address Translation)
In the NAT mode, the guest operates behind the host’s IP address while accessing external networks via a virtual NAT router. This is often the default mode for users, as it requires minimal configuration and provides an additional layer of separation between the guest and external networks.
Bridged Networking
In contrast to NAT, bridged networking connects the guest directly to the physical network. The guest receives its own IP address and appears as an independent device on the network. This mode is particularly useful for scenarios requiring direct network integration, such as hosting servers or testing network configurations.
Internal Networking
For environments where a group of VMs needs to communicate in a contained space, internal networking creates a private network among VMs on a single host. This allows for the establishment of test networks, multi-tier applications, or distributed systems without exposing them to the host’s external network.
Host only Networking
Host only networking strikes a balance between isolation and connectivity by creating a network that is shared solely between the host and its VMs. It is ideal for development scenarios where communication between the host and the guest is required, while keeping the guest isolated from the external world.
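The four modes above map onto `VBoxManage modifyvm` options. The following sketch builds the corresponding argument lists without running them. The helper `nic_command` and the adapter and network names (`eth0`, `labnet`, `vboxnet0`) are assumptions for illustration, and the option spellings follow the long-standing `--nic<N>` style, so check `VBoxManage modifyvm` help for the exact names in your release.

```python
# Extra attachment option for each mode; NAT needs no further argument.
ATTACH_OPTION = {
    "nat": None,
    "bridged": "--bridgeadapter",
    "intnet": "--intnet",
    "hostonly": "--hostonlyadapter",
}


def nic_command(vm, mode, slot=1, attach_to=None):
    """Build the VBoxManage argv that sets one adapter's networking mode."""
    if mode not in ATTACH_OPTION:
        raise ValueError(f"unsupported mode: {mode}")
    argv = ["VBoxManage", "modifyvm", vm, f"--nic{slot}", mode]
    option = ATTACH_OPTION[mode]
    if option is not None:
        if attach_to is None:
            raise ValueError(f"{mode} mode needs attach_to")
        # The adapter slot number is appended to the option name.
        argv += [f"{option}{slot}", attach_to]
    return argv


print(nic_command("demo-vm", "nat"))
print(nic_command("demo-vm", "bridged", attach_to="eth0"))
print(nic_command("demo-vm", "intnet", slot=2, attach_to="labnet"))
print(nic_command("demo-vm", "hostonly", attach_to="vboxnet0"))
```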
Inter-Process Communication (IPC)
Smooth communication between VirtualBox’s internal components is vital for efficient operations. Several IPC mechanisms facilitate this including:
COM (Component Object Model)
COM is instrumental in communication between the Main API and the VirtualBox Manager. The Main API is exposed through Microsoft COM on Windows hosts and through XPCOM on other hosts. It standardizes the transfer of commands and responses, ensuring that user initiated actions are reliably propagated through the system.
Internal APIs
Beyond COM, VirtualBox employs numerous internal APIs to enable messaging and coordination between various components like the VirtualBox Manager and the VMM. These APIs are designed for high performance, ensuring that control signals and data are processed with minimal latency.
VRDP (VirtualBox Remote Display Protocol)
VRDP extends VirtualBox’s capabilities by providing remote access to the guest OS’s graphical interface. By exposing a remote desktop interface, administrators and users can interact with VMs even when they are not physically present at the host workstation.
Security Features
Security within a virtualized environment is of paramount importance, and VirtualBox incorporates several layers of protection including:
Isolation
VirtualBox is architected to ensure strong isolation between guest operating systems. This prevents a breach in one VM from affecting another or compromising the host. Isolation is maintained through hardware-assisted virtualization, memory protection mechanisms, and separate execution contexts.
Access Control
Access to virtual machines is tightly controlled via user authentication and role based authorization. By enforcing strong credentials and policy based management, VirtualBox minimizes the risk of unauthorized access and misuse of virtualized resources.
Sandboxing
The hypervisor’s sandboxing feature limits the guest OS’s access to the underlying host resources. This layer of abstraction restricts operations that could potentially damage the host system or lead to unintended privilege escalation, ensuring that even if a guest is compromised, the impact on the host is minimized.
Conclusion
This article has offered a comprehensive exploration into the realm of virtualization and hypervisors, emphasizing the critical role they play in abstracting and efficiently managing physical resources. By examining the foundational concepts behind virtualization and virtual machines, we highlighted how hypervisors, whether bare metal or hosted, serve as the engine that powers modern computing environments. Using VirtualBox as our focal point, we unraveled its robust architecture, showcasing its main API, Virtual Machine Monitor, and device emulation capabilities, while also discussing its ability to harness hardware assisted virtualization and paravirtualization techniques to deliver optimal performance. Ultimately, this deep dive demystifies not only the inner workings of VirtualBox but also the core technologies behind hypervisors.