This post will detail the design of the Virtual Memory Management subsystem of the LμKOS kernel. You’ll note that this post follows along with the virtual memory manager design doc. That said, let’s begin!

First, Some Definitions

I’ll be using several terms and acronyms in this post that are not necessarily ubiquitous, so I want to define them up front to make the rest easy to follow.

Term | Acronym | Definition
Virtual Address | VA | An address which represents a resource in a computing system. A VA may or may not correspond to something physical (like RAM).
Physical Address | PA | An address which maps to a physical resource in the computing system, like RAM or peripherals.
Address Space | AS | A representation of a range of VAs starting at zero (0) and running to some architecture-dependent upper bound. ASes may also have other metadata associated with them, such as a unique identifier.
Region | (none) | A mapping of a subrange of VAs to a subrange of PAs with a particular set of attributes.
Virtual Memory Manager | VMM | The subsystem responsible for managing the ASes which correspond to the processes in the system.
Memory Management Unit | MMU | A hardware device in a computing system which can be programmed to translate VAs to PAs.
Table 1. Definitions.

Why do we need a Virtual Memory Manager?

A good OS needs a virtual memory management capability for a few key reasons.

Keep kernelspace and userspace separate

We want to separate kernel code and user code so that one does not interfere with the other. Using virtual memory via an MMU allows us to run the kernel in one VA range and all userspace code in another. This improves kernel reliability because the kernel won’t crash if a userspace process goes belly up, and it protects other userspace processes, which won’t be affected if an adjacent process dies. Separation of the two spaces is also considered good cyber hygiene. Lastly, any modern OS, and even many RTOSes, supports this feature because it is widely understood to be advantageous for development and deployment to have kernel code and user code “live” in different address ranges.

Keep userspace applications separate

As mentioned in the previous paragraph, running each process in its own virtual memory space protects it from the world and the world from it. This is a huge bonus for reliability and security.

Provide a common application layout

This facet is often underappreciated, but if your processes all think they live in the whole of “userspace”, then all applications can be linked to that same set of addresses. This makes application portability a cinch, and it makes the job of the application loader easier as well.

The LμKOS Virtual Memory Manager

The LμKOS Virtual Memory Manager (which I’ll refer to as the VMM from here on out) provides a framework to achieve the aforementioned goals. The kernel interface to the VMM consists of only five (yes, 5!) functions. Their prototypes are listed below.


/**
 * initialize the virtual memory manager
 */
void vmm_init(void);

/**
 * create an empty address space
 * @return the new address space
 */
address_space_t *vmm_address_space_create(void);

/**
 * create a region in this address space of size (len) and properties (prop) and return the resulting VA in that address space
 * use this when we don't care where the VA is going to be
 * @param as		the address space
 * @param len		the len in bytes
 * @param prop		the props
 * @param vadest	the returned VA
 * @return 0 on OK != 0 on failure
 */
int vmm_address_space_region_create_auto(address_space_t *as, size_t len, address_space_region_prop_t prop, void **vadest);

/**
 * create a region in this address space of size (len) and properties (prop) at the specified VA
 * use this when we DO care where the VA is going to be
 * @param as		the address space
 * @param len		the len in bytes
 * @param vadest	where the base of the regions should be
 * @param prop		the props
 * @return 0 on OK != 0 on failure
 */
int vmm_address_space_region_create(address_space_t *as, void *vadest, size_t len, address_space_region_prop_t prop);

/**
 * copy some data from kernel space into this AS
 * @param as			the address space
 * @param vakernel		the VA in kernel space
 * @param vadest		the VA in the AS space
 * @param len			the number of bytes
 */
void vmm_address_space_copy_in(address_space_t *as, void *vakernel, void *vadest, size_t len);

The vmm_init Function

This function is as straightforward as it sounds. The startup code calls it once to perform any one-time initialization. At present, the function body is empty, but it’s there just in case a need arises in a future update.

The vmm_address_space_create Function

This function creates an Address Space (AS) and returns a pointer to it. The pointer is to an address_space_t type which is defined as:

typedef struct
{
	void *arch_context; /**< this is the architecture specific context, the kernel is agnostic to this */
	size_t id;			/**< the unique identifier for this address space */
	size_t status;		/**< currently unused; reserved for future state tracking */

} address_space_t;

The arch_context field is simply a pointer to be filled in by a call to the architecture specific code providing MMU capability to the kernel. This context “cookie” is passed to the architecture specific code whenever an operation needs to be done on a specific Address Space. The id is simply a unique identifier for the AS. The status is currently unused, although it may be used in the future to mark dead, inactive, or faulted ASes.
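
To make that concrete, here is a minimal sketch of what vmm_address_space_create could look like internally. The kmalloc call and the id counter are assumptions for illustration; the actual LμKOS implementation may differ.

/* Illustrative sketch only: kmalloc and the id counter are assumed. */
address_space_t *vmm_address_space_create(void)
{
    static size_t next_id = 0;                   /* assumed source of unique ids */
    address_space_t *as = kmalloc(sizeof(*as));  /* hypothetical kernel allocator */

    if (as == NULL)
    {
        return NULL;
    }

    as->id = next_id++;
    as->status = 0;
    /* let the architecture layer build its translation tables and hand
     * back an opaque cookie the generic code never looks inside */
    as->arch_context = vmm_arch_context_create(as);

    return as;
}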

The vmm_address_space_region_create_auto Function

This function is called by the kernel when we need to allocate a Region in an Address Space but don’t care where the resulting Virtual Address ends up. This is useful when, for example, we want to allocate some RAM for a process to use as heap memory. User processes in LμKOS use this function indirectly via the syscall_memory_alloc system call. We supply a parameter describing how the region should be set up in this address space, such as read-only or read-execute.
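
To make the calling convention clear, here is a hedged example of how a heap-style allocation might use this function. The property constant name is an assumption, since the actual values of address_space_region_prop_t aren’t shown here.

/* Illustration only: allocate len bytes of read-write memory somewhere
 * in the given address space. The property constant name is assumed. */
int example_user_heap_alloc(address_space_t *as, size_t len, void **user_va)
{
    /* we don't care where the region lands, so let the VMM pick the VA */
    return vmm_address_space_region_create_auto(as, len,
                                                ADDRESS_SPACE_REGION_PROP_READ_WRITE, /* assumed name */
                                                user_va);
}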

The vmm_address_space_region_create Function

This function performs the same operations as vmm_address_space_region_create_auto, but we use this variant when we DO care where the resulting Virtual Address ends up, so the OS can control where the RX executable (.text), RW zero-initialized (.bss), and RW initialized (.data) sections begin and end. Like vmm_address_space_region_create_auto, it accepts an argument determining the properties of the region in the address space.
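
As a rough illustration of that use case, a loader placing segments at the addresses an application was linked against might look something like the sketch below. The addresses, lengths, and property constant names are made up for the example.

/* Illustration only: carve out .text, .data, and .bss at link-time
 * addresses. Addresses, lengths, and property names are assumptions. */
int example_load_segments(address_space_t *as)
{
    int rc;

    /* read-execute code at the VA the binary was linked for */
    rc = vmm_address_space_region_create(as, (void *)0x00400000, 0x4000,
                                         ADDRESS_SPACE_REGION_PROP_READ_EXEC);  /* assumed name */
    if (rc != 0)
    {
        return rc;
    }

    /* read-write initialized data */
    rc = vmm_address_space_region_create(as, (void *)0x00600000, 0x1000,
                                         ADDRESS_SPACE_REGION_PROP_READ_WRITE); /* assumed name */
    if (rc != 0)
    {
        return rc;
    }

    /* read-write zero-initialized data (.bss) */
    return vmm_address_space_region_create(as, (void *)0x00601000, 0x2000,
                                           ADDRESS_SPACE_REGION_PROP_READ_WRITE); /* assumed name */
}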

Side Note on Region Creation

Because we can specify the properties of each Region for each Address Space, it is possible for us to create mappings in different Address Spaces to the same Physical Addresses with different permissions. This allows us to share memory in a controlled way. Consider this: one thread in AS #1 could have read-write privilege to a range of PAs, and the same range of PAs could be mapped as read-only in another thread of execution using AS #2.
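
Here is a sketch of that idea using the architecture hooks described later in this post: translate a VA in AS #1 to its PA, then map the same PA read-only into AS #2. PAGE_SIZE and the property constant name are assumptions for the example.

/* Illustration only: share one page of AS #1's memory into AS #2 read-only.
 * PAGE_SIZE and the property constant name are assumed. */
int example_share_page_readonly(address_space_t *as1, void *va_in_as1,
                                address_space_t *as2, void *va_in_as2)
{
    /* find the physical page backing the writer's VA */
    void *pa = vmm_arch_v2p(as1->arch_context, va_in_as1);

    /* map that same physical page into the reader's AS, but read-only */
    return vmm_arch_map(as2->arch_context,
                        ADDRESS_SPACE_REGION_PROP_READ_ONLY, /* assumed name */
                        va_in_as2, pa, PAGE_SIZE);           /* assumed constant */
}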

The vmm_address_space_copy_in Function

This function provides a service that looks simple but is deceptively complex. At its core, we’re just copying data from VAs in one Address Space (the kernel’s) to VAs in another Address Space. The problem is that memory which is contiguous in one context is not necessarily contiguous in the other: pages that are adjacent when mapped in the kernel context may not be adjacent in the user context. In other words, memory fragmentation can cause some interesting side effects. I’ll use a couple of diagrams to illustrate below.

Figure 1. Kernel View of Identity Mapped Range

In Figure 1, the kernel has a 1GiB page which identity maps the first gibibyte, so any address in the Virtual Address range 0x00000000-0x40000000 translates to exactly the same Physical Address. If this were the only view of the system, you could safely use memcpy to move data around, and when moving data within the kernel context you can do exactly that. When moving between kernel space and userspace (and vice versa), things get a little more complicated. See Figure 2.

Figure 2. User View of NON-Identity Mapped Range

Figure 2 shows how a contiguous range of VAs can be mapped to a non-contiguous range of PAs. If we wanted to copy 16kiB of data from the user’s Address Space starting at VA 0x0020000 into kernel space, the first step is getting the PA of the data represented by the VA 0x0020000 in this Address Space. In LμKOS we make a call to vmm_arch_v2p, which performs virtual-to-physical address translation. This gives us the PA where the data represented by that VA actually lives in RAM. Even with this information we can’t use a single memcpy! Why? Because while the first page (4kiB) of data would be correct, the next 4kiB is mapped at a different PA!

In short, our copy-in and copy-out between kernel space and user space has to be done very carefully: it needs to be page-boundary aware and handle this fragmentation gracefully, as sketched below.
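
Here is a simplified sketch of that page-by-page approach. It assumes 4kiB pages, a PAGE_SIZE constant, and that the kernel’s identity mapping lets it dereference a PA directly; none of that is guaranteed to match the actual LμKOS implementation.

/* Simplified sketch of a page-boundary-aware copy-in. Assumes 4kiB pages,
 * a PAGE_SIZE constant, and that the kernel can dereference PAs directly
 * through its identity mapping. The real implementation may differ. */
void example_copy_in(address_space_t *as, void *vakernel, void *vadest, size_t len)
{
    uint8_t *src = vakernel;    /* source bytes in kernel space */
    uint8_t *dst_va = vadest;   /* destination VA in the target AS */

    while (len > 0)
    {
        /* translate the destination VA for this chunk into a PA */
        uint8_t *dst_pa = vmm_arch_v2p(as->arch_context, dst_va);

        /* only copy up to the next page boundary, because the following
         * page may live at a completely unrelated PA */
        size_t to_boundary = PAGE_SIZE - ((uintptr_t)dst_va & (PAGE_SIZE - 1));
        size_t chunk = (len < to_boundary) ? len : to_boundary;

        memcpy(dst_pa, src, chunk); /* OK only because of the identity map */

        src    += chunk;
        dst_va += chunk;
        len    -= chunk;
    }
}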

Architecture Specific Code

Intrinsically, virtual memory management depends on the MMU available, so there is a dependence on architecture specific functions as well. These are listed below.

/**
 * create the architecture specific context (translation tables, etc.) and return a pointer in kernel VA space to it
 */
void *vmm_arch_context_create(address_space_t *as);

/**
 * get a range of virtual address space in the given context
 * that will accommodate the given size
 */
void *vmm_arch_get_free_va_range(void *context, size_t len);

/**
 * allocate and map some RAM starting with the given VA and going for len bytes
 * @param ctx		the arch context
 * @param props		the region's properties
 * @param va		the starting VA
 * @param len		the number of bytes to alloc and map
 * @return non-zero if problem
 */
int vmm_arch_alloc_map(void *ctx, address_space_region_prop_t props, void *va, size_t len);

/**
 * map a specific VA to a specific PA of len
 * @return non-zero if problem
 */
int vmm_arch_map(void *ctx, address_space_region_prop_t props, void *va, void *pa, size_t len);


/** 
 * check the alignment of the section based on length
 * @param va		the VA to check
 * @param len		the segment size
 * @return 0 if OK, non-zero otherwise
 */
int vmm_arch_align_check(void *va, size_t len);

/**
 * get the PA associated with this VA
 */
void *vmm_arch_v2p(void *ctx, void *va);

These vmm_arch_* functions perform the essential translation work that the kernel-level manager cannot, since it doesn’t know the specifics of each MMU. The implementing code for these calls lives down in /implementation/src/arch/ in the GitHub repo, so go check that out.
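
To see how the generic layer and the architecture layer fit together, here is a plausible sketch of how vmm_address_space_region_create_auto could be built on top of two of these hooks. Whether the real code does exactly this, I’ll leave to the repo.

/* Plausible layering sketch only: the generic "auto" region creator
 * delegates VA selection and mapping to the arch layer. */
int vmm_address_space_region_create_auto(address_space_t *as, size_t len,
                                         address_space_region_prop_t prop,
                                         void **vadest)
{
    /* ask the arch code for a free VA range big enough for len bytes */
    void *va = vmm_arch_get_free_va_range(as->arch_context, len);
    if (va == NULL)
    {
        return -1; /* no room in this address space */
    }

    /* allocate physical pages and wire up the translation tables */
    if (vmm_arch_alloc_map(as->arch_context, prop, va, len) != 0)
    {
        return -1;
    }

    *vadest = va;
    return 0;
}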

Wrap-Up

The VMM in LμKOS contains the bare-minimum functionality needed to manage multiple address spaces from a single interface, allowing the OS to abstract memory management away from the other concerns of a multi-process system.