Date: April 21, 2025

Topic: LRVM

Recall

To make persistence efficient, we use log segments (logsegs), which record changes to persistent data structs in virtual memory and write them to disk contiguously. This keeps the in-memory and on-disk versions consistent while reducing the number of I/O operations to the disk.

Notes

Persistence

Why: OS subsystems need persistence
How: Done by making virtual memory persistent
- All data structs in virtual mem are then persistent
- Subsystems do not need to worry about flushing changes to disk
- Recovering from crashes is easy since the data structs are persistent
Who uses it: Subsystem designers, provided it is performant
How to make it efficient: Use persistent logs to record changes to virtual memory
- Without a log, whenever we change the data structs in virtual memory, their on-disk representations have to be updated in place
- Data structs are spread all over the VA space, so this results in many separate I/O ops, which is inefficient (seek and rotational latency on the disk)

Making Persistence Efficient

Use logsegs: when changes are made to persistent data structs in virt. mem., write the changes to a logseg
The logseg can be written contiguously, and the recorded changes are then stored on disk
The persistent data structs would otherwise require random writes to the disk, but we issue them in a sequential way
- Since the changes are recorded as log records in log segments and the log is committed to disk, random writes (inefficient) are converted to sequential writes (efficient)
- No more individual I/O operations per data struct: the number of I/O ops drops, and contiguous writing solves the random-write problem
These optimizations make persistence more efficient
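
The random-to-sequential conversion above can be illustrated with a minimal sketch. It assumes a made-up record layout (8-byte offset, 4-byte length, then the new bytes); `encode_record`, `build_log` and `decode_log` are hypothetical names for illustration, not LRVM's actual log format.

```python
import struct

# Hypothetical record header: 8-byte offset + 4-byte length, little-endian.
HEADER = struct.Struct("<QI")

def encode_record(offset: int, data: bytes) -> bytes:
    """Encode one change as header + new bytes."""
    return HEADER.pack(offset, len(data)) + data

def build_log(changes):
    """Pack scattered (offset, bytes) changes into one contiguous buffer."""
    return b"".join(encode_record(off, data) for off, data in changes)

def decode_log(buf: bytes):
    """Recover the (offset, bytes) changes from a contiguous log buffer."""
    changes, pos = [], 0
    while pos < len(buf):
        off, length = HEADER.unpack_from(buf, pos)
        pos += HEADER.size
        changes.append((off, buf[pos:pos + length]))
        pos += length
    return changes

# Three updates at widely separated offsets (three seeks if written in place)
# become a single contiguous buffer that can be appended in one write.
changes = [(4096, b"inode-7"), (1_048_576, b"bitmap"), (64, b"superblock")]
log = build_log(changes)
```

The offsets are kept in each record so that the changes can later be applied to their real on-disk locations, off the critical path.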

LRVM Purpose

Provides a persistent memory layer in support of system services
Used for recovery management: the log force that happens at the commit point is synchronous I/O, since the app has to wait for the I/O to complete before continuing
Because of this synchronous I/O, a commit is still considered heavyweight, even though the transaction semantics are considerably reduced and simplified
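
As a rough illustration of why the log force is synchronous, the sketch below appends records to a log file and blocks in `os.fsync` until the OS reports the data as written to the device. `force_log` and the file name `toy.log` are hypothetical; this is not LRVM's implementation, only the general pattern.

```python
import os
import tempfile

def force_log(path: str, records: bytes) -> None:
    """Append records to the log and block until the OS has synced them."""
    with open(path, "ab") as log:
        log.write(records)
        log.flush()              # push Python's buffer to the OS
        os.fsync(log.fileno())   # the caller waits here for the device

# Usage: each call returns only after its records have been synced.
log_path = os.path.join(tempfile.mkdtemp(), "toy.log")
force_log(log_path, b"record-1;")
force_log(log_path, b"record-2;")
```

The wait inside `os.fsync` is the cost the notes call heavyweight: the app makes no forward progress until it returns.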

Apps designate collections of persistent data structs called data segments (like on-disk inodes) and map them into selected regions of their virtual address space so any in-memory updates are reflected back to disk.

At startup, an app can map these segments, and unmapping can be done when no commits are pending.

Server Design

Not the entire address space of the server has to be persistent
The subsystem designer knows which data structs in their design need to be persistent
E.g., in a file system design, the inodes living on disk are what the designer wants to be persistent
- If an inode data struct ($M_1$) is mapped into a portion of the VA space, updates to $M_1$ have to be reflected in the backing store
$M_1$ and $M_2$ in the VA space of the server are in-memory versions of persistent data structs living on the disk
A collection of data structs that need to be persistent is a data segment
App designer may choose to use single or multiple data segments to correspond to persistent objects during execution
- An app needs to be able to map an external data segment to selected portions of its address space
- We help the application by explicitly mapping the regions of VA space to data segments living on the disk
- Hence, the apps manage their own persistence needs; we only provide the ability to specify external data segments to back persistent data structs
At startup, the app maps external data segments to selected portions of the VA space to create in-memory versions of data structs it needs to manipulate during execution
- The app can also choose to unmap portions of the VA space that are currently mapped
- This is possible only when no commits are pending for the in-memory data structs and their external data segments
Mapping between regions of the virt. mem. space and external data segments is one-to-one
- External data segments do not overlap in the regions of the VA space they occupy, making the design of reliable virt. mem. much simpler
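
The mapping rules above (one region per segment, no overlap, unmap only when no commits are pending) can be sketched as a small bookkeeping class. `ToyAddressSpace`, its methods, and the segment names are all hypothetical and only model the rules in these notes; the real interface may differ.

```python
class ToyAddressSpace:
    """Tracks which VA region each external data segment is mapped to."""

    def __init__(self):
        self.regions = {}     # segment name -> (start, length)
        self.pending = set()  # segments that still have pending commits

    def map(self, segment, start, length):
        # One-to-one: a segment is mapped to at most one region.
        if segment in self.regions:
            raise ValueError(f"{segment} is already mapped")
        # No overlap: the new region must not intersect any mapped region.
        for other, (s, n) in self.regions.items():
            if start < s + n and s < start + length:
                raise ValueError(f"region overlaps {other}")
        self.regions[segment] = (start, length)

    def unmap(self, segment):
        # Unmapping is allowed only when no commits are pending.
        if segment in self.pending:
            raise RuntimeError(f"{segment} has pending commits")
        del self.regions[segment]

# Usage: two segments mapped at startup into disjoint regions.
space = ToyAddressSpace()
space.map("inodes.seg", 0x1000, 0x1000)
space.map("bitmap.seg", 0x2000, 0x0800)
```

Rejecting overlaps up front means every address in a mapped region belongs to exactly one segment, so a logged change can be attributed to its backing segment without ambiguity.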

At the start, we need to identify the logseg for the server process, so that persistent data structs can be maintained.

We also map the region of the VA space to external data segments (or unmap them later if needed).

<aside> 📌 SUMMARY: Lightweight Reliable Virtual Memory (LRVM) uses persistent log segments to turn in-memory modifications into sequential redo logs, so that scattered updates become efficient sequential disk writes. Subsystems map named data segments (e.g., holding inodes) into regions of the VA space. LRVM uses simple primitives (initialize, begin_xact, set_range, end_xact and abort_xact) to record only the necessary changes to the stated memory ranges. At commit, redo logs are flushed to disk (unless in no_flush mode) and undo logs are discarded; on abort, the undo logs are used to restore the original memory contents. Log truncation and recovery tasks can then apply the logs to persistent storage, thus reclaiming log space. These log transactions are also simple and lightweight, without full ACID properties, and can be done in parallel with the app’s forward progress.

</aside>
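
The primitives named in the summary can be sketched with a toy in-memory model. This is a simplified illustration, not the LRVM implementation: the class `ToyRVM`, the `flush` and `truncate` helpers, and the use of a `bytearray` for mapped memory and a list for the on-disk log are all assumptions, and the constructor merely stands in for initialize.

```python
class ToyRVM:
    """Toy model of undo/redo logging over one mapped region."""

    def __init__(self, size):
        self.memory = bytearray(size)  # stands in for the mapped VA region
        self.disk_log = []             # redo records already forced to "disk"
        self.unflushed = []            # redo records from no_flush commits
        self.undo = None               # None means no transaction is open
        self.ranges = []               # ranges declared via set_range

    def begin_xact(self):
        if self.undo is not None:
            raise RuntimeError("transaction already open")
        self.undo, self.ranges = [], []

    def set_range(self, offset, length):
        # Save the old bytes so an abort can restore them.
        old = bytes(self.memory[offset:offset + length])
        self.undo.append((offset, old))
        self.ranges.append((offset, length))

    def end_xact(self, no_flush=False):
        # The redo records hold the new values of the declared ranges only.
        for off, n in self.ranges:
            self.unflushed.append((off, bytes(self.memory[off:off + n])))
        if not no_flush:
            self.flush()
        self.undo, self.ranges = None, []  # undo records are discarded

    def flush(self):
        self.disk_log.extend(self.unflushed)
        self.unflushed.clear()

    def abort_xact(self):
        # Restore old values in reverse order of the set_range calls.
        for off, old in reversed(self.undo):
            self.memory[off:off + len(old)] = old
        self.undo, self.ranges = None, []

    def truncate(self, data_segment):
        # Apply the forced redo records to the data segment, then free the log.
        for off, new in self.disk_log:
            data_segment[off:off + len(new)] = new
        self.disk_log.clear()

# Usage: one committed transaction and one aborted transaction.
rvm = ToyRVM(16)
rvm.begin_xact()
rvm.set_range(0, 4)
rvm.memory[0:4] = b"abcd"
rvm.end_xact()

rvm.begin_xact()
rvm.set_range(4, 4)
rvm.memory[4:8] = b"oops"
rvm.abort_xact()
```

In this sketch only the committed range reaches the log, and the aborted write is rolled back from the undo records, which never leave memory.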