Enable PMP for memory isolation #32
base: main
Conversation
(force-pushed: d2552a5 → 319ba96)
jserv left a comment:
Use unified "flexpage" notation.
(force-pushed: e264a35 → 4a62d5b)
Got it! Thanks for the correction and the L4 X.2 reference.
(force-pushed: 109259d → f6c3912)
(force-pushed: 2644558 → 1bb5fcf)
(force-pushed: 904e972 → ed800fc)
(force-pushed: 0d55f21 → 865a5d6)
Finished. I also removed the M-mode fault-handling commits, as they are not aligned with the upcoming work.
(force-pushed: 865a5d6 → 7e3992e)
(force-pushed: aeda1b2 → c78a3f3)
The branch is now reduced to 8 commits (without PR #62) while preserving atomic changes.
(force-pushed: 49e386e → adda2e2)
(force-pushed: f1422f7 → 5fa0716)
jserv left a comment:
Use `git rebase -i` to refine the commits into fewer, consolidated ones.
Establish the complete PMP infrastructure to support hardware-enforced memory isolation. This implementation spans the driver layer, hardware abstraction, and region lifecycle management. Introduce `flexpages` to represent contiguous physical regions with protection attributes, and `memory_spaces` to group them into per-task protection domains. These abstractions decouple high-level memory policies from low-level hardware constraints.

Address the RISC-V immediate-value limitation for CSR instructions by implementing a runtime-indexed access mechanism. A switch-case dispatch system maps dynamic indices to static `pmpcfg` and `pmpaddr` instructions, enabling iterative configuration of PMP entries. All regions use TOR (Top-of-Range) mode to support arbitrary address alignment.

Implement a driver stack that maintains a centralized shadow configuration to mirror hardware state. This serves as the single source of truth for PMP operations, supporting atomic updates and dynamic region allocation. The region management API handles the validation, locking, and eviction of entries, while kernel memory pools are automatically secured at boot using linker symbols (Text: RX, Data: RW).
Memory protection requires dynamic reconfiguration when switching between tasks. Each task receives a dedicated memory space with its stack registered as a protected flexpage. During context switches, the scheduler evicts the outgoing task's regions from hardware slots and loads the incoming task's regions, while kernel regions remain locked across all transitions. Kernel text, data, and BSS regions are configured at boot and protected from eviction. User-mode tasks operate in isolated memory domains where they cannot access kernel memory or other tasks' stacks.

Nested trap handling is required for correct U-mode operation. When a user-mode syscall triggers a yield, the resulting nested trap must not corrupt the outer trap's context. Trap-nesting depth tracking ensures only the outermost trap performs context-switch restoration, and a yield from trap context invokes the scheduler directly without additional trap nesting. A test validates mixed-privilege context switching by spawning M-mode and U-mode tasks that continuously yield, verifying correct operation across privilege boundaries.
Hardware PMP entries are limited resources, restricting the number of simultaneous memory mappings. Additionally, unrecoverable access faults currently panic the kernel, which compromises system availability during multitasking.

Address the hardware implementation limit by loading memory regions dynamically. When a task accesses a valid region not present in hardware, the fault handler evicts a lower-priority entry to map the required flexpage on demand. This decouples the number of task memory regions from the physical PMP slot count.

Improve system stability by terminating tasks that trigger unrecoverable faults. Instead of halting the entire system, the fault handler marks the task as a zombie and signals the scheduler to maximize resource reclamation. This ensures that isolation violations affect only the faulting process.
Provide comprehensive documentation covering memory abstraction, context switching, and fault handling for the PMP implementation.

The test application validates memory isolation through four tests. Test 1 spawns three U-mode tasks that verify stack integrity using magic values across context switches. Test 2a attempts to write to kernel .text from U-mode, triggering task termination. Test 2b attempts to read another task's exported stack address, validating inter-task isolation. Test 3 spawns tasks exceeding the hardware PMP region limit to validate the eviction policy and lazy-loading mechanism during context switches.

CI scripts are modified to recognize the expected output. Additionally, synchronization timing in app/mutex.c is adjusted to accommodate the increased context-switch overhead introduced by PMP configuration.
(force-pushed: 5fa0716 → 337fdb6)
The branch is now reduced to 4 commits.
```c
/* Reject overflow to prevent security bypass */
if (size > 0 && addr > UINT32_MAX - size)
    return 0;

uint32_t access_end = addr + size;
```
The check is correct, but the `size > 0` condition allows `size == 0` to bypass overflow detection. If `size == 0` and `addr == UINT32_MAX`, `access_end` becomes `UINT32_MAX`, which may pass region checks unexpectedly. Not a security bypass per se, but semantically unclear.
Fix: remove the `size > 0` guard or document the intended behavior for zero-size accesses.
```c
fpage_t *victim = select_victim_fpage(mspace);
if (!victim)
    return PMP_FAULT_UNHANDLED;

uint8_t victim_region = victim->pmp_id;
int32_t ret = pmp_evict_fpage(victim);
return (ret == 0) ? pmp_load_fpage(target_fpage, victim_region) : ret;
```
`pmp_evict_fpage()` sets `victim->pmp_id = PMP_INVALID_REGION`. Use-after-invalidation in the fault handler.
> PMP provides 16 hardware slots shared between kernel and user regions.
> Kernel regions occupy slots 0-2 and cannot be evicted.
> Each user region requires two slots (paired entries for TOR mode).
TOR mode consumes 2 slots per user region. The document describes this, but with only 16 PMP slots total:
- 3 kernel regions (text/data/bss)
- 13 remaining → 6 user regions max
With the test spawning 9 U-mode tasks, you're relying heavily on lazy loading. This is fine architecturally but creates runtime overhead. Consider documenting this constraint more prominently.
```c
/* Read PMP configuration register by index (0-3) */
static uint32_t __attribute__((unused)) read_pmpcfg(uint8_t idx)
```
This function is used.
jserv left a comment:
Missing newline at EOF in pmp-memory-protection.md.
This PR implements PMP (Physical Memory Protection) support for RISC-V to enable hardware-enforced memory isolation in Linmo, addressing #30.
Currently Phase 1 (infrastructure) is complete. This branch will continue development through the remaining phases. Phase 1 adds the foundational structures and declarations: the PMP hardware layer in arch/riscv with CSR definitions and region management structures, architecture-independent memory abstractions (flexpages, address spaces, memory pools), kernel memory pool declarations from linker symbols, and a TCB extension for address space linkage.
The actual PMP operations including region configuration, CSR manipulation, and context switching integration are not yet implemented.
TOR mode is used for its flexibility with arbitrary address ranges without alignment constraints, simplifying region management for task stacks of varying sizes. Priority-based eviction allows the system to manage competing demands when the 16 hardware regions are exhausted, ensuring critical kernel and stack regions remain protected while allowing temporary mappings to be reclaimed as needed.
Summary by cubic
Enables RISC-V PMP for hardware memory isolation (#30). Uses TOR mode with boot-time kernel protection, trap-time flexpage loading, per-task context switching, and U-mode kernel stack isolation via mscratch; unrecoverable access faults terminate the task instead of panicking.