
Boot and Kernel Initialization

This document provides a deep dive into how TilekarOS transitions from the BIOS to a fully operational 32-bit higher-half kernel.

1. The Bootloader Handoff (GRUB)

TilekarOS is Multiboot-compliant, meaning it relies on a bootloader like GRUB to perform the initial hardware setup.

The Multiboot Header

Source File: boot.asm

GRUB looks for a specific magic number (0x1BADB002) within the first 8KB of the kernel binary. This header contains flags that request page alignment for modules and memory information from the bootloader.

Code Preview: boot.asm (Header)
; Multiboot header constants.
MULBOOT_PAGE_ALIGN  equ  1 << 0            ; Request page-aligned modules
MULBOOT_MEMORY_INFO  equ  1 << 1            ; Request memory map from bootloader
MULBOOT_HEADER_FLAGS  equ  MULBOOT_PAGE_ALIGN | MULBOOT_MEMORY_INFO ; Combined flag field
MULBOOT_HEADER_MAGIC    equ  0x1BADB002        ; Multiboot magic signature
MULBOOT_HEADER_CHECKSUM equ -(MULBOOT_HEADER_MAGIC + MULBOOT_HEADER_FLAGS) ; Validates header (`MAGIC` + `FLAGS` + `CHECKSUM` == 0)

; Multiboot header section
; Must be located within the first 8 KiB of the kernel image at a 32-bit boundary.
; Bootloader parses this header to identify the image as a valid multiboot kernel.
section .multiboot
align 4
    dd MULBOOT_HEADER_MAGIC
    dd MULBOOT_HEADER_FLAGS
    dd MULBOOT_HEADER_CHECKSUM

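    ; Address fields (header_addr .. entry_addr); ignored because flag bit 16 is clear.
    dd 0, 0, 0, 0, 0

    ; Video mode request (mode_type, width, height, depth); ignored because flag bit 2 is clear.
    dd 0
    dd 1024
    dd 768
    dd 32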

; Kernel stack definition.
; Stack grows downward. Reserve 16 KiB for early kernel stack usage.
; The stack must be 16-byte aligned (System V ABI requirement).
section .bss
align 16
stack_bottom:
resb 16384
stack_top:

; Kernel entry point.
; `_start` is set in the linker script as the entry symbol.
; The bootloader transfers control here in 32-bit protected mode.
; At this stage: interrupts and paging are disabled, GDT not yet configured.
section .boot
global _start:function (_start.end - _start)
_start:
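    ; Initializing paging
    ; Load the physical address of the page directory into CR3.
    MOV ecx, (initial_page_dir - 0xC0000000)
    MOV cr3, ecx

    ; Set CR4.PSE (bit 4) to enable 4 MiB pages.
    MOV ecx, cr4
    OR ecx, 0x10
    MOV cr4, ecx

    ; Set CR0.PG (bit 31) to turn paging on.
    MOV ecx, cr0
    OR ecx, 0x80000000
    MOV cr0, ecx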

    jmp higher_half_kernel
.end:

section .text
higher_half_kernel:
    ; Initialize stack pointer
    mov esp, stack_top

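    ; cdecl arguments for init_kernel(magic, boot_info), pushed right to left.
    ; They are left on the stack and used later in kernel_main.
    push ebx ; Pointer to the Multiboot information structure (physical address)
    push eax ; Multiboot magic value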

    extern init_kernel
    call init_kernel

    sti

    ; Transfer control to the C/C++ kernel entry.
    ; The two values pushed above are still on the stack, so ESP is stack_top - 8
    ; at call time (8-byte aligned, not 16-byte aligned).
    extern kernel_main
    call kernel_main

; System halt loop.
; Ensures CPU remains idle when kernel_main returns (should never happen).
halt:
global halt:function (halt.end - halt)
    cli
.mh:
    hlt
    jmp .mh
.end:

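section .data
align 4096
global initial_page_dir
initial_page_dir:
    ; Entry 0: identity-map the first 4 MiB (present | writable | 4 MiB page).
    DD 10000011b
    TIMES 768-1 DD 0

    ; Entries 768-771: map physical 0-16 MiB at virtual 0xC0000000.
    DD (0 << 22) | 10000011b
    DD (1 << 22) | 10000011b
    DD (2 << 22) | 10000011b
    DD (3 << 22) | 10000011b
    TIMES 256-4 DD 0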

OSDev Reference

For more on the Multiboot specification, see the OSDev Wiki: Multiboot or Wikipedia: Multiboot.
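
The checksum rule in the header (MAGIC + FLAGS + CHECKSUM == 0, modulo 2^32) can be checked on a host machine with a small C sketch. The helper names `multiboot_checksum` and `multiboot_header_valid` are hypothetical and not part of TilekarOS; only the magic value and the two flag bits are taken from the boot.asm listing.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the boot.asm listing. */
#define MB_HEADER_MAGIC 0x1BADB002u
#define MB_PAGE_ALIGN   (1u << 0)
#define MB_MEMORY_INFO  (1u << 1)

/* Hypothetical helper: the checksum is the 32-bit two's complement
   of (magic + flags), so the three fields sum to zero. */
static uint32_t multiboot_checksum(uint32_t magic, uint32_t flags) {
    return (uint32_t)(0u - (magic + flags));
}

/* Hypothetical helper: accept a header only if the magic matches and
   the three fields sum to zero modulo 2^32. */
static bool multiboot_header_valid(uint32_t magic, uint32_t flags,
                                   uint32_t checksum) {
    return magic == MB_HEADER_MAGIC &&
           (uint32_t)(magic + flags + checksum) == 0u;
}
```

With the flags used here (bits 0 and 1), the checksum works out to 0xE4524FFB, which is the value the assembler emits for MULBOOT_HEADER_CHECKSUM.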


2. Low-Level Entry (boot.asm)

When GRUB jumps to the _start symbol, the CPU is in 32-bit Protected Mode, but paging is disabled.

Higher-Half Trampoline

TilekarOS is a Higher-Half Kernel, meaning it lives at virtual address 0xC0000000 (3GB). However, it is loaded at 0x00100000 (1MB) in physical RAM.

To bridge this gap, the entry code performs the following "trampoline" sequence:

  1. Initial Page Directory: It loads a temporary, statically defined page directory (initial_page_dir) into CR3.
  2. Identity Mapping: It maps the first 4MB of physical memory to the first 4MB of virtual memory. The instructions that follow the write to CR0 are still fetched from low addresses, so without this mapping the CPU would fault as soon as paging is enabled.
  3. Higher-Half Mapping: It maps the first 16MB of physical memory to virtual 0xC0000000.
  4. Large Pages (4MB): It uses Page Size Extensions (PSE) to map 4MB blocks at once, so no page tables are needed and the initial boot code stays simple.
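
The same layout can be written out in C to make the index arithmetic visible. This is a host-side sketch, not the kernel's code: `pde_index`, `build_initial_page_dir`, and the `PDE_*` and `KERNEL_BASE` names are hypothetical, and the flag value 10000011b (0x83) is the one used in boot.asm.

```c
#include <stdint.h>

#define PDE_PRESENT  (1u << 0)
#define PDE_WRITABLE (1u << 1)
#define PDE_PAGE_4MB (1u << 7)   /* PS bit; only honored when CR4.PSE is set */
#define PDE_FLAGS    (PDE_PRESENT | PDE_WRITABLE | PDE_PAGE_4MB) /* 10000011b */
#define KERNEL_BASE  0xC0000000u /* assumed higher-half base */

/* The top 10 bits of a virtual address select the page-directory entry. */
static uint32_t pde_index(uint32_t vaddr) {
    return vaddr >> 22;
}

/* Hypothetical helper that fills a directory the way initial_page_dir is laid out. */
static void build_initial_page_dir(uint32_t dir[1024]) {
    for (uint32_t i = 0; i < 1024; i++) {
        dir[i] = 0;                               /* not present */
    }
    dir[0] = (0u << 22) | PDE_FLAGS;              /* identity map 0..4 MiB */
    for (uint32_t i = 0; i < 4; i++) {
        /* physical i*4 MiB -> virtual KERNEL_BASE + i*4 MiB */
        dir[pde_index(KERNEL_BASE) + i] = (i << 22) | PDE_FLAGS;
    }
}
```

Because 0xC0000000 >> 22 is 768, the higher-half entries occupy slots 768 through 771, which is why the assembly pads with `TIMES 768-1 DD 0` after the first entry.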

flowchart TD
    %% --- Hardware / Bootloader ---
    subgraph SystemStart ["System Start"]
        direction TB
        BIOS["BIOS / UEFI"]:::hardware --> GRUB[GRUB Bootloader]:::hardware
        GRUB -- Multiboot Magic --> ProtMode["32-bit Protected Mode"]:::hardware
    end

    %% --- Low Level Entry ---
    subgraph BootASM ["boot.asm _start"]
        direction TB
        Paging["Enable Paging (4MB Pages)"]:::asm
        Stack["Setup Stack (esp = stack_top)"]:::asm
        Paging --> Stack
        Stack --> InitHub{Init Sequence}:::asm
    end

    SystemStart --> BootASM

    %% --- Modules ---

    %% TTY
    InitHub -- 1 --> TTY["tty.c: init_terminal"]:::c_module

    %% GDT
    subgraph GDT_Scope ["gdt.c: init_gdt"]
        direction LR
        G_Ld[Load GDTR]:::c_func --> G_TSS[Install TSS]:::c_func
        G_TSS --> G_TR[Load TR]:::c_func
    end
    InitHub -- 2 --> GDT_Scope

    %% IDT
    subgraph IDT_Scope ["idt.c: init_idt"]
        direction LR
        I_PIC["Remap PIC (0x20 / 0x28)"]:::c_func --> I_Fill[Fill Gates]:::c_func
        I_Fill --> I_Ld[Load IDTR]:::c_func
    end
    InitHub -- 3 --> IDT_Scope

    %% Kernel Main
    subgraph Kernel_Scope ["kernel.c: kernel_main"]
        direction TB
        AppLogic["Print Banner & Run Tests"]:::kernel --> Halt((Infinite Loop)):::hardware
    end
    InitHub -- 4 --> Kernel_Scope

    %% FORCE LEFT-TO-RIGHT ORDERING
    TTY ~~~ G_Ld ~~~ I_PIC ~~~ AppLogic

3. Linker Layout (linker.ld)

Source File: linker.ld

The linker script is responsible for the final organization of the binary. It defines where each section (text, data, bss) is placed in both physical and virtual memory.

Code Preview: linker.ld
/* The bootloader will look at this image and start execution at the symbol
   designated as the entry point. */
ENTRY(_start)

/* Tell the linker that we want the specific sections in the output file
   to be organized as follows: */
SECTIONS
{
    /* --- PHYSICAL MEMORY SETUP --- */

    /* The kernel will be loaded at 1MB into physical memory. */
    /* We skip the first 1MB to avoid BIOS routines, the VGA text buffer (0xB8000),
       and other hardware-reserved areas typically found in the lower 1MB. */
    . = 1M;

    /* The Multiboot header must appear very early in the binary (within the
       first 8 KiB) so the bootloader (like GRUB) can find it. */
    .multiboot :
    {
        /* Ensure the multiboot header stays at the beginning */
        *(.multiboot)
    }

    /* Sometimes the compiler adds a build-id note. We include it explicitly
       to prevent the linker from placing it somewhere that breaks the layout. */
    .note.gnu.build-id ALIGN(4K) :
    {
        *(.note.gnu.build-id)
    }

    /* The .boot section contains the "trampoline" code.
       Since paging is not yet enabled when the kernel starts, this code
       must execute from low physical memory (around 1MB). */
    .boot ALIGN(4K) :
    {
        *(.boot)
    }

    /* --- HIGHER HALF KERNEL SETUP --- */

    /* We now increment the location counter (.) by 3GB (0xC0000000).
       From this point on, all symbols will be assigned virtual addresses
       in the upper half of memory (e.g., 0xC0100000), but we still need
       to load them at physical address 0x00100000. */
    . += 0xC0000000;

    /* CODE SECTION (.text)
       ALIGN(4K): Align sections on 4KB page boundaries for easy mapping.
       AT(...):   This specifies the Load Memory Address (LMA).
                  We calculate the physical address by taking the current
                  Virtual Address (ADDR(.text)) and subtracting the 3GB offset. */
    .text ALIGN(4K) : AT(ADDR(.text) - 0xC0000000)
    {
        *(.text)
    }

    /* READ-ONLY DATA (.rodata)
       Contains constants, string literals, and C++ vtables. */
    .rodata ALIGN(4K) : AT(ADDR(.rodata) - 0xC0000000)
    {
        *(.rodata)
    }

    /* INITIALIZED DATA (.data)
       Contains global variables that have been assigned a value (e.g., int x = 5;). */
    .data ALIGN(4K) : AT(ADDR(.data) - 0xC0000000)
    {
        *(.data)
    }

    /* UNINITIALIZED DATA (.bss)
       Contains global variables that are zeroed or uninitialized (e.g., int x;).
       Also usually contains the kernel stack.
       *(COMMON) catches C variables that are "tentative" definitions. */
    .bss ALIGN(4K) : AT(ADDR(.bss) - 0xC0000000)
    {
        *(COMMON)
        *(.bss)
    }

    /* End of the kernel image marker. Useful for the memory manager (PMM)
       to know where free memory begins. */
    _kernel_end = .;
}

Under the Hood: VMA vs LMA

The LMA (Load Memory Address) is where a section is actually placed in physical RAM (from 1MB upward). The VMA (Virtual Memory Address) is the address the code is linked to run at (from 0xC0000000 + 1MB upward). The 0xC0000000 difference between the two is bridged by the page directory that boot.asm installs before jumping to the higher half.
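
As a rough illustration of that relationship, the conversion between the two address spaces is a fixed offset. The names `KERNEL_VMA_OFFSET`, `virt_to_phys`, and `phys_to_virt` are hypothetical; the sketch assumes the 0xC0000000 offset from linker.ld and is only meaningful for addresses inside the region the boot page directory maps.

```c
#include <stdint.h>

#define KERNEL_VMA_OFFSET 0xC0000000u /* assumed: the offset used in linker.ld */

/* VMA -> LMA: where a higher-half symbol actually sits in physical RAM. */
static uint32_t virt_to_phys(uint32_t vma) {
    return vma - KERNEL_VMA_OFFSET;
}

/* LMA -> VMA: the higher-half alias of a low physical address. */
static uint32_t phys_to_virt(uint32_t lma) {
    return lma + KERNEL_VMA_OFFSET;
}
```

For example, a symbol linked at 0xC0100000 is loaded at 0x00100000, and the VGA text buffer at physical 0xB8000 is reachable at 0xC00B8000 once the higher-half mapping is active.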


4. C Initialization (init_kernel.c)

Source File: init_kernel.c

Once in the higher half, the init_kernel function orchestrates the setup of all major subsystems. It takes the multiboot magic number and the pointer to the multiboot information structure as arguments.

Code Preview: init_kernel.c
#include "kernel/tty.h"
#include "gdt.h"
#include "idt.h"
#include "task.h"
#include "timer.h"
#include "keyboard.h"
#include "ata.h"
#include "ahci.h"
#include "pci.h"
#include "vfs.h"
#include <stdint.h>
#include <stdio.h>
#include "memory.h"
#include "kmalloc.h"

extern uint32_t _kernel_end;

/*
 * init_kernel
 * The main entry point for C kernel initialization.
 * Called from boot.asm after stack setup.
 */
void init_kernel(uint32_t magic, MultiBootInfo* boot_info) {
    (void)magic; // Suppress unused parameter warning

    // Convert boot_info to virtual address (higher half)
    // GRUB passes a physical address, which is currently identity mapped.
    // We must use the virtual address once identity mapping is removed.
    boot_info = (MultiBootInfo*)((uint32_t)boot_info + KERNEL_START);

    // Initialize core kernel subsystems
    init_terminal();
    init_gdt();
    init_idt();
    pci_init();
    init_keyboard();

    {
        /*
        * Memory Management Initialization
        *
        * 1. Start from the end of the kernel image, then extend past the first
        * loaded module (e.g., initrd) if one is present. Only the first module
        * is considered here.
        */
        uint32_t module_end = (uint32_t)&_kernel_end;

        // Convert to physical address if it's a virtual address from the linker
        if (module_end >= KERNEL_START) {
            module_end -= KERNEL_START;
        }

        // Flag bit 3 means mods_count and mods_addr are valid.
        if ((boot_info->flags & (1 << 3)) && boot_info->mods_count > 0) {
            // mods_addr is a physical address
            uint32_t* mods = (uint32_t*)(boot_info->mods_addr + KERNEL_START);
            uint32_t first_mod_end = mods[1]; // end address of the first module
            if (first_mod_end > module_end) {
                module_end = first_mod_end;
            }
        }

        /*
        * 2. Align the start of free physical memory to the next 4KB page boundary.
        */
        uint32_t phys_alloc_start = (module_end + 0xFFF) & ~0xFFF;

        // 3. Initialize the physical memory manager and paging system.
        // boot_info->mem_upper is KB above 1MB. Total memory = 1024 + mem_upper.
        init_memory(1024 + boot_info->mem_upper, phys_alloc_start);
    }

    // 4. Initialize the kernel heap (1MB initial size).
    kmalloc_init(1024 * 1024);

    // Register devices and VFS now that heap is ready
    tty_register();
    keyboard_register();
    init_ata();
    init_ahci();
    vfs_init();

    // 5. init Kernel Task Scheduler
    printf("\nInitializing Multitasking...\n");
    task_init_scheduler();

    printf("Memory allocation initialized.\n");
}
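
The listing reads the module list as a raw `uint32_t` array and looks only at the first entry. A possible alternative, sketched below, walks every entry using the module layout defined by the Multiboot 1 specification. The type name `multiboot_module_t` and the helper `highest_used_phys` are hypothetical and not taken from TilekarOS.

```c
#include <stdint.h>

/* One module entry as laid out by the Multiboot 1 specification (16 bytes). */
typedef struct {
    uint32_t mod_start; /* physical start address of the module */
    uint32_t mod_end;   /* physical end address of the module */
    uint32_t cmdline;   /* physical address of a NUL-terminated string */
    uint32_t reserved;  /* must be ignored by the OS */
} multiboot_module_t;   /* hypothetical type name */

/* Hypothetical helper: highest physical end address among the kernel image
   and all modules, so the allocator can start above everything loaded. */
static uint32_t highest_used_phys(uint32_t kernel_end_phys,
                                  const multiboot_module_t *mods,
                                  uint32_t mods_count) {
    uint32_t end = kernel_end_phys;
    for (uint32_t i = 0; i < mods_count; i++) {
        if (mods[i].mod_end > end) {
            end = mods[i].mod_end;
        }
    }
    return end;
}
```

In the kernel, `mods` would be the higher-half alias of `boot_info->mods_addr`, since the bootloader reports a physical address.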

Initialization Steps:

  1. VGA/TTY: Clears the screen and prepares for text output.
  2. GDT & IDT: Replaces the GRUB-provided GDT with our own and installs the IDT, to handle memory segments and interrupts.
  3. PCI Scan: Enumerates devices on the PCI bus to find storage controllers such as IDE/ATA.
  4. Keyboard: Initializes the PS/2 keyboard driver.
  5. Memory Management:
    • PMM: Calculates available RAM from the boot_info provided by GRUB.
    • Paging: Sets up recursive paging and removes the identity mapping.
  6. Heap: Initializes kmalloc with a 1MB pool.
  7. Storage & Filesystem:
    • ATA: Probes for hard disks and registers them in the device registry.
    • AHCI: Initializes the AHCI driver (init_ahci).
    • VFS: Registers standard I/O devices (tty0, kbd0) and mounts the root filesystem.
  8. Multitasking: Sets up the scheduler, the first "Main" task, and any initial user-mode processes.
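
The arithmetic in the memory-management step is small enough to check in isolation. The sketch below restates the two calculations from init_kernel.c as standalone functions; `page_align_up`, `total_memory_kib`, and `PAGE_SIZE` are hypothetical names used only for illustration.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Round an address up to the next 4 KiB boundary.
   An address that is already aligned is returned unchanged. */
static uint32_t page_align_up(uint32_t addr) {
    return (addr + (PAGE_SIZE - 1u)) & ~(PAGE_SIZE - 1u);
}

/* mem_upper counts KiB above the 1 MiB mark, so the total is
   the first 1024 KiB plus mem_upper. */
static uint32_t total_memory_kib(uint32_t mem_upper_kib) {
    return 1024u + mem_upper_kib;
}
```

For instance, a kernel image ending at physical 0x00123001 gives a first free page at 0x00124000, and a `mem_upper` of 130048 KB corresponds to 131072 KB (128 MB) in total.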

5. Test/Example: Verifying Multiboot Info

You can verify that GRUB is passing correct information by printing fields of the MultiBootInfo struct from init_kernel.c, after boot_info has been converted to its higher-half address:

void test_multiboot(MultiBootInfo* boot_info) {
    if (boot_info->flags & MULTIBOOT_INFO_MEMORY) {
        printf("Lower Memory: %u KB\n", boot_info->mem_lower);
        printf("Upper Memory: %u KB\n", boot_info->mem_upper);
    }
    if (boot_info->flags & MULTIBOOT_INFO_BOOT_LOADER_NAME) {
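        // boot_loader_name is a physical address; add KERNEL_START to reach its higher-half alias.
        printf("Bootloader: %s\n", (const char*)(boot_info->boot_loader_name + KERNEL_START));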
    }
}
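
init_kernel currently discards the magic argument, so a further check worth considering is whether the kernel was started by a Multiboot loader at all. A Multiboot 1 loader leaves 0x2BADB002 in EAX, which is distinct from the 0x1BADB002 header magic. The sketch below assumes hypothetical names (`booted_by_multiboot`, `info_has`, and the `MB_*` macros); the bit positions follow the Multiboot 1 specification.

```c
#include <stdbool.h>
#include <stdint.h>

#define MB_BOOTLOADER_MAGIC      0x2BADB002u /* value in EAX at kernel entry */
#define MB_INFO_MEMORY           (1u << 0)   /* mem_lower / mem_upper valid */
#define MB_INFO_MODS             (1u << 3)   /* mods_count / mods_addr valid */
#define MB_INFO_BOOT_LOADER_NAME (1u << 9)   /* boot_loader_name valid */

/* Hypothetical helper: true when the magic passed in EAX is the Multiboot 1 value. */
static bool booted_by_multiboot(uint32_t magic) {
    return magic == MB_BOOTLOADER_MAGIC;
}

/* Hypothetical helper: test one validity bit before reading the matching fields. */
static bool info_has(uint32_t flags, uint32_t bit) {
    return (flags & bit) != 0u;
}
```

A caller would typically test `booted_by_multiboot(magic)` first and touch `boot_info` only if it succeeds, since the structure's contents are undefined otherwise.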

References