MAY95: Implementing Loadable Kernel Modules for Linux

Loading and unloading kernel modules on a running system

Matt Welsh

Matt works with the Cornell University robotics and vision laboratory. He is the author of two books on the Linux operating system, including Running Linux (O'Reilly & Associates, 1995), and he is a contributor to Linux Journal. He can be contacted at [email protected].


The Linux operating system, a freely distributed UNIX clone developed via the Internet, is an excellent platform for operating-systems research and development. In fact, you can inspect, modify, and experiment with any part of the system because the entire source code for the kernel, all of the basic system utilities, and the libraries are freely available (they're covered by the GNU General Public License). Linux runs primarily on PCs with Intel 386/486/Pentium processors, but ports are in the works for architectures such as the DEC Alpha, Motorola 68000, PowerPC, and more. Apart from its versatility for systems research, Linux is also a very stable and useful UNIX implementation for the PC: It supports software such as emacs, the X Window System, gcc, and much more (see the accompanying text box entitled "Getting Linux").

One of the most important recent developments for Linux is dynamically loaded kernel modules. The Linux kernel design is similar to that of classic UNIX systems: It uses a monolithic architecture with file systems, device drivers, and other pieces statically linked into the kernel image to be used at boot time. While there are currently no plans to restructure Linux around the microkernel architecture, the use of dynamic kernel modules allows you to write portions of the kernel as separate objects that can be loaded and unloaded on a running system.

In this article, I'll describe the dynamic-kernel-module implementation for Linux, concentrating on the steps required to load a module on a running system. The Linux implementation is fairly straightforward and could be adapted to other UNIX systems that don't already provide this functionality. Surprisingly enough, most of the necessary support is found not within the kernel itself, but in the run-time loader.

Overview

A kernel module is simply an object file containing routines and/or data to load into a running kernel. (If multiple source files are used to build a module, the corresponding object files can be prelinked into a single object using ld -r.) When loaded, the module code resides in the kernel's address space and executes entirely within the context of the kernel. Technically, a module can be any set of routines, with the one restriction that two functions, init_module() and cleanup_module(), must be provided. The first is executed once the module is loaded, and the second just before the module is unloaded from the kernel. Of course, programmers must also observe all of the precautions and conventions used by kernel-level code when writing modules.

Loading a module into the kernel involves four major tasks:

  • Preparing the module in user space. Read the object file from disk and resolve all undefined symbols. A module may access only those symbols already in the running kernel. This "linking" step takes place within the run-time module loader, a utility that runs in user space.
  • Allocating memory in the kernel address space to hold the module text, data, and other relevant information.
  • Copying the module code into this newly allocated space and providing the kernel with any information necessary to maintain the module (such as the new module's own symbol table).
  • Executing the module initialization routine, init_module() (now in the kernel).

Because the first step is the most complex, I'll focus on it in this article. Once all external references in the module have been resolved, it can easily be copied into space set aside by the kernel and executed from there.

There are several important issues to address when using the approach I've just described, the first being symbol resolution. All of the external symbols that the module can reference correspond to variables or routines in the kernel. Symbols can be either "resident" (compiled into the original kernel image) or "transient" (provided by other modules that are already loaded). For the module loader to resolve all of these references, the kernel must provide a list of valid symbols, copied to the module loader via a system call. Instead of allowing modules to access and use all resident symbols, the kernel provides a list of those variables and routines "stable" enough for modules to employ. Otherwise, modules could depend too heavily on low-level aspects of the kernel code, and thus break if those symbols were to change. Currently, this is a static list found in one of the kernel source files, but changes are planned to allow each portion of the kernel to provide its own part of the resident symbol list. Similarly, when each module is loaded, it must contribute a symbol table with entries for each symbol that it will provide to other modules.

Another issue to consider is intermodule dependencies. The current implementation requires that modules with such dependencies be loaded in a particular order; that is, for module A to use a symbol defined by module B, module B must be loaded first. Similarly, the kernel must maintain reference lists for each module so that a module cannot be unloaded until all modules referencing it have been unloaded themselves. This mechanism can be used to build module stacks.

A third issue is version coherency. The system must be able to guarantee that all symbols and data structures used in a module are identical to those used in the running kernel. If the kernel's definition of a data structure differs from that used in a module, the module could corrupt important kernel data and real havoc could result. To deal with this, the current implementation requires that modules only be loaded against the version of the kernel that was running when they were compiled. Data from uname(2) is stored in the module itself at compile time, and when loaded, this data must match the data in the currently running kernel. At the time of this writing, a new design is being tested which assigns version information individually to kernel symbols. Although the current paranoid approach can be annoying to those who rebuild kernels often, it is nearly foolproof.
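The strict version check described above can be sketched in a few lines of user-space C. This is only an illustration, not the insmod code: the helper names are hypothetical, and it assumes that the value recorded at compile time is the release string reported by uname(2), which the text does not spell out.

```c
#include <assert.h>
#include <string.h>
#include <sys/utsname.h>

/* Hypothetical helper: returns 1 when the release string recorded in the
 * module at compile time equals the release of the running kernel. */
int release_matches(const char *compiled_release, const char *running_release)
{
    assert(compiled_release != NULL && running_release != NULL);
    return strcmp(compiled_release, running_release) == 0;
}

/* Compare a module's recorded release against uname(2) on this machine.
 * Returns 1 on a match, 0 on a mismatch, and -1 if uname() fails. */
int module_fits_running_kernel(const char *compiled_release)
{
    struct utsname running;

    if (uname(&running) != 0)
        return -1;
    return release_matches(compiled_release, running.release);
}
```

An exact string comparison is what makes the scheme nearly foolproof and also what makes it annoying: any kernel rebuild that changes the release string invalidates every module.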

The Module Loader

The Linux module loader, insmod, is responsible for reading an object file to be used as a module, obtaining the list of kernel symbols, resolving all references in the module with these symbols, and sending the module to the kernel for execution. While insmod has many features, it is most commonly invoked as insmod module.o, where module.o is an object file corresponding to a module. In walking through the steps that insmod uses to load a module, I'll point out the data structures and function prototypes used. (The entire source is far too long to print here; source for insmod and related utilities, as well as the entire Linux kernel, is freely available. See the accompanying text box entitled "Getting Linux.")

Step 1. Open the object file and read it piece by piece. Linux systems use the classic a.out object-file format (although ELF support is becoming available, and the newest versions of the module utilities support it). The data structures used by insmod, defined in both insmod and <a.out.h>, are in Listing One. An a.out object file is stored as a header, followed by the text and data segments, relocation information, the symbol table, and the string table; see Figure 1. Each portion of the file is read and stored by insmod for later use.
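As a rough sketch of how the header drives the reading of each piece, the following computes where every part of the file begins from the sizes in the header. It assumes a plain relocatable object in which the text segment starts immediately after the 32-byte header; other a.out variants pad differently. The struct aout_layout type and the function name are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed-width mirror of struct exec from Listing One (8 x 32-bit fields). */
struct aout_header {
    uint32_t a_info, a_text, a_data, a_bss;
    uint32_t a_syms, a_entry, a_trsize, a_drsize;
};

/* Hypothetical result type: byte offset of each part within the file. */
struct aout_layout {
    uint32_t text, data, text_reloc, data_reloc, symtab, strtab;
};

/* Assumes the text segment begins right after the header. */
struct aout_layout aout_compute_layout(const struct aout_header *h)
{
    struct aout_layout l;

    assert(h != NULL);
    l.text       = (uint32_t)sizeof(struct aout_header);
    l.data       = l.text + h->a_text;
    l.text_reloc = l.data + h->a_data;
    l.data_reloc = l.text_reloc + h->a_trsize;
    l.symtab     = l.data_reloc + h->a_drsize;
    l.strtab     = l.symtab + h->a_syms;
    return l;
}
```

Note that the BSS segment contributes nothing to the file layout: it occupies no space on disk and is only zeroed once the module is in memory.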

The symbol table is stored in the object file as an array of struct nlist. The symbol names are actually found in the string table, located immediately after the symbol table in the file. Each symbol-table entry contains the offset (into the string table) of the associated name in the n_un.n_strx member.
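Turning a string-table offset into a name might look like the sketch below. The struct disk_symbol type is a cut-down stand-in for struct nlist that keeps only the fields needed here, and the bounds checks are an addition for safety rather than something taken from insmod.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified on-disk symbol entry; only the fields used in this sketch. */
struct disk_symbol {
    uint32_t n_strx;   /* Offset of the name within the string table */
    uint32_t n_value;  /* Symbol value */
};

/* Returns the NUL-terminated name of sym, or NULL when the offset lies
 * outside the string table or the name is not terminated inside it. */
const char *symbol_name(const struct disk_symbol *sym,
                        const char *strtab, size_t strtab_size)
{
    assert(sym != NULL && strtab != NULL);
    if (sym->n_strx >= strtab_size)
        return NULL;
    if (memchr(strtab + sym->n_strx, '\0', strtab_size - sym->n_strx) == NULL)
        return NULL;
    return strtab + sym->n_strx;
}
```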

Step 2. Read the symbol table. Within insmod, each symbol is read and inserted into a binary tree (actually, a splay tree) to make symbol lookups for relocation more efficient. The addresses of the symbols _init_module and _cleanup_module, the module initialization and deletion functions, are saved for later use.
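The symbol tree can be pictured with the sketch below. To stay short it uses a plain, unbalanced binary search tree keyed by symbol name rather than a splay tree, so it shows the lookup structure but not the self-adjusting behavior. The node layout loosely follows struct symbol in Listing One; the function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct sym_node {
    const char *name;            /* Borrowed pointer; not copied */
    unsigned long value;         /* Filled in later, during resolution */
    struct sym_node *child[2];   /* [0] = smaller names, [1] = larger names */
};

/* Inserts name into the tree rooted at *root. Returns the node for name
 * (existing or newly created), or NULL if allocation fails. */
struct sym_node *sym_insert(struct sym_node **root, const char *name)
{
    assert(root != NULL && name != NULL);
    while (*root != NULL) {
        int cmp = strcmp(name, (*root)->name);
        if (cmp == 0)
            return *root;
        root = &(*root)->child[cmp > 0];
    }
    *root = calloc(1, sizeof **root);
    if (*root != NULL)
        (*root)->name = name;
    return *root;
}

/* Returns the node for name, or NULL if it is not in the tree. */
struct sym_node *sym_find(struct sym_node *root, const char *name)
{
    assert(name != NULL);
    while (root != NULL) {
        int cmp = strcmp(name, root->name);
        if (cmp == 0)
            return root;
        root = root->child[cmp > 0];
    }
    return NULL;
}
```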

Step 3. Resolve external references. insmod obtains the list of resident and transient kernel symbols using the get_kernel_syms() system call; see Listing Two. This call returns an array of struct kernel_sym, each entry of which contains the name and kernel address of a kernel symbol. If the name field begins with the # character, the address field contains the kernel address of a struct module describing a previously loaded module. The entries following those referring to a struct module contain the names and addresses of symbols in that module. Kernel-resident symbols are preceded by a "dummy" entry with the name field #.

For example, let's say that two modules, gonzo and alice, were loaded in that order. gonzo provides the transient symbols _gonzo_1 and _gonzo_2, while alice provides the symbol _alice_3. The kernel symbol table will look like Table 1.

Note that modules are listed in reverse order of loading and that kernel-resident symbols are listed last. This property allows modules to override symbols provided by previously loaded modules or the kernel itself. The entries containing addresses to struct module precede the symbols from the corresponding module and allow insmod to keep track of the modules referenced by the module to be loaded.
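A sketch of how a loader could walk such a list follows. It returns the first entry that matches a name, which is what lets later-loaded modules take precedence, and it reports the # entry that owns the match so the caller can record a module reference. The struct ksym type mirrors struct kernel_sym from Listing Two; the name-length constant is an assumed value for this example, not one taken from the article.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SYM_NAME_LEN 60  /* Assumed size for this sketch only */

struct ksym {
    unsigned long value;
    char name[SYM_NAME_LEN];
};

/* Finds the first entry called name. On success, returns the entry and
 * sets *owner to the preceding "#module" marker (the plain "#" marker for
 * resident symbols). Returns NULL if the name does not appear. */
const struct ksym *ksym_lookup(const struct ksym *table, size_t count,
                               const char *name, const struct ksym **owner)
{
    const struct ksym *current_owner = NULL;
    size_t i;

    assert(table != NULL && name != NULL && owner != NULL);
    for (i = 0; i < count; i++) {
        if (table[i].name[0] == '#') {
            current_owner = &table[i];   /* Start of a new module's symbols */
            continue;
        }
        if (strcmp(table[i].name, name) == 0) {
            *owner = current_owner;
            return &table[i];
        }
    }
    return NULL;
}
```

Run against the entries of Table 1, a lookup of _gonzo_2 would report #gonzo as the owner, while _do_mmap would report the dummy # entry.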

Once the kernel symbols have been obtained, insmod looks up each one in the splay tree constructed from the external references made by the module. When a match is found, insmod updates the n_value member of the corresponding struct nlist entry with the actual symbol value, which is a kernel address. If any references in the tree remain unresolved after the data from get_kernel_syms() has been processed, insmod complains of an undefined symbol and exits.

Step 4. Relocate with kernel addresses by updating the addresses in the text and data segments of the module that use the symbol values obtained from the kernel. Sixty-four bits are stored for each address to be relocated in a struct relocation_info; see Listing One. Each address that refers to an external symbol is updated using the relocation information, which is stored in the object file after the data segment. Once this is complete, all external references in the module point to the correct kernel addresses.
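The core of a relocation pass can be sketched as follows. This is a deliberately simplified illustration, not the insmod routine: it handles only external, non-PC-relative entries, treats the value already stored in the segment as an addend, and assumes the host byte order matches that of the object file. The struct reloc type is an unpacked, hypothetical stand-in for the bit fields of struct relocation_info.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Unpacked copy of the struct relocation_info fields used here. */
struct reloc {
    uint32_t address;    /* Offset of the field within the segment */
    unsigned length;     /* Field is (1 << length) bytes wide */
    unsigned is_extern;  /* 1 if the field refers to an external symbol */
};

/* Adds symbol_value to the addend stored in the segment. Returns 0 on
 * success, -1 if the entry is not external, is too wide, or is out of range. */
int apply_extern_reloc(unsigned char *segment, size_t segment_size,
                       const struct reloc *r, uint32_t symbol_value)
{
    size_t width;

    assert(segment != NULL && r != NULL);
    if (!r->is_extern || r->length > 2)
        return -1;
    width = (size_t)1 << r->length;
    if (r->address > segment_size || width > segment_size - r->address)
        return -1;

    if (width == 1) {
        segment[r->address] = (unsigned char)(segment[r->address] + symbol_value);
    } else if (width == 2) {
        uint16_t v;
        memcpy(&v, segment + r->address, sizeof v);
        v = (uint16_t)(v + symbol_value);
        memcpy(segment + r->address, &v, sizeof v);
    } else {
        uint32_t v;
        memcpy(&v, segment + r->address, sizeof v);
        v += symbol_value;
        memcpy(segment + r->address, &v, sizeof v);
    }
    return 0;
}
```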

Step 5. Allocate kernel memory for the module by calling the create_module() system call; see Listing Two. Pass create_module() the name of the module (a string generated from the name of the object file; for example, the module gonzo.o will have the name gonzo) and the total size of the module--the sum of the text, data, and BSS segment sizes.
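The two arguments could be prepared along these lines. The helpers are hypothetical; in particular, stripping leading directories as well as the trailing .o is an assumption about how the name is derived.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Derives a module name from an object-file path by dropping any leading
 * directories and a trailing ".o". Returns 0 on success, or -1 if the
 * result is empty or does not fit in out. */
int module_name_from_path(const char *path, char *out, size_t out_size)
{
    const char *base;
    size_t len;

    assert(path != NULL && out != NULL);
    base = strrchr(path, '/');
    base = (base != NULL) ? base + 1 : path;
    len = strlen(base);
    if (len >= 2 && strcmp(base + len - 2, ".o") == 0)
        len -= 2;
    if (len == 0 || len >= out_size)
        return -1;
    memcpy(out, base, len);
    out[len] = '\0';
    return 0;
}

/* Total size requested from the kernel: text + data + BSS, in bytes. */
unsigned long module_total_size(unsigned long text, unsigned long data,
                                unsigned long bss)
{
    return text + data + bss;
}
```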

Step 6. Load the module into kernel memory using the init_module() system call; see Listing Two. This call takes a number of arguments, and insmod must build up the associated data structures before making the call. The parameters include the module name, the code (just a character array containing the text and data segments), the size of the module code in bytes, a struct mod_routines containing pointers to the module's init_module and cleanup_module routines, and a struct symbol_table that describes the symbols exported by the module and the other modules referenced. The struct symbol_table is constructed from the symbol tree built up by insmod. As the loader resolves references to kernel symbols, it keeps track of the modules referenced in a list that becomes part of the struct symbol_table parameter. Note that the init_module() system call is not the same as the init_module() routine provided by the module.

This summarizes insmod operation. The program also includes many other options; see the code and associated man pages for details.

Kernel Details

Most of the work involved in loading a module takes place within insmod. However, it is instructive to look at the implementation of the various system calls used by the loader.

Consider init_module(). First, the struct module is located by name on the linked list of modules maintained by the kernel. The module code is copied from user space into the memory allocated by create_module(), and the BSS portion is zeroed out. The module cleanup-routine address is stored in the struct module for later use.

Next, the list of transient symbols to export for later module loads is updated with the information contained in the struct symbol_table parameter. The size element of this structure is read, kernel memory allocated, and the structure copied into the kernel. Sanity checking is done to ensure that the fields of this structure make sense.

The format of the struct symbol_table passed to init_module() is shown in Figure 2. Note that the string table is stored immediately after the symbol table itself in the memory passed to init_module(). struct symbol_table contains an array of struct internal_symbol, with each entry holding the name and address of a symbol exported by the module. The name field of this structure is actually an offset from the beginning of the structure, pointing to the location of the actual string stored after the symbol table in user memory. The string table doesn't show up in the definition of struct symbol_table, but it's there. The size element includes the size of the string table. In this way, the symbol table and associated names can be copied from a single block of user memory. After copying the data, the kernel adds the base address of the newly allocated symbol table to each name field, so that the absolute address for each name will be correct.
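The offset trick is easier to see in code. The sketch below packs symbols and their names into one block with names stored as offsets, then rebases the names after the block has been copied, as the kernel does. The types are simplified stand-ins for struct symbol_table and struct internal_symbol (no module references, integer fields instead of pointers), and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct packed_symbol {
    uintptr_t addr;  /* Address of the exported symbol */
    uintptr_t name;  /* Offset from block start; absolute after fixup */
};

struct packed_table {
    size_t size;                    /* Total bytes, including the strings */
    size_t n_symbols;
    struct packed_symbol symbol[];  /* Strings follow the last entry */
};

/* Builds one contiguous block holding the table and its string table.
 * The caller frees the result. Returns NULL if allocation fails. */
struct packed_table *pack_symbols(const char *const names[],
                                  const uintptr_t addrs[], size_t n)
{
    size_t strings = 0, offset, i;
    struct packed_table *t;

    assert(names != NULL && addrs != NULL);
    for (i = 0; i < n; i++)
        strings += strlen(names[i]) + 1;
    offset = sizeof *t + n * sizeof t->symbol[0];
    t = malloc(offset + strings);
    if (t == NULL)
        return NULL;
    t->size = offset + strings;
    t->n_symbols = n;
    for (i = 0; i < n; i++) {
        size_t len = strlen(names[i]) + 1;
        t->symbol[i].addr = addrs[i];
        t->symbol[i].name = offset;          /* Relative, so block can move */
        memcpy((char *)t + offset, names[i], len);
        offset += len;
    }
    return t;
}

/* After the block has been copied, turns each name offset into an
 * absolute address by adding the block's new base address. */
void fixup_names(struct packed_table *t)
{
    size_t i;

    assert(t != NULL);
    for (i = 0; i < t->n_symbols; i++)
        t->symbol[i].name += (uintptr_t)t;
}
```

Because every name is relative to the start of the block, a single copy of size bytes moves the whole table, and the fixup works wherever the copy lands.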

The last step is to update the list of references to other modules. The struct module_ref array contains pointers to modules being referenced by the new module. The kernel adds the new module to the dependency list for each referenced module, after checking that each such module is in fact loaded. In other words, each referenced module points back to the module being loaded. The kernel won't allow a module to be unloaded unless this dependency list is empty.
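The bookkeeping behind this rule might be sketched as follows. The types are reduced versions of struct module and struct module_ref from Listing Two, and the function names are invented for the example; the real kernel code also validates that each referenced module is loaded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct mod;

struct mod_ref {
    struct mod *user;       /* Module that depends on the list's owner */
    struct mod_ref *next;
};

struct mod {
    const char *name;
    struct mod_ref *users;  /* Modules that reference this one */
};

/* Records that user depends on provider. Returns 0, or -1 on failure. */
int mod_add_user(struct mod *provider, struct mod *user)
{
    struct mod_ref *ref;

    assert(provider != NULL && user != NULL);
    ref = malloc(sizeof *ref);
    if (ref == NULL)
        return -1;
    ref->user = user;
    ref->next = provider->users;
    provider->users = ref;
    return 0;
}

/* Removes one reference from user to provider, if present. */
void mod_drop_user(struct mod *provider, const struct mod *user)
{
    struct mod_ref **link;

    assert(provider != NULL);
    for (link = &provider->users; *link != NULL; link = &(*link)->next) {
        if ((*link)->user == user) {
            struct mod_ref *dead = *link;
            *link = dead->next;
            free(dead);
            return;
        }
    }
}

/* A module may be unloaded only when nothing references it. */
int mod_can_unload(const struct mod *m)
{
    assert(m != NULL);
    return m->users == NULL;
}
```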

Once this is complete, the kernel executes the module's own init_module() function. If this succeeds, the module's state flag is set to MOD_RUNNING and the system call returns 0. The module is now loaded in the kernel, and its code and data can be accessed accordingly. If at any point the loading procedure fails, the module memory is freed and an appropriate error code is returned.

The module's init_module() routine is generally used to initialize the appropriate hooks that the rest of the kernel needs to access functions provided by the module. For example, in the case of a device driver written as a module, init_module() would register module routines in the table of callbacks required for each device.

  • delete_module(). This system call takes a single argument: the name of the module to delete. It simply searches for the corresponding struct module by name. If no modules reference the module to be deleted, the cleanup_module routine is called, and all memory associated with the module is freed. References to other modules are also cleared. The user program rmmod invokes this system call to unload modules.

New Features

This implementation corresponds to the module utilities for Linux kernel Version 1.1.67. As with many aspects of Linux, this code is constantly under development, and new features are added weekly.

The newest version of the module utilities (for 1.1.85) includes support for more-intelligent tracking of symbol versions. Instead of requiring modules to run on the kernel under which they were compiled, version information is attached to each kernel-resident symbol. While the kernel is compiled, the source file ksyms.c, which contains a list of resident symbols to export, is processed with gcc -E to expand the declarations of functions and data structures. A 32-bit CRC checksum is generated from each expanded symbol, and each symbol name is written along with its CRC to the file /usr/include/linux/modules/ksyms.ver. This checksum will change if the declaration of the associated kernel symbol changes.

When individual modules are compiled, the symbol names and checksums in this file are stored in a table. When insmod loads a module, the call to get_kernel_syms() returns the list of kernel symbols, as before, along with the CRC used in the running kernel. insmod checks that each checksum used when the module was compiled corresponds to the checksum in the running kernel. If the checksums don't match for any symbol, insmod won't allow the module to be loaded.
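To make the idea concrete, here is a sketch that checksums an expanded declaration and compares the result against the value in the running kernel. The article does not say which CRC variant the utilities use, so the common reflected CRC-32 (polynomial 0xEDB88320) stands in as an assumption, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-32 using the reflected polynomial 0xEDB88320. This variant
 * is an assumption; the real utilities may compute the CRC differently. */
uint32_t crc32_bytes(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t crc = 0xFFFFFFFFu;
    size_t i;
    int bit;

    assert(data != NULL || len == 0);
    for (i = 0; i < len; i++) {
        crc ^= p[i];
        for (bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Checksum of a symbol's preprocessed declaration text. */
uint32_t declaration_checksum(const char *expanded_declaration)
{
    assert(expanded_declaration != NULL);
    return crc32_bytes(expanded_declaration, strlen(expanded_declaration));
}

/* A module may use a symbol only when both checksums are equal. */
int checksums_agree(uint32_t compiled, uint32_t running)
{
    return compiled == running;
}
```

With this scheme a kernel rebuild that leaves a symbol's declaration untouched produces the same checksum, so modules that use only unchanged symbols can still be loaded.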

There you have it--loadable kernel modules. Most of the code is quite straightforward, but certain issues might not be so obvious. Again, I invite readers interested in this design (and in improving upon it!) to grab the code. Anyone is welcome to contribute to the development effort.

I'd like to thank the people responsible for the development of the module code: Jon Tombs, Bas Laarhoven, Jacques Gelinas, Jeremy Fitzhardinge, and especially Björn Ekwall, who gave me a great deal of information and help in preparing this article.

Getting Linux

Linux is a popular, free, UNIX-like operating system for PCs. It currently runs on 80386, 80486, and Pentium PCs, with ports for other systems underway.

If you're interested in obtaining Linux or learning more, there are a number of places to look. If you have Internet access, the ftp site sunsite.unc.edu:/pub/Linux/docs is the first place to go. Users with WWW access can look at the URL http://sunsite.unc.edu/mdw/linux.html. The first documents to read are the Linux Frequently Asked Questions list, the INFO-SHEET (which gives a technical introduction to the system), and the META-FAQ (which outlines the other documents available). Others to look at include the "HOWTO" documents that each detail a particular aspect of the system, such as installation or network configuration, and the Linux Documentation Project manuals. All of these are available at the FTP and WWW addresses given above.

To obtain Linux from the Internet, you need to select a "distribution," a set of ready-to-install software packages. The most popular distribution is Slackware, which can be obtained via ftp from sunsite.unc.edu:/pub/Linux/distributions/slackware and consists of a set of diskette images that you download and use to install the software on your own system. Linux is installed on its own partitions on your drives, and it exists independently of other operating systems such as MS-DOS, Windows, or OS/2. The Linux Installation HOWTO document describes how to obtain and install this distribution.

Linux, and much of the software that it supports, are covered by the GNU General Public License, which allows vendors to sell the software, and Linux is available from a number of software companies, usually on CD-ROM. The Linux Developer's Resource, a CD-ROM set from InfoMagic (800-800-6613, [email protected]) has the contents of the Linux ftp sites, several distributions, and documentation; it's updated every few months. If you don't have Internet access, this is a good place to start.

There are several books about Linux. I wrote the Linux Documentation Project manual, Linux Installation and Getting Started. It is available via the Internet and from many commercial Linux vendors (including InfoMagic). The Linux Bible is published by Yggdrasil ([email protected]) and contains all of the manuals and HOWTOs from the Linux Documentation Project in one book. Linux: Unleashing the Workstation in Your PC has been published by Springer-Verlag. My book, Running Linux, is available from O'Reilly & Associates.

The code described in this article is part of the Linux kernel sources, which are available on any Linux system. Alternatively, you can grab the current kernel source tree from sunsite.unc.edu:/pub/Linux/kernel/VERSION, where VERSION is the latest version of the kernel. (By the time you read this, v1.2 will be the "stable" kernel version, with new development continuing on v1.3.) The file linux/kernel/module.c in this tar file contains most of the kernel-level module code. The file modules-1.1.67.tar.gz contains the module utilities (insmod, rmmod, and so on) and some documentation. Again, a newer version of this package will be available when you read this.

--M.W.

Table 1: Example of kernel symbols returned by get_kernel_syms().

Name             Address
#alice           struct module describing alice.
_alice_3         alice_3.
#gonzo           struct module describing gonzo.
_gonzo_1         gonzo_1.
_gonzo_2         gonzo_2.
#                Dummy struct module for resident symbols.
_verify_area     Resident symbol verify_area.
_do_mmap         Resident symbol do_mmap.

Figure 1: a.out object-file format.

Figure 2: struct symbol_table passed to init_module().

Listing One


/* Data structures used for loading modules. */
/* Header for a.out object and executable files. */
struct exec {
    unsigned long a_info;  /* Describes object file. */
    unsigned a_text;       /* Length of text segment in bytes */
    unsigned a_data;       /* Length of data segment */
    unsigned a_bss;        /* Length of BSS segment */
    unsigned a_syms;       /* Length of symbol table */
    unsigned a_entry;      /* Entry point address */
    unsigned a_trsize;     /* Length of text relocation info */
    unsigned a_drsize;     /* Length of data relocation info */
};
/* The object file symbol table is an array of struct nlist. */
struct nlist {
    union {
      /* Only one of the following is available, based on
       * context. E.g., n_strx is used when the data is stored
       * in a file, n_name when in core.
       */
      char *n_name;          /* Symbol name */
      struct nlist *n_next;  /* Next symbol in list */
      long n_strx;           /* Index into string table */
    } n_un;
    unsigned char n_type;    /* Type of symbol, e.g., text or data */
    char n_other;            /* Unused by the system, but useful for insmod */
    short n_desc;            /* Used by symbolic debuggers */
    unsigned n_value;        /* Address of this symbol */
};
/* Binary tree of symbols used for relocation. Defined by insmod. */
struct symbol {
    struct nlist n;
    struct symbol *child[2];
};
/* Relocation information stored in the module object file */
struct relocation_info {
    int r_address;               /* Address to be relocated */
    unsigned int r_symbolnum:24; /* Index of symbol in symbol table */
    unsigned int r_pcrel:1;      /* 1 for PC-relative offset */
    unsigned int r_length:2;     /* Relocate (1<<r_length) bytes */
    unsigned int r_extern:1;     /* 1 if relocating with addr of symbol */
    unsigned int r_pad:4;        /* Unused */
};


Listing Two


/* System calls and data structures used by insmod. */
/* Obtains list of symbols from kernel for module relocation */
int get_kernel_syms(struct kernel_sym *table);

/* Allows kernel to allocate space for module */
int create_module(char *module_name, unsigned long size);

/* Sends module code and data to kernel, as well as init/cleanup
 * routines and symbol table used by module. */
int init_module(char *module_name, char *code, unsigned codesize,
  struct mod_routines *routines, struct symbol_table *symtab);

/* Removes module from kernel */
int delete_module(char *module_name);

/* An array of struct kernel_sym is returned by get_kernel_syms */
struct kernel_sym {
  unsigned long value;       /* Symbol value */
  char name[SYM_MAX_NAME];   /* Symbol name */
};
/* The init and cleanup functions provided by the module */
struct mod_routines {
  int (*init)(void);         /* Module init routine */
  void (*cleanup)(void);     /* Module cleanup routine */
};
/* Symbol table passed to init_module */
struct symbol_table {
  int size;                  /* Total size, including string table */
  int n_symbols;             /* Number of symbols */
  int n_refs;                /* Number of module references */
  struct internal_symbol symbol[0];  /* Array of symbols; space
                                      * allocated elsewhere */
  struct module_ref ref[0];          /* Array of module references */
};
/* Symbols provided by the module */
struct internal_symbol {
  void *addr;               /* Address of symbol */
  char *name;               /* Name of symbol */
};
/* Reference to another module. */
struct module_ref {
  struct module *module;   /* Module referenced */
  struct module_ref *next; /* Next module in list */
};
/* Kernel data structure describing a module */
struct module {
  struct module *next;         /* Next module in list */
  struct module_ref *ref;      /* List of modules referring to this one */
  struct symbol_table *symtab; /* Symbol table given to init_module */
  char *name;                  /* Name of module */
  int size;                    /* Size of module in (4K) pages */
  void* addr;                  /* Address of module code in kernel */
  int state;                   /* State (running, deleted, uninitialized) */
  void (*cleanup)(void);       /* Cleanup function */
};


Copyright © 1995, Dr. Dobb's Journal

