STATIC.x

Kunal Dawn

Device Drivers, Part 16: Kernel Window — Peeping through /proc

After many months, Shweta and Pugs got together for some peaceful technical romancing. All through, they had been using all kinds of kernel windows, especially through the /proc virtual filesystem (using cat), to help them decode various details of Linux device drivers. Here’s a non-exhaustive summary:

  • /proc/modules — dynamically loaded modules
  • /proc/devices — registered character and block major numbers
  • /proc/iomem — on-system physical RAM and bus device addresses
  • /proc/ioports — on-system I/O port addresses (especially for x86 systems)
  • /proc/interrupts — registered interrupt request numbers
  • /proc/softirqs — registered soft IRQs
  • /proc/kallsyms — running kernel symbols, including from loaded modules
  • /proc/partitions — currently connected block devices and their partitions
  • /proc/filesystems — currently active filesystem drivers
  • /proc/swaps — currently active swaps
  • /proc/cpuinfo — information about the CPU(s) on the system
  • /proc/meminfo — information about the memory on the system, viz., RAM, swap, …

Custom kernel windows

“Yes, these have been really helpful in understanding and debugging Linux device drivers. But is it possible for us to also provide some help? Yes, I mean can we create one such kernel window through /proc?” asked Shweta.

“Why just one? You can have as many as you want. And it’s simple — just use the right set of APIs, and there you go.”

“For you, everything is simple,” Shweta grumbled.

“No yaar, this is seriously simple,” smiled Pugs. “Just watch me creating one for you,” he added.
And in a jiffy, Pugs created the proc_window.c file below:

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/proc_fs.h>
#include <linux/jiffies.h>
#include <linux/uaccess.h>

static struct proc_dir_entry *parent, *file, *link;
static int state = 0;

/* Read handler: reports the current state and the time since boot */
int time_read(char *page, char **start, off_t off, int count, int *eof, void *data) {
    int len, val;
    unsigned long act_jiffies;

    len = sprintf(page, "state = %d\n", state);
    /* jiffies starts at INITIAL_JIFFIES rather than zero, so subtract it */
    act_jiffies = jiffies - INITIAL_JIFFIES;
    val = jiffies_to_msecs(act_jiffies);
    switch (state) {
        case 0: /* jiffies */
            len += sprintf(page + len, "time = %lu jiffies\n", act_jiffies);
            break;
        case 1: /* milliseconds */
            len += sprintf(page + len, "time = %d msecs\n", val);
            break;
        case 2: /* seconds and milliseconds */
            len += sprintf(page + len, "time = %ds %dms\n",
                    val / 1000, val % 1000);
            break;
        case 3: /* hours, minutes and seconds */
            val /= 1000;
            len += sprintf(page + len, "time = %02d:%02d:%02d\n",
                    val / 3600, (val / 60) % 60, val % 60);
            break;
        default:
            len += sprintf(page + len, "<not implemented>\n");
            break;
    }
    len += sprintf(page + len, "{offset = %ld; count = %d;}\n", (long)off, count);
    return len;
}

/* Write handler: accepts a single digit, optionally followed by a newline */
int time_write(struct file *file, const char __user *buffer, unsigned long count, void *data) {
    char kbuf[2];

    if ((count == 0) || (count > 2))
        return count;
    /* buffer points into user space, so copy it before looking at it */
    if (copy_from_user(kbuf, buffer, count))
        return -EFAULT;
    if ((count == 2) && (kbuf[1] != '\n'))
        return count;
    if ((kbuf[0] < '0') || ('9' < kbuf[0]))
        return count;
    state = kbuf[0] - '0';
    return count;
}
static int __init proc_win_init(void) {
    /* Directory /proc/anil (a NULL parent means /proc itself) */
    if ((parent = proc_mkdir("anil", NULL)) == NULL) {
        return -ENOMEM;
    }
    /* Regular file /proc/anil/rel_time, readable and writable by everyone */
    if ((file = create_proc_entry("rel_time", 0666, parent)) == NULL) {
        remove_proc_entry("anil", NULL);
        return -ENOMEM;
    }
    file->read_proc = time_read;
    file->write_proc = time_write;
    /* Soft link /proc/anil/rel_time_l pointing to rel_time */
    if ((link = proc_symlink("rel_time_l", parent, "rel_time")) == NULL) {
        remove_proc_entry("rel_time", parent);
        remove_proc_entry("anil", NULL);
        return -ENOMEM;
    }
    link->uid = 0;
    link->gid = 100;
    return 0;
}
static void __exit proc_win_exit(void) {
    remove_proc_entry("rel_time_l", parent);
    remove_proc_entry("rel_time", parent);
    remove_proc_entry("anil", NULL);
}
module_init(proc_win_init);
module_exit(proc_win_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
MODULE_DESCRIPTION("Kernel window /proc Demonstration Driver");

And then Pugs did the following:

  • Built the driver file (proc_window.ko) using the usual driver’s Makefile.
  • Loaded the driver using insmod.
  • Showed various experiments using the newly created proc windows. (Refer to Figure 1.)
  • And finally, unloaded the driver using rmmod.
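
The experiments amount to writing a digit into the window and reading the result back, much as echo and cat would do from the shell. Here is a minimal user-space sketch of the same thing in C. It assumes the module is loaded and the window sits at /proc/anil/rel_time; the helper names set_state() and dump_file() are made up for this illustration and are not part of the driver.

```c
#include <stdio.h>

/* Hypothetical helper: write one digit plus a newline to `path`,
 * as `echo 2 > path` would. Returns 0 on success, -1 on failure. */
int set_state(const char *path, int digit)
{
    FILE *fp;
    int ok;

    if (digit < 0 || digit > 9)
        return -1;
    fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    ok = (fprintf(fp, "%d\n", digit) == 2);
    /* fclose() flushes the buffer, so a failed write shows up here at the latest */
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Hypothetical helper: copy `path` to `out` until end of file, as `cat path`
 * would. Files under /proc typically report a size of 0, so the loop reads
 * until EOF instead of trusting the size.
 * Returns the number of bytes copied, or -1 if the file cannot be opened. */
long dump_file(const char *path, FILE *out)
{
    char buf[256];
    size_t n;
    long total = 0;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;
    while ((n = fread(buf, 1, sizeof(buf), fp)) > 0) {
        fwrite(buf, 1, n, out);
        total += (long)n;
    }
    fclose(fp);
    return total;
}
```

With the driver loaded, calling set_state("/proc/anil/rel_time", 2) and then dump_file("/proc/anil/rel_time", stdout) should print the state followed by the uptime in seconds and milliseconds.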

Figure 1: Peeping through /proc

Demystifying the details

Starting from the constructor proc_win_init(), three proc entries have been created:

  • Directory anil under /proc (i.e., NULL parent) with default permissions 0755, using proc_mkdir()
  • Regular file rel_time in the above directory, with permissions 0666, using create_proc_entry()
  • Soft link rel_time_l to the file rel_time, in the same directory, using proc_symlink()

The corresponding removal of these is done with remove_proc_entry() in the destructor, proc_win_exit(), in the reverse order of their creation.

For every entry created under /proc, a corresponding struct proc_dir_entry is created. For each, many of its fields could be further updated as needed:

  • mode — Permissions of the file
  • uid — User ID of the file
  • gid — Group ID of the file

Additionally, for a regular file, the following two function pointers for reading and writing over the file could be provided, respectively:

  • int (*read_proc)(char *page, char **start, off_t off, int count, int *eof, void *data)
  • int (*write_proc)(struct file *file, const char __user *buffer, unsigned long count, void *data)

write_proc() is very similar to the character driver’s file operation write(). The above implementation lets the user write a single digit from 0 to 9, and sets the internal state accordingly. read_proc() in the above implementation provides the current state, and the time since the system was booted, in units that depend on the current state: jiffies in state 0; milliseconds in state 1; seconds and milliseconds in state 2; hours, minutes and seconds in state 3; and <not implemented> in other states.
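
The two handlers can be tried out without a kernel by mirroring their logic in user space. The sketch below does that under stated assumptions: parse_state() and format_time() are hypothetical names, the tick rate hz is passed in by hand, and milliseconds are computed by plain division, whereas the driver relies on jiffies_to_msecs().

```c
#include <stdio.h>
#include <stddef.h>

/* Mirror of the write-side check: accept one digit, optionally followed
 * by a newline. Returns the new state, or -1 if the input is ignored. */
int parse_state(const char *buf, size_t count)
{
    if (count == 0 || count > 2)
        return -1;
    if (count == 2 && buf[1] != '\n')
        return -1;
    if (buf[0] < '0' || buf[0] > '9')
        return -1;
    return buf[0] - '0';
}

/* Mirror of the read-side switch: format `ticks` (jiffies since boot)
 * according to `state`. `hz` is the assumed tick rate and must be non-zero.
 * Returns whatever snprintf() returns. */
int format_time(char *out, size_t size, int state, unsigned long ticks,
                unsigned int hz)
{
    unsigned long long ms = (unsigned long long)ticks * 1000ULL / hz;
    unsigned long long s = ms / 1000;

    switch (state) {
    case 0: /* raw jiffies */
        return snprintf(out, size, "time = %lu jiffies\n", ticks);
    case 1: /* milliseconds */
        return snprintf(out, size, "time = %llu msecs\n", ms);
    case 2: /* seconds and milliseconds */
        return snprintf(out, size, "time = %llus %llums\n", s, ms % 1000);
    case 3: /* hours, minutes and seconds */
        return snprintf(out, size, "time = %02llu:%02llu:%02llu\n",
                        s / 3600, (s / 60) % 60, s % 60);
    default:
        return snprintf(out, size, "<not implemented>\n");
    }
}
```

For instance, with an assumed tick rate of 250, 375 ticks in state 2 come out as "time = 1s 500ms", and any state from 4 to 9 is accepted by parse_state() but formats as <not implemented>.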

To check the accuracy of the computation, Figure 2 highlights the system uptime in the output of top. The page parameter of read_proc is a page-sized buffer, typically to be filled with count bytes from offset off. More often than not, though, the content is small, so the handler just fills the page and ignores all the other parameters.
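
For a handler that does want to respect off and count, the bookkeeping looks roughly like the user-space model below. This is only a sketch: fill_page() is a hypothetical name, the content is a plain string, and the kernel's handling of the start argument is not modelled at all.

```c
#include <string.h>
#include <sys/types.h>

/* User-space model of a read handler that honours `off` and `count`.
 * Copies at most `count` bytes of `content`, starting at offset `off`,
 * into `page`, and sets *eof once the end of the content has been reached.
 * The caller must make sure `page` can hold `count` bytes.
 * Returns the number of bytes placed in `page`, or -1 on bad arguments. */
int fill_page(char *page, off_t off, int count, int *eof, const char *content)
{
    size_t total = strlen(content);
    size_t left, n;

    if (off < 0 || count < 0)
        return -1;
    if ((size_t)off >= total) {     /* nothing left to hand out */
        *eof = 1;
        return 0;
    }
    left = total - (size_t)off;
    n = left < (size_t)count ? left : (size_t)count;
    memcpy(page, content + off, n);
    *eof = (n == left);             /* last chunk delivered? */
    return (int)n;
}
```

The driver above skips all of this because its output is only a few dozen bytes, far less than a page.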

Figure 2: Comparison with top’s output

All the /proc-related structure definitions and function declarations are available through <linux/proc_fs.h>. The jiffies-related function declarations and macro definitions are in <linux/jiffies.h>. As a special note, the actual jiffies are calculated by subtracting INITIAL_JIFFIES, since on boot-up, jiffies is initialised to INITIAL_JIFFIES instead of zero.
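
To see why the subtraction is safe, here is a small user-space model of it. To the best of my knowledge, the kernel headers pick INITIAL_JIFFIES so that a 32-bit counter wraps around about five minutes after boot, which flushes out wrap-around bugs early. The names HZ_SIM, INITIAL_JIFFIES_SIM and ticks_since_boot() are stand-ins invented for this sketch, and the tick rate of 250 is an assumption.

```c
#include <stdint.h>

#define HZ_SIM 250u  /* assumed tick rate for this sketch */

/* 32-bit model of INITIAL_JIFFIES: the counter starts 300 seconds'
 * worth of ticks before it wraps around to zero. */
#define INITIAL_JIFFIES_SIM ((uint32_t)(0u - 300u * HZ_SIM))

/* Ticks elapsed since boot. Unsigned subtraction stays correct even
 * after the raw counter has wrapped past zero. */
uint32_t ticks_since_boot(uint32_t raw_jiffies)
{
    return (uint32_t)(raw_jiffies - INITIAL_JIFFIES_SIM);
}
```

In this model a raw counter value of 0 corresponds to exactly 300 seconds after boot, that is, 75000 ticks at the assumed rate.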
