Continuing with Updating Kernel in Lucid, this time I want to decrease the overall build time. My benchmarks were run in Ubuntu 10.04 installed in VirtualBox, on an i5-2540M CPU at 2.6GHz.

I’m learning kernel code these days, and a minimal kernel saves a lot of build time. As you can see, it took 64min to build 2772 modules when running the oldconfig target:

Target                              Build Time   Modules Built   Package Size
oldconfig                           64min        2772            33MB
localmodconfig                      16min        244             7MB
localmodconfig + ccache (1st run)   19min        244             7MB
localmodconfig + ccache (2nd run)   7min         244             7MB

Fortunately, a new build target, localmodconfig, was added in kernel 2.6.32 that helps with exactly this:

It runs “lsmod” to find all the modules loaded on the current running system. It then reads all the Makefiles to map which CONFIG option enables each module, and reads the Kconfig files to find the dependencies and selects that may be needed to support a CONFIG option. Finally, it reads the .config file and removes any module “=m” that is not needed to enable the currently loaded modules. With this tool, you can strip a distro .config of all the drivers that are not needed on your machine, and the kernel will take much less time to build.

The build time decreased dramatically to 16min, building only 244 modules. The VM could still boot to the desktop, and everything worked fine. However, mounting an *.iso file failed, since the needed module was not loaded (and thus not reported by lsmod) at build time, I think. To use the localmodconfig target, run:
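A minimal sketch of the invocation, run from the top of the kernel source tree:

```shell
# Load any modules you will need later (e.g. for mounting an .iso)
# BEFORE running this, since only currently loaded modules are kept.
make localmodconfig
```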

It may end with errors. Ignore them; the new .config file has already been generated. Then remember to turn off the CONFIG_DEBUG_KERNEL option in the .config file, as mentioned in my previous article.

Then ccache comes in. I downloaded the source code and built it myself, since the 3.x versions seem to be faster than the 2.4.x version:
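A sketch of the build steps, assuming the ccache 3.x tarball is unpacked in the current directory (paths are illustrative):

```shell
./configure              # default prefix is /usr/local
make
sudo make install
# Let ccache masquerade as the compiler via symlinks named after it:
sudo ln -s /usr/local/bin/ccache /usr/local/bin/gcc
sudo ln -s /usr/local/bin/ccache /usr/local/bin/cc
```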

The default prefix (/usr/local) is used here. The last two lines create symbolic links, named after the compiler, that let ccache masquerade as the compiler, as suggested in ccache’s man page.

So why bother with a compiler cache? Doesn’t the makefile already avoid unnecessary recompilation?

If you ever run “make clean; make” then you can probably benefit from ccache. It is very common for developers to do a clean build of a project for a whole host of reasons, and this throws away all the information from your previous compiles. By using ccache you can get exactly the same effect as “make clean; make” but much faster. Compiler output is kept in $HOME/.ccache, by default.

The first run creates the cache, and the second benefits from the cache. That’s it.

To display ccache statistics, run:
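For ccache 3.x the statistics flag is -s:

```shell
ccache -s    # show cache hits, misses, and cache size
```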

First, from Intel’s Software Developer’s Manual, Volume 3A, Section 9.1.4:

The first instruction that is fetched and executed following a hardware reset is located at physical address FFFFFFF0H. This address is 16 bytes below the processor’s uppermost physical address. The EPROM containing the software-initialization code must be located at this address.

The address FFFFFFF0H is beyond the 1-MByte addressable range of the processor while in real-address mode. The processor is initialized to this starting address as follows. The CS register has two parts: the visible segment selector part and the hidden base address part. In real-address mode, the base address is normally formed by shifting the 16-bit segment selector value 4 bits to the left to produce a 20-bit base address. However, during a hardware reset, the segment selector in the CS register is loaded with F000H and the base address is loaded with FFFF0000H. The starting address is thus formed by adding the base address to the value in the EIP register (that is, FFFF0000 + FFF0H = FFFFFFF0H).
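The addition can be double-checked with shell arithmetic (a toy calculation, obviously not boot code):

```shell
# hidden CS base + initial EIP = reset vector
printf '%X\n' $(( 0xFFFF0000 + 0xFFF0 ))    # prints FFFFFFF0
```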

The first time the CS register is loaded with a new value after a hardware reset, the processor will follow the normal rule for address translation in real-address mode(that is, [CS base address = CS segment selector * 16]). To insure that the base address in the CS register remains unchanged until the EPROM based software-initialization code is completed, the code must not contain a far jump or far call or allow an interrupt to occur (which would cause the CS selector value to be changed).

Two screenshots show the instructions at addresses FFFFFFF0H and FFFF0H (the shadowed BIOS, see below) and their jumps: the first is an AMI BIOS, the second a Phoenix BIOS. The high BIOS of AMI jumps directly to the shadowed copy, and both the high and the shadowed copy jump to the same address; the high BIOS of Phoenix, however, just keeps running at high addresses. After all the jumps, the first instruction of both BIOSes is FAh, i.e., cli (disable interrupts). I’m not going to do more reverse engineering. 🙂

NOTE: Main memory is not initialized yet at this time. From here:

The motherboard ensures that the instruction at the reset vector is a jump to the memory location mapped to the BIOS entry point. This jump implicitly clears the hidden base address present at power up. All of these memory locations have the right contents needed by the CPU thanks to the memory map kept by the chipset. They are all mapped to flash memory containing the BIOS since at this point the RAM modules have random crap in them.

The reset vector is simply FFFFFFF0h. Now, POST is started as described here:

POST stands for Power On Self Test. It’s a series of individual functions or routines that perform various initializations and tests of the computer’s hardware. The BIOS starts with a series of tests of the motherboard hardware: the CPU, math coprocessor, timer ICs, DMA controllers, and IRQ controllers. The order in which these tests are performed varies from motherboard to motherboard.

Next, the BIOS will look for the presence of a video ROM between memory locations C000:000h and C780:000h. If a video BIOS is found, its contents will be tested with a checksum test. If this test is successful, the BIOS will initialize the video adapter: it passes control to the video BIOS, which will in turn initialize itself and then return control once it’s complete. At this point, you should see things like a logo from the video card manufacturer, a video card description, or the video card BIOS information.

Next, the BIOS will scan memory from C800:000h to DF800:000h in 2KB increments, searching for any other ROMs that might be installed in the computer, such as network adapter cards or SCSI adapter cards. If an adapter ROM is found, its contents are tested with a checksum test. If the tests pass, the card is initialized; control is passed to each ROM for initialization, and the system BIOS resumes control after each one finishes. If these tests fail, you should see an error message displayed telling you “XXXX ROM Error”, where XXXX indicates the segment address at which the faulty ROM was detected.

Next, the BIOS will check memory at 0000:0472h. This address contains a flag which tells the BIOS whether the system is starting from a cold boot or a warm boot. A value of 1234h at this address tells the BIOS that the system was started from a warm boot. This signature value is stored in Intel little-endian format, that is, the least significant byte comes first, so it appears in memory as the sequence 34 12. In the event of a warm boot, the BIOS will skip the remaining POST routines; if a cold start is indicated, the remaining POST routines will be run.

NOTE: Main memory is initialized during POST. The main part of the memory initialization code is complicated and is provided directly by Intel; it is known as the MRC (Memory Reference Code).

There’s one step in POST called BIOS Shadowing:

Shadowing refers to the technique of copying BIOS code from slow ROM chips into faster RAM chips during boot-up so that any access to BIOS routines will be faster. DOS and other operating systems may access BIOS routines frequently. System performance is greatly improved if the BIOS is accessed from RAM rather than from a slower ROM chip.

A DRAM control register, PAM0 (Programmable Attribute Map), makes it possible to independently redirect reads and writes in the BIOS ROM area to main memory. The idea is to allow RAM shadowing: read accesses to the ROM area come from main memory, while writes continue to go to the ROM. Refer to Intel’s MCH datasheet for details:

This register controls the read, write, and shadowing attributes of the BIOS area from 0F0000h–0FFFFFh. The (G)MCH allows programmable memory attributes on 13 Legacy memory segments of various sizes in the 768 KB to 1 MB address range. Seven Programmable Attribute Map (PAM) Registers are used to support these features. Cacheability of these areas is controlled via the MTRR registers in the processor.

Big real mode (or unreal mode) is used to address memory beyond 1M, as BIOS ROMs become larger and larger. In big real mode, one or more data segment registers are loaded with 32-bit base addresses and limits, but the code segment stays unchanged:

Segment                          Real Mode   Big Real Mode   Protected Mode
Code segment (cs)                1M          1M              4G
Data segments (ds, es, fs, gs)   1M          4G              4G

Protected mode can also address 4G of memory, but since the BIOS is mainly written for real mode, big real mode is the more convenient choice for addressing.

Then, the BIOS continues on to find a bootable device; see Wikipedia:

The BIOS selects candidate boot devices using information collected by POST and configuration information from EEPROM, CMOS RAM or, in the earliest PCs, DIP switches. Option ROMs may also influence or supplant the boot process defined by the motherboard BIOS ROM. The BIOS checks each device in order to see if it is bootable. For a disk drive or a device that logically emulates a disk drive, such as a USB Flash drive or perhaps a tape drive, to perform this check the BIOS attempts to load the first sector (boot sector) from the disk to address 7C00 hexadecimal, and checks for the boot sector signature 0x55 0xAA in the last two bytes of the sector. If the sector cannot be read (due to a missing or blank disk, or due to a hardware failure), or if the sector does not end with the boot signature, the BIOS considers the disk unbootable and proceeds to check the next device. Another device such as a network adapter attempts booting by a procedure that is defined by its option ROM (or the equivalent integrated into the motherboard BIOS ROM). The BIOS proceeds to test each device sequentially until a bootable device is found, at which time the BIOS transfers control to the loaded sector with a jump instruction to its first byte at address 7C00 hexadecimal (1 KiB below the 32 KiB mark).

After all of the above, BIOS initialization is finished. It’s your turn to take control of the system from address 0000:7C00!

Why this address? It was defined neither by Intel nor by Microsoft; it was decided by the IBM PC 5150 BIOS developer team (David Bradley). See here:

The BIOS developer team decided on 0x7C00 because:

– They wanted to leave as much room as possible for the OS to load itself within the 32KB.
– The 8086/8088 used 0x0–0x3FF for the interrupt vector table, and the BIOS data area came right after it.
– The boot sector was 512 bytes, and the stack/data area for the boot program needed another 512 bytes.
– So 0x7C00, the start of the last 1024 bytes of the 32KB, was chosen.
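The arithmetic of that choice, as a quick shell check:

```shell
# 32KB minus 1KB (512B boot sector + 512B stack/data)
printf '0x%X\n' $(( 32 * 1024 - 1024 ))     # prints 0x7C00
```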

Ubuntu Lucid (10.04) originally ships with the 2.6.32 kernel, but on my ThinkPad T420 the wireless card was not recognized and the graphics card did not function well. So I switched to the 2.6.38 backport kernel and installed the bumblebee package to utilize Nvidia Optimus Technology. Now the 3.0.0-16 backport kernel is out; it contains the “rework ASPM disable code” fix, so it should do a better job of power saving even when using the discrete Nvidia card. Moreover, it’s the new LTS kernel, so I decided to update to the 3.0 kernel. Follow the steps below if you are interested:

1. Add X-Updates PPA
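A sketch of the commands (the PPA name matches the ubuntu-x-swat X-Updates archive of that era, and nvidia-current was the package name then; both are my assumptions here):

```shell
sudo add-apt-repository ppa:ubuntu-x-swat/x-updates
sudo apt-get update
sudo apt-get install nvidia-current
```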

These commands install the official nvidia driver. Currently, it’s version 295.20.

2. Enable Nvidia Driver
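On Ubuntu of this era the OpenGL switch was done through the alternatives system; a sketch (the alternative name gl_conf is my assumption for this release):

```shell
sudo update-alternatives --config gl_conf   # choose nvidia over mesa
```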

This will let you choose between OpenGL implementations; select nvidia over mesa. It will also enable the nvidia Xorg driver, blacklist the nouveau driver, and add nvidia-xconfig into /usr/bin. You may see warnings like:

Just ignore them; they seem to be harmless.
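Then run, as root:

```shell
sudo nvidia-xconfig
```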

This will generate a new /etc/X11/xorg.conf file for your Nvidia card. If you cannot find the command, its original location is: /usr/lib/nvidia-current/bin/nvidia-xconfig

3. Fix ld Bindings

This just adds an ld path under /etc/; otherwise, the glx module cannot be loaded correctly. Here are the relevant segments from /var/log/Xorg.0.log:

Now, update ld runtime bindings and reboot.
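A sketch of those two steps:

```shell
sudo ldconfig    # rebuild the runtime linker cache
sudo reboot
```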

4. Verify

If your installation is successful, the output looks like:
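One way to verify (my suggestion, not necessarily the original command) is glxinfo, which for a working driver typically reports:

```shell
glxinfo | grep -E "direct rendering|OpenGL vendor"
# direct rendering: Yes
# OpenGL vendor string: NVIDIA Corporation
```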

After installing the driver, hedgewars shows 120fps, while it used to show 4fps. A great improvement. 🙂


This post covers loop usage in the bash shell. NOTE: read the inline comments carefully 🙂

1. for loop
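Hedged examples of the two for-loop forms (my own snippets; values are illustrative):

```shell
# iterate over a list of words
for fruit in apple banana cherry; do
    echo "I like $fruit"
done

# C-style for loop
for (( i = 1; i <= 3; i++ )); do
    echo "iteration $i"
done
```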

2. while loop
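A sketch of a while loop (my own example):

```shell
count=1
while [ $count -le 5 ]; do
    echo "count is $count"
    count=$(( count + 1 ))
done
```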

3. until loop
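An until loop runs until its condition becomes true; a sketch:

```shell
n=3
until [ $n -eq 0 ]; do
    echo "$n remaining"
    n=$(( n - 1 ))
done
```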

4. break & continue

There may be times when you’re in an inner loop but need to stop the outer loop. The break command accepts a single command-line parameter: break n, where n indicates the level of the loop to break out of. By default, n is 1, meaning break out of the current loop. If you set n to 2, the break command stops the next level of the outer loop.
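A sketch of break with a level argument, and of continue (my own snippets):

```shell
for a in 1 2 3; do
    for b in 1 2 3; do
        if [ $b -eq 2 ]; then
            break 2              # break out of BOTH loops (level 2)
        fi
        echo "$a-$b"
    done
done
# prints only: 1-1

for i in 1 2 3 4; do
    if [ $i -eq 2 ]; then
        continue                 # skip the rest of this iteration only
    fi
    echo "$i"
done
```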

5. redirect & pipe

Finally, you can either pipe or redirect the output of a loop within your shell script.
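A sketch of both forms (the file path is illustrative):

```shell
# redirect the loop's whole output to a file
for state in Alabama Alaska Arizona; do
    echo "$state"
done > /tmp/states.txt

# or pipe it into another command
for state in Texas Utah Ohio; do
    echo "$state"
done | sort
```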


When creating a shell script file, you must specify the shell you are using in the first line of the file. The format for this is:
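For bash, the first line is #!/bin/bash; the two commands after it here are just an illustrative script body:

```shell
#!/bin/bash
# This script displays the date and who's logged on
date
who
```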

In a normal shell script line, the pound sign (#) marks a comment line, which isn’t processed by the shell. However, the first line of a shell script file is a special case: the pound sign followed by the exclamation point tells the shell which shell to run the script under (yes, you can be using the bash shell and run your script using another shell).

2. Display

The echo command can display a simple text string if you add the string following the command.

The echo command uses either double or single quotes to delineate text strings. If you use them within your string, you need to use one type of quote within the text and the other type to delineate the string.
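Hedged examples of the quoting rules (my own strings):

```shell
echo "The current user is: $USER"            # variables expand inside double quotes
echo 'In single quotes, $USER stays literal'
echo "This string contains a single quote: don't worry"
echo 'Rich says "scripting is easy".'        # double quotes inside single quotes
```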

Notice that the environment variables in the echo commands are replaced by their current values when the script is run. Also notice that we were able to place the $USER system variable within the double quotation marks in the first string, and the shell script was still able to figure out what we meant.

You may also see variables referenced using the format ${variable}. The extra braces around the variable name are often used to help identify the variable name from the dollar sign.

User variables can be any text string of up to 20 letters, digits, or an underscore character. User variables are case sensitive, so the variable Var1 is different from the variable var1. This little rule often gets novice script programmers in trouble.

Values are assigned to user variables using an equal sign. No spaces can appear between the variable, the equal sign, and the value (another trouble spot for novices). Here are a few examples of assigning values to user variables.
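For instance (values are illustrative):

```shell
days=10                 # no spaces around the equal sign
guest="Katie"
echo "$guest checked in $days days ago"    # use $ when reading a value
days=15                                    # but no $ when assigning one
echo "$guest now checked in $days days ago"
```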

The shell script automatically determines the data type used for the variable value. Variables defined within the shell script maintain their values throughout the life of the shell script but are deleted when the shell script completes.

Just like system variables, user variables can be referenced using the dollar sign. It’s important to remember that when referencing a variable value you use the dollar sign, but when referencing the variable to assign a value to it, you do not use the dollar sign.

The backtick allows you to assign the output of a shell command to a variable.
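For example (the date format strings are illustrative):

```shell
today=`date +%y%m%d`        # backticks capture the command's output
echo "Report for day $today"
now=$(date +%H:%M)          # the modern $() form does the same thing
echo "Generated at $now"
```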

3. Redirect I/O

>: output redirect
>>: output redirect append data
<: input redirect
<<: inline input redirect

The inline input redirection symbol is the double less-than symbol (<<). Besides this symbol, you must specify a text marker that delineates the beginning and end of the data used for input. You can use any string value for the text marker, but it must be the same at the beginning of the data and the end of the data.
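A sketch using wc to count the lines of an inline document (EOF is an arbitrary marker):

```shell
wc -l << EOF
first line
second line
third line
EOF
```

Here wc -l prints 3, the number of lines between the markers.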

4. Math Expression

The expr command allows the processing of equations from the command line. Note that the spaces around the operator are necessary. The escape character (backslash) is used to identify any characters that may be misinterpreted by the shell before being passed to the expr command.

Bash also provides a much easier way of performing mathematical equations. In bash, when assigning a mathematical value to a variable, you can enclose the mathematical equation using a dollar sign and square brackets ($[ operation ]).

The bash shell mathematical operators support only integer arithmetic. For floating point, the most popular solution uses the command-line calculator bc.
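Illustrative versions of all three approaches (the values are my own):

```shell
# expr: spaces are required and * must be escaped
expr 5 \* 2                    # prints 10

# bash arithmetic: legacy $[ ] or the preferred $(( ))
var=$[ 5 * 2 ]
echo $(( var + 1 ))            # prints 11

# floating point via bc (guarded in case bc is not installed)
if command -v bc > /dev/null; then
    echo "scale=2; 3.44 / 5" | bc    # prints .68
fi
```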

5. Structured Commands

5.1 if/else

The bash shell if statement runs the command defined on the if line. If the exit status of the command is zero (the command completed successfully), the commands listed under the then section are executed. If the exit status of the command is anything else, the then commands aren’t executed, and the bash shell moves on to the next command in the script.
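A sketch: date exits with status 0, so the then branch runs:

```shell
if date > /dev/null; then
    echo "the command worked"
else
    echo "the command failed"
fi
```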

5.2 test

The test command provides a way to test different conditions in an if-then statement. If the condition listed in the test command evaluates to true, the test command exits with a zero exit status code, making the if-then statement behave in much the same way that if-then statements work in other programming languages. If the condition is false, the test command exits with a 1, which causes the if-then statement to fail.

*) Numeric Comparisons
Comparison Description
n1 -eq n2 Check if n1 is equal to n2.
n1 -ge n2 Check if n1 is greater than or equal to n2.
n1 -gt n2 Check if n1 is greater than n2.
n1 -le n2 Check if n1 is less than or equal to n2.
n1 -lt n2 Check if n1 is less than n2.
n1 -ne n2 Check if n1 is not equal to n2.
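The comparisons above in action (values are illustrative):

```shell
value1=10
value2=11
if [ $value1 -gt 5 ]; then
    echo "$value1 is greater than 5"
fi
if [ $value1 -eq $value2 ]; then
    echo "the values are equal"
else
    echo "the values are different"
fi
```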

However, the test command is not able to handle floating-point values.
You may also notice the use of double parentheses. They provide advanced mathematical formulas for comparisons, and no escaping is needed inside them:

Symbol   Description
val++    Post-increment
val--    Post-decrement
++val    Pre-increment
--val    Pre-decrement
!        Logical negation
~        Bitwise negation
**       Exponentiation
<<       Left bitwise shift
>>       Right bitwise shift
&        Bitwise Boolean AND
|        Bitwise Boolean OR
&&       Logical AND
||       Logical OR
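A sketch using double parentheses (the values are my own):

```shell
val1=10
if (( val1 ** 2 > 90 )); then
    (( val2 = val1 ** 2 ))     # no escaping needed inside (( ))
    echo "the square of $val1 is $val2"
fi
```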
*) String Comparisons
Comparison Description
str1 = str2 Check if str1 is the same as string str2.
str1 != str2 Check if str1 is not the same as str2.
str1 < str2 Check if str1 is less than str2.
str1 > str2 Check if str1 is greater than str2.
-n str1 Check if str1 has a length greater than zero.
-z str1 Check if str1 has a length of zero.
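The string comparisons in action (my own values; note the escaped <):

```shell
str1="baseball"
str2="hockey"
if [ $str1 \< $str2 ]; then    # < must be escaped or it becomes a redirect
    echo "$str1 sorts before $str2"
fi
if [ -n "$str1" ]; then
    echo "str1 is not empty"
fi
```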

Trying to determine if one string is less than or greater than another is where things start getting tricky. There are two problems that often plague shell programmers when trying to use the greater-than or less-than features of the test command:
– The greater-than and less-than symbols must be escaped, or the shell will use them as redirection symbols, with the string values as filenames.
– The greater-than and less-than order is not the same as that used with the sort command.

The double-bracketed expression uses the standard string comparison of the test command. However, it provides an additional feature the test command doesn’t: pattern matching. And no escaping is needed anymore.
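A pattern-matching sketch (the hostname value is illustrative):

```shell
name="webserver1"
if [[ $name == web* ]]; then   # glob pattern, no escaping required
    echo "$name looks like a web server"
fi
```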

Capitalized letters are treated as less than lowercase letters in the test command. However, when you put the same strings in a file and use the sort command, the lowercase letters appear first. This is due to the ordering technique each command uses. The test command uses standard ASCII ordering, using each character’s ASCII numeric value to determine the sort order. The sort command uses the sorting order defined for the system locale language settings. For the English language, the locale settings specify that lowercase letters appear before uppercase letters in sorted order.

However, the BashFAQ says: as of bash 4.1, string comparisons using < or > respect the current locale when done in [[, but not in [ or test. In fact, [ and test have never used locale collating order, even though past man pages said they did. Bash versions prior to 4.1 do not use locale collating order for [[ either. So you get opposite results when running on CentOS-5.7 (bash 3.2) and Ubuntu-10.04 (bash 4.1) with the [[ operator, and bash 4.1 is now consistent with the sort command.

5.3 case

Well, this is easy; just walk through the snippet:
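A representative snippet (the usernames are illustrative):

```shell
user="rich"
case $user in
    rich | barbara)
        echo "Welcome, $user";;
    testing)
        echo "This is the testing account";;
    *)
        echo "Sorry, you are not allowed here";;
esac
# prints: Welcome, rich
```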

All sample code is tested under CentOS-5.7 and Ubuntu-10.04.