Projects :: PXE

When running clusters where all servers perform exactly the same task, one might want to keep the same configuration on all servers and pull data from a shared source such as an NFS server or a webserver. Intel has created a specific BIOS that they include in their Ethernet Express desktop and server network adapters, called PILAs. This BIOS is called the Preboot Execution Environment, or PXE for short. PXE can use BOOTP to retrieve IP information from a network server, and then retrieve and execute a file using TFTP.

We are aiming at booting either FreeBSD or Linux. To do that, we need a bootloader (like lilo or grub) which can fetch a kernel and an initial ramdisk (initrd) via TFTP. FreeBSD has this bootloader by default; for Linux you can use Syslinux to get the job done. I hear grub also has such functionality, but I did not succeed in using it.

In your network, all you need is a DHCP server, a TFTP bootserver and an RSYNC repository. All of these are readily available for just about any Unix-flavored operating system.

I have created an environment which lets the PXE network card interact with a bootserver in order to boot and run an entirely diskless server. I made a standard initrd which contains enough utilities to feel like a real server, but without all of the bloat and cruft you normally find on your box. We can fit an entire system in approximately 21MB of space, an ideal size for a PXE booted server. To create a specific server using this standard initrd, we can have it rsync files into the root filesystem.

FreeBSD 4 - Howto

This document is a step-by-step guide to building a PXE boot environment for large-scale FreeBSD deployments. The same thing is also possible for at least Linux, NetBSD and OpenBSD, but this document aims specifically at FreeBSD 4.x.

0. Prerequisites

You should have in-depth knowledge of your operating system. You need at least two servers. The first server, called bootserver, runs ISC DHCPd, has a TFTP server and runs RSYNC in server mode. We make several assumptions:

1. Enabling tftpd

1. FreeBSD's standard tftp server works just fine. Add:
tftp   dgram  udp   wait  root  /usr/libexec/tftpd   tftpd -l -s /usr/local/pxe/tftpboot/
to /etc/inetd.conf. Make sure your server starts inetd at boot by adding inetd_enable="YES" to /etc/rc.conf.
2. mkdir -p /usr/local/pxe/tftpboot/
3. Start /usr/sbin/inetd.
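
If you script this step, it helps to make it safe to run twice. The following is a minimal sketch, not part of the original setup: the function name add_tftp_line is made up for illustration, and the demonstration works on a scratch file instead of the live /etc/inetd.conf.

```shell
#!/bin/sh
# Sketch: add the tftp service line to an inetd.conf-style file only once.
# The line is the one from step 1; the function name is hypothetical.
TFTP_LINE='tftp   dgram  udp   wait  root  /usr/libexec/tftpd   tftpd -l -s /usr/local/pxe/tftpboot/'

add_tftp_line() {
    conf="$1"
    # -F: fixed string, -x: match the whole line, -q: no output
    if grep -Fxq -e "$TFTP_LINE" "$conf" 2>/dev/null; then
        echo "already present in $conf"
    else
        printf '%s\n' "$TFTP_LINE" >> "$conf"
        echo "added to $conf"
    fi
}

# Demonstration on a scratch file rather than the live /etc/inetd.conf.
scratch=$(mktemp)
add_tftp_line "$scratch"
add_tftp_line "$scratch"
grep -c tftpd "$scratch"    # counts matching lines in the scratch file
rm -f "$scratch"
```

To use it for real, call add_tftp_line /etc/inetd.conf as root and then start or restart inetd.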

3. Building FreeBSD's PXE bootloader

1. In /etc/make.conf, add LOADER_TFTP_SUPPORT=YES. This enables pxeboot(8) to retrieve the kernel via TFTP; its default is NFS.
2. cd /usr/src/sys/boot; make clean; make (this builds pxeboot(8) with TFTP support)
3. cp /usr/obj/usr/src/sys/boot/i386/pxeldr/pxeboot /usr/local/pxe/tftpboot/

4. Configuring DHCP

In /usr/local/pxe/dhcpd.conf, add:
option rsync-path code 194 = text;

host pxe-test {
  hardware ethernet 00:02:b3:4c:ed:c8;
  option domain-name "";
  option host-name "pxe-test";
  filename "pxeboot";
  option rsync-path "";
}
You now have a DHCP server which will offer the PXE server the pxeboot(8) loader you just created in step 3.
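
With more than a handful of PXE clients, typing host blocks by hand gets tedious. Here is a rough sketch of a generator, assuming you keep name, MAC address, domain and rsync-path per client somewhere. The function name pxe_host_stanza is hypothetical, and the domain and rsync-path values in the demonstration call are made-up placeholders, not values from my setup.

```shell
#!/bin/sh
# Sketch: print one dhcpd.conf host block per PXE client.
# Arguments: host name, MAC address, domain name, rsync-path value.
# The domain and rsync-path values are site-specific; the ones used in the
# demonstration call below are placeholders.
pxe_host_stanza() {
    name="$1"; mac="$2"; domain="$3"; rsync_path="$4"
    cat <<EOF
host $name {
  hardware ethernet $mac;
  option domain-name "$domain";
  option host-name "$name";
  filename "pxeboot";
  option rsync-path "$rsync_path";
}
EOF
}

pxe_host_stanza pxe-test 00:02:b3:4c:ed:c8 example.org 'PLACEHOLDER-RSYNC-PATH'
```

Redirect the output into a file that dhcpd.conf includes, and remember that the "option rsync-path code 194 = text;" declaration must appear before the first host block that uses it.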

5. Create an RSYNC module called 'test'

1. Install RSYNC; pkg_add -vr rsync
2. In /usr/local/pxe/rsyncd.conf, add:
use chroot = yes
max connections = 16
pid file = /var/run/
gid = wheel
uid = root

[test]
path = /usr/local/pxe/rsync/test/
comment = PXE root repository for
read only = yes
3. Create the module directory; mkdir -p /usr/local/pxe/rsync/test/
4. Populate the RSYNC module with files you want on your PXE server. In particular, make sure you have the following files: /etc/rc.d/rc.inet1, /etc/rc.d/rc.inet2 and /etc/rc.d/rc.local. Note that these files follow the Slackware filename scheme.
5. Start RSYNC with the configuration file from step 2: /usr/local/bin/rsync -4 --daemon --config=/usr/local/pxe/rsyncd.conf
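
Before pointing a client at the module, it is worth checking that the three startup scripts from step 4 are really there. This is a small sketch of such a check; the function name check_module is my own invention for this example.

```shell
#!/bin/sh
# Sketch: report which of the required startup scripts are missing from an
# rsync module directory. Prints one line per missing file and returns
# non-zero if any of them is absent.
check_module() {
    module_dir="$1"
    missing=0
    for f in etc/rc.d/rc.inet1 etc/rc.d/rc.inet2 etc/rc.d/rc.local; do
        if [ ! -f "$module_dir/$f" ]; then
            echo "missing: /$f"
            missing=1
        fi
    done
    return "$missing"
}

# Example: check the module directory created in step 3.
check_module /usr/local/pxe/rsync/test || echo "module is incomplete"
```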

I often run services in DJB daemontools, to ensure that the software gets restarted in the event of a server or software crash. Your mileage may vary.

6. Build a PXE image:

1. cvs checkout pxe
2. cd pxe
3. ./
4. gzip -c initrd > /usr/local/pxe/tftpboot/
I keep all sorts of sizes lying around. The image is currently some 22MB in size, but sometimes I need quite a lot of extra stuff onboard, so I make varying filesystem sizes. As you can see, making larger ramdisks does not take much more space once you gzip them:
$ ls -l /usr/local/pxe/tftpboot/initrd*.gz
-rw-r--r-- 1 root wheel 7871137 Feb 9 13:28 initrd_4.10-p5-128M.gz
-rw-r--r-- 1 root wheel 7704303 Jan 7 16:25 initrd_4.10-p5-32M.gz
-rw-r--r-- 1 root wheel 7714334 Jan 5 10:30 initrd_4.10-p5-40M.gz
-rw-r--r-- 1 root wheel 7726349 Jan 5 10:30 initrd_4.10-p5-48M.gz
-rw-r--r-- 1 root wheel 7765873 Feb 11 13:15 initrd_4.10-p5-64M.gz
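
To put numbers on that, here is a quick bit of shell arithmetic on the sizes from the listing. It assumes the 32M and 128M in the filenames are the uncompressed ramdisk sizes and that 1M means 1048576 bytes.

```shell
#!/bin/sh
# Sketch: compare gzipped image sizes from the listing above.
# Byte counts are copied from the ls output; 1M is taken as 1048576 bytes.
small_gz=7704303     # initrd_4.10-p5-32M.gz
large_gz=7871137     # initrd_4.10-p5-128M.gz

extra=$((large_gz - small_gz))       # cost of the larger ramdisk on the wire
raw_large=$((128 * 1048576))         # assumed uncompressed size of the 128M image
ratio=$((raw_large / large_gz))      # integer compression ratio

echo "extra bytes for 128M over 32M: $extra"
echo "128M image compresses roughly ${ratio}:1"
```

The difference comes to 166834 bytes, so quadrupling the ramdisk costs well under 200KB of extra TFTP transfer.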

7. Build a kernel for the PXE server

1. Edit your kernel configuration in /usr/src/sys/i386/conf/PXE. Strip IDE, SCSI and other stuff you don't have from it. GENERIC will do fine, by the way.
2. cd /usr/src; make buildkernel KERNCONF=PXE
3. Copy it into place (gzipping it): gzip -c /usr/src/sys/i386/compile/PXE/kernel > /usr/local/pxe/tftpboot/
4. mkdir -p /usr/local/pxe/tftpboot/boot
5. cp /boot/loader /boot/boot[12] /usr/local/pxe/tftpboot/boot
6. Create /usr/local/pxe/tftpboot/boot/loader.rc with these contents:
echo Configuring for IP ${boot.netif.ip}
echo Loading kernel...
load ${boot.netif.ip}/kernel
echo Loading root filesystem...
load -t mfs_root ${boot.netif.ip}/initrd
echo Booting...
set vfs.root.mountfrom="ufs:/dev/md0c"
When the PXE BIOS receives the DHCP reply, that reply will contain a filename to load. It will load pxeboot(8) via TFTP, which in turn will load /boot/loader and /boot/loader.rc from the TFTP server. The file loader.rc will then have the pxeboot(8) loader fetch / and then / It then instructs the pxeboot loader to try to find its root filesystem from the initrd. Note that gzipping the files saves a lot of TFTP'ing; the pxeboot(8) loader supports unzipping and will always attempt to find .gz files first.

8. Test!

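
Before powering on the client, a quick pre-flight check of the TFTP root can save a trip to the console. The sketch below mimics the lookup order described above (.gz first, then the plain name). It assumes that ${boot.netif.ip} in loader.rc expands to the client's IP address, so that kernel and initrd live in a per-IP directory. The function name find_boot_file and the address 192.0.2.10 are placeholders for illustration only.

```shell
#!/bin/sh
# Sketch: look for a file in the TFTP root in the order this howto describes
# for the loader: the .gz name first, then the plain name.
# Prints the path that would be served, or returns 1 if neither exists.
find_boot_file() {
    root="$1"; name="$2"
    if [ -f "$root/$name.gz" ]; then
        echo "$root/$name.gz"
    elif [ -f "$root/$name" ]; then
        echo "$root/$name"
    else
        return 1
    fi
}

tftproot=/usr/local/pxe/tftpboot
client_ip=192.0.2.10    # placeholder address, replace with your client's IP

# Files fetched by their exact name.
for f in pxeboot boot/loader boot/loader.rc; do
    [ -f "$tftproot/$f" ] || echo "not staged: $f"
done

# Files loaded from loader.rc, where a gzipped copy is preferred.
for f in "$client_ip/kernel" "$client_ip/initrd"; do
    find_boot_file "$tftproot" "$f" || echo "not staged: $f"
done
```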
Basically, the following will happen when you turn on your PXE server:
DHCP Server config:
We define a specific option (number 194) which we declare to be a string. In DHCP terms, this results in the following configuration line somewhere at the top of your dhcpd.conf:
option rsync-path code 194 = text;
host pxe-test {
  hardware ethernet 00:01:80:57:5A:40;
  option domain-name "";
  option host-name "pxe-test";
  filename "pxeboot";
  option rsync-path "";
}
The clause above looks mostly like a normal static host declaration. We include host-name, domain-name and a specific BOOTP option called next-server. The PXE BIOS will use the next-server IP address to fetch the file named by filename via TFTP, and will then start executing that file.
dhclient configuration in the image
The client side is contained within the PXE bootimage (initrd.gz). In the image, we have /etc/dhclient.conf, which also defines the rsync-path dhcp option and tells the dhclient program to request this information from the server:
option rsync-path code 194 = text; 
request host-name, domain-name, ntp-servers, rsync-path;
With these two lines in /etc/dhclient.conf, your PXE server will know how to find its post-boot files, using the line 'option rsync-path "";' in the bootserver's dhcpd.conf.
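
How the image consumes this option after boot might look like the sketch below. It is an illustration under assumptions, not the script from my image: it assumes dhclient-script exposes the option as the environment variable new_rsync_path (the ISC convention of a "new_" prefix with dashes turned into underscores), and that the value can be handed to rsync as a source. The command is only printed, never executed, and the function name rsync_command is hypothetical.

```shell
#!/bin/sh
# Sketch: build the rsync command a boot script might run once dhclient has
# a lease. Assumes option rsync-path arrives as $new_rsync_path.
# The command is printed, not executed.
rsync_command() {
    if [ -z "${new_rsync_path:-}" ]; then
        echo "no rsync-path in lease; leaving initrd as-is" >&2
        return 1
    fi
    # -a copies recursively and preserves permissions, ownership and
    # timestamps; the destination is the root filesystem.
    echo "rsync -a ${new_rsync_path} /"
}

# Placeholder value; the real one comes from the bootserver's dhcpd.conf.
new_rsync_path='PLACEHOLDER-RSYNC-PATH'
rsync_command
```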

The initrd aims to create an environment which is the same regardless of operating system choice. To that end, I shuffled binary locations around a bit, mostly resembling the FreeBSD filesystem layout. We have the following binaries:
/bin: df ln rm echo ls rmdir cat ed mkdir chmod expr mv sleep
cp grep ping stty csh hostname ping6 sync date kill ps test dd
ksh pwd bash

/sbin: dhclient init mount_nfs route dhclient-script mount_procfs
rtsol dmesg mountd swapon fdisk ldconfig newfs sysctl fsck md5 tunefs
halt mknod nologin umount ifconfig mount swapctl mount_mfs reboot

/usr/bin: awk bitkeys chflags crontab cut du env envdir envuidgid fghack
find fstat ftp grep gzip id less logger login more multilog netstat
nfsstat rpcinfo rsync scp sed setlock setuidgid sftp showmount
softlimit ssh ssh-add ssh-agent ssh-keygen ssh-keyscan supervise svc
svok svscan svstat systat tai64n tai64nlocal tail tar telnet top touch
uname uptime vi view vmstat w wc ex wget who passwd su which sort
uniq printf cmp head xargs egrep fgrep zegrep zgrep compress gunzip
gzcat uncompress zcat ex view nawk slogin

/usr/sbin: arp chown cron dev_mkdb inetd iostat ntpd ntpdate
pwd_mkdb rtsold sshd syslogd tcpdump traceroute traceroute6 vipw

I maintain a CVS repository containing a shell script which tries to create an initrd with the proper files in their proper locations. Of course different operating systems (and different distributions based upon the Linux kernel) tend to have binaries in different places. If a binary is not present (e.g. swapctl on Linux), it is not copied to the image.

/etc is sparsely populated. We only have critical files such as:
dhclient.conf  ntp.conf   protocols    services   ssh     
crontab hosts profile rc.d shells
termcap fstab group master.passwd
rc.firewall ttys auth.conf gettytab

login.conf rc rc.firewall6
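
The "copy it only if it exists" behaviour of the image build script can be sketched roughly as follows. This is not the script from the CVS repository; the function name copy_if_present and its argument order are invented for this example.

```shell
#!/bin/sh
# Sketch: copy a binary into the image tree if it exists in any of the
# candidate source directories; otherwise skip it with a note on stderr.
# Usage: copy_if_present <image-root> <target-dir> <name> <source-dir>...
copy_if_present() {
    image="$1"; target="$2"; name="$3"
    shift 3
    for dir in "$@"; do
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            mkdir -p "$image/$target"
            cp -p "$dir/$name" "$image/$target/$name"
            return 0
        fi
    done
    echo "skipping $name (not found)" >&2
    return 0
}

# Demonstration with a scratch image root: swapctl ends up in sbin only on
# systems that ship it, ls ends up in bin wherever it was found.
image=$(mktemp -d)
copy_if_present "$image" sbin swapctl /sbin /usr/sbin
copy_if_present "$image" bin ls /bin /usr/bin
ls -R "$image"
rm -rf "$image"
```

Returning success for a missing binary is deliberate here: the build keeps going and the image simply lacks that tool.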
Note that the root user has a preset password, known only to me, so it is a good idea to override the configuration files in the initrd image with local ones from your RSYNC repository.
Disclaimer: This page was created for your convenience. I do not offer support on PXE servers at all. If these pages aren't enough to get you going, please use Google or other means of information. Do not bother me with questions. I know for certain this stuff works, BIT runs at least 15 production servers with this PXE image.