====== Booting Fedora with BKO ======

===== Where I am getting stuck =====

The following messages are shown before the boot gets stuck:
<code>
Mounting proc filesystem
Mounting sysfs filesystem
Creating initial /dev
Running plymouthd
udev: starting version 141
</code>
It hangs after this point. I have a bad feeling that it does not even reach the step where the /init script is executed.
<code>
squashfs: version 4.0 (2009/01/31) Phillip Lougher
</code>
Found the problem: ''plymouth'' is not working properly. I don't know the reason yet, but for the time being I have commented out the line which starts plymouth, and the boot progresses past that point.

===== Mounting iso over httpfs =====

Quite a few things are missing from the Fedora initrd, so I am adding them as I run into the errors related to them.
  - Added ''ifconfig'' and ''route'' as soft links to ''busybox''. (Note: I am using the Knoppix busybox here, not the Fedora busybox, because the latter is not statically linked.)
  - Added the fuse module, which is missing. I had assumed fuse would be present in the kernel by default, but that assumption may be wrong.

Finally, the iso is mounted over httpfs.

===== Next set of problems =====

<code>
/dev/root: error opening volume
/dev/root: error opening volume
JBD: barrier-based sync failed on dm-0:8 - disabling barriers
transfering control to /sbin/init
Bug in initramfs /init detected. Dropping to a shell. Good luck!
</code>
It seems Fedora wants something to be set for ''/dev/root''. The other thing I need to find out is what the ''JBD'' error about ''barrier-based sync failed'' means.

==== /dev/root ====

''/dev/root: error opening volume'' is handled: I looked for the various places where ''/dev/root'' is used and wrapped each one in ''if [ -z "${HTTPFS}" ]''. \\
Now I need to find out what the JBD problem is.

==== JBD problem ====

I initially suspected the function ''do_live_overlay'', but it is clean. Now I am concentrating on its parent function, ''do_live_from_base_loop''.
<code>
mount -n -o ro,remount /dev/mapper/live-rw /sysroot
</code>
This command produces the JBD errors above. They are not fatal, though; execution continues afterwards. There may still be problems in the ''run-init'' script.

===== solution =====

Finally, the problem is solved: there was one more reference to ''plymouth'' that I had to comment out. Most of the plymouth references were optional, meaning that even if the command fails, execution does not stop:
<code>
plymouth --show-splash || :
</code>
But there was one particular reference which was not made optional. I don't know whether that is intentional or an error. The problem may be in the ''run-init'' script.

===== Next problem =====

Graphical mode and run level 3 are not booting. There is some problem in the sendmail daemon: it causes a segmentation fault, which freezes execution, and I get a kernel panic. I will try disabling sendmail to see if that works.\\
Here is the error:
<code>
Starting Bluetooth services: [ OK ]
EXT4-fs error (device dm-0): ext4_find_entry: reading #13191 offset 0
/etc/rc.d/rc : line 100 : /etc/rc3.d/S80sendmail: Input/output error
</code>
and so on.

=== solution ===

I disabled sendmail and tried again. It partially worked. There was some error which I could not see because it scrolled past, but then it gave a login prompt.
The problem is that when I press the Enter key it is taken as ''^M'', so I am not able to log in :-(

=== attempt 2 ===

Disabled SELinux with ''selinux=0'' and tried again.\\
I removed sendmail and the other services related to S99, such as firstboot and local, but I am still getting an error:
<code>
EXT4-fs error (device dm-0): __ext4_get_inode_loc: unable to read inode block - inode=9642, block=827
</code>
There is one warning regarding device dm-0 at boot time. I am not sure whether it is relevant to this error, but I will document it here anyway:
<code>
JBD: barrier-based sync failed on dm-0:8 - disabling barriers
</code>
The following are the logs from the apache2 server which was serving this iso image:
<code>
$ cat /var/log/apache2/access.log | tail
192.168.0.1 - - [06/Jul/2009:23:56:27 +0200] "GET /Fedora-11-i686-Live.iso HTTP/1.1" 206 4096 "-" "-"
192.168.0.1 - - [06/Jul/2009:23:56:27 +0200] "GET /Fedora-11-i686-Live.iso HTTP/1.1" 206 16384 "-" "-"
192.168.0.1 - - [06/Jul/2009:23:56:27 +0200] "GET /Fedora-11-i686-Live.iso HTTP/1.1" 206 32768 "-" "-"
192.168.0.1 - - [06/Jul/2009:23:56:27 +0200] "GET /Fedora-11-i686-Live.iso HTTP/1.1" 206 65536 "-" "-"
192.168.0.1 - - [06/Jul/2009:23:56:27 +0200] "GET /Fedora-11-i686-Live.iso HTTP/1.1" 206 131072 "-" "-"
192.168.0.1 - - [06/Jul/2009:23:56:27 +0200] "GET /Fedora-11-i686-Live.iso HTTP/1.1" 206 131072 "-" "-"
192.168.0.1 - - [06/Jul/2009:23:56:27 +0200] "GET /Fedora-11-i686-Live.iso HTTP/1.1" 206 4096 "-" "-"
192.168.0.1 - - [06/Jul/2009:23:56:27 +0200] "GET /Fedora-11-i686-Live.iso HTTP/1.1" 206 16384 "-" "-"
192.168.0.1 - - [06/Jul/2009:23:56:27 +0200] "GET /Fedora-11-i686-Live.iso HTTP/1.1" 206 32768 "-" "-"
192.168.0.1 - - [06/Jul/2009:23:56:27 +0200] "GET /Fedora-11-i686-Live.iso HTTP/1.1" 206 65536 "-" "-"
</code>
I am not sure whether they will make any sense, but I have put them here for reference.

===== Attempt-3 =====

I wanted to remove ''/etc/rc3.d/S80sendmail'', but I am not able to delete it.
It says the filesystem is read-only, but when I run the mount command it shows the filesystem in ''rw'' mode. These are the commands I tried:
<code>
# mount
/dev/root on / type ext4 (rw,noatime)
proc on /proc type proc (rw)
/sys on /sys type sysfs (rw)
udev on /dev type tmpfs (rw,mode=0755)
/dev/pts on /dev/pts type devpts (rw,gid=5,mode=620)
# rm /sysroot/etc/rc3.d/S80sendmail
rm: cannot remove '/sysroot/etc/rc3.d/S80sendmail' : Read-only file system
# mount / -o remount,rw
# rm /sysroot/etc/rc3.d/S80sendmail
rm: cannot remove '/sysroot/etc/rc3.d/S80sendmail' : Read-only file system
</code>

====== What is working? ======

Single user mode is working fine, so the user gets a shell where they can do whatever they want.

====== Fedora 11 live over NFS ======

Trying to see whether Fedora 11 live can boot over NFS.\\
The reasoning behind this experiment is that if it works over NFS, that may help in locating the problem.

==== Testing NFS setup ====

Exported ''/var/www'' (because it contains all the ISO images) over NFS. The following is the excerpt from ''/etc/exports'':
<code>
/var/www *(ro,async)
</code>
This NFS volume does get mounted properly on the local machine.
<code>
$ sudo mount 192.168.111.11:/var/www mpoint
$ mount
192.168.111.11:/var/www on /home/pravin/Etherboot/mpoint type nfs (rw,addr=192.168.111.11)
</code>

==== Testing the NFS mount from virtualization ====

The Fedora 11 live CD was booted in VirtualBox. The network was working, and the host machine was accessible: the URL http://192.168.111.11 correctly reached the host's website.\\
But the NFS mount failed with the following error:
<code>
# mount 192.168.111.11:/var/www /home/liveuser/mpoint/
mount.nfs: access denied by server while mounting 192.168.111.11:/var/www
</code>
Why would this fail if mounting from localhost is working fine?
Got help from rwcr, and fixed the problem.\\
It seems one more option, ''insecure'', has to be added to the export options. Restating the explanation given by rwcr:
<code>
rwcr: Try making it /var/www *(ro,async,insecure)
rwcr: Linux generally requires NFS requests to come from privileged ports, and the Fedora livecd might be using a nonstandard NFS mounter that doesn't do that.
</code>

===== Next step : mount NFS partition from initramfs =====

Debian uses a special program called ''nfsmount'' for NFS mounting at boot time. I will try out both the mount command and the nfsmount utility.\\
Kernel modules are also needed. The executable ''/sbin/mount.nfs'' and the following modules were needed for NFS to work:
  - sunrpc.ko
  - lockd.ko
  - auth_rpcgss.ko
  - nfs_acl.ko
  - nfs.ko

In addition, I had to pass the option ''-o nolock'' for mount to work without problems:
<code>
mount "${NFS_PATH}" /iso -o nolock
mount /iso/Fedora-11-i686-Live.iso /sysroot -o loop -o ro
</code>
NFS works fine in run level 3. One can also start the GUI with ''startx'' after logging in as ''root'' from the multiuser prompt.\\
The only issue with NFS_Fedora is that ''plymouthd'' still causes a problem and is disabled, which somehow stops the GUI from coming up automatically.\\
So the user needs to log in at run level 3 and then run ''startx''.

==== HTTPFS improvement ====

Some progress has been made on the HTTPFS front as well. Until now, all the tests of booting over HTTPFS were done using ''qemu'', which is inherently slow as it is emulation. When the same tests were run on VMware, which is much faster, errors started appearing after the run level 3 login prompt.\\
The following errors are much the same as the ones that used to appear around the sendmail daemon before.
<code>
# startx
-bash: /usr/bin/startx: Input/output error
# top
top : error while loading shared libraries: /lib/libncursesw.so.5: cannot read file data: Input/output error
EXT4-fs error (device dm-0): ext4_find_entry: reading directory #10651 offset 0
</code>
From this observation, the likely explanation is that the errors are thrown because of the delay in responses from fuse; ext4 appears to give up because of this delay.\\
Now I need to find a way to increase the tolerance for this delay.

===== Removing plymouth =====

As marc suggested, remove ''plymouth'' from the original iso and see if it works without plymouth. If it doesn't, then the blame can surely be put on ''plymouth'' and not on the network-related complications. \\

==== Modifying ISO ====

Now the question is how to add a new initramfs to the ISO and still keep it bootable.\\
From the Remastering Knoppix Howto, the following is the command which works for Knoppix:
<code>
mkisofs -pad -l -r -J -v -V "KNOPPIX" -no-emul-boot -boot-load-size 4 \
  -boot-info-table -b boot/isolinux/isolinux.bin -c boot/isolinux/boot.cat \
  -hide-rr-moved -o /mnt/hda1/knx/knoppix.iso /mnt/hda1/knx/master
</code>
I need to modify it so that it works for Fedora:
<code>
mkisofs -pad -l -r -J -v -V "Fedora-11-i686-Live" -no-emul-boot -boot-load-size 4 \
  -boot-info-table -b isolinux/isolinux.bin -c isolinux/boot.cat \
  -hide-rr-moved -o /var/www/iso/fedora_11.iso \
  /home/pravin/Etherboot/git/BKO.git/pxeknife/red_hat/fedora_11_live_cd/newfedora
</code>

===== running startx from single user mode =====

Tried an experiment of running startx from single user mode to see if it works.
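
===== Sketch: wrapping the mkisofs call =====

The Fedora ''mkisofs'' command from the "Modifying ISO" section has to be rerun every time the initramfs changes, so a small wrapper may help. The following is only a minimal sketch, not part of BKO: the function name ''remaster_iso'', the ''DRY_RUN'' variable and the relative paths in the usage line are hypothetical, while the ''mkisofs'' flags are copied unchanged from the command above. By default it only prints the command so it can be reviewed before anything is written.

```shell
#!/bin/sh
# Sketch only: remaster_iso and DRY_RUN are illustrative names, not part of BKO.
# The mkisofs flags are copied unchanged from the Fedora command in "Modifying ISO".

remaster_iso() {
    src_tree=$1   # unpacked ISO tree that already contains the new initramfs
    out_iso=$2    # path of the ISO image to write
    label=$3      # volume label passed to -V

    # Rebuild the positional parameters as the full mkisofs command line.
    # The paths given to -b and -c are relative to src_tree.
    set -- mkisofs -pad -l -r -J -v -V "$label" \
        -no-emul-boot -boot-load-size 4 -boot-info-table \
        -b isolinux/isolinux.bin -c isolinux/boot.cat \
        -hide-rr-moved -o "$out_iso" "$src_tree"

    if [ "${DRY_RUN:-1}" = 1 ]; then
        # Default: only print the command so it can be reviewed first.
        echo "$*"
    else
        # Refuse to build an image whose tree has no boot loader in it.
        [ -f "$src_tree/isolinux/isolinux.bin" ] || {
            echo "missing $src_tree/isolinux/isolinux.bin" >&2
            return 1
        }
        "$@"
    fi
}

# Dry run with hypothetical relative paths.
remaster_iso ./newfedora ./fedora_11.iso "Fedora-11-i686-Live"
```

Setting ''DRY_RUN=0'' before the call runs ''mkisofs'' instead of printing the command.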