Differences

This shows you the differences between two versions of the page.

soc:2009:pravin:journal:fedora11bko [2009/07/04 16:05]
less1
soc:2009:pravin:journal:fedora11bko [2009/07/27 05:35] (current)
less1
Line 44:
 I initially suspected the function ''do_live_overlay'', but it is clean,
 so I am now concentrating on its parent function, ''do_live_from_base_loop''.
 +<​code>​
 +mount -n -o ro,remount /dev/mapper/live-rw /sysroot
 +</​code> ​
 +produces the JBD-related errors shown above. They are not fatal, though: booting continues past this point.
 +It seems the problem might instead be in the ''run-init'' script.
 +
 +
 +===== solution =====
 +Finally, the problem is solved: there was one more reference to ''plymouth'' that I had to comment out.
 +Most of the plymouth references were optional, meaning that even if the command fails, execution does not stop:
 +<​code>​
 +plymouth --show-splash || :
 +</​code>​
 +But there was one particular reference that was not made optional. I don't know whether that is intentional or an error.
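A minimal sketch of why the ''|| :'' suffix makes a command optional. The function ''optional_step'' is a hypothetical stand-in for a call such as the plymouth one above, and the use of ''set -e'' is an assumption made only to show the effect; the real init script may handle failures differently.

```shell
#!/bin/sh
# Hypothetical stand-in for a helper that may fail or be missing.
optional_step() {
    return 1
}

set -e                  # abort the script on the first unhandled failure
optional_step || :      # the failure is swallowed: ':' always returns 0
optional_status=$?
echo "optional step status: $optional_status"
```

Without the ''|| :'' suffix, the same call would end the script under ''set -e'', which is the kind of behavior a non-optional reference can trigger.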
 +
 +Maybe the problem is in the ''run-init'' script.
 +
 +===== Next problem =====
 +Graphical mode and runlevel 3 are not booting. There is some problem in the sendmail daemon: it causes a segmentation fault, which freezes execution and ends in a kernel panic. I will try disabling sendmail to see if that helps.\\
 +Here is the error:
 +<​code>​
 +Starting Bluetooth services:                                 [ OK ]
 +EXT4-fs error (device dm-0): ext4_find_entry: reading #13191 offset 0
 +/etc/rc.d/rc : line 100 : /etc/rc3.d/S80sendmail: Input/output error
 +</​code>​
 +and so on.
 +
 +=== solution ===
 +I disabled sendmail and tried again. It partially worked: there was some error that scrolled past before I could read it, but then I got a login prompt.
 +The problem is that when I press the Enter key, it is taken as ''^M'', so I am not able to log in :-(
 +
 +=== attempt 2 ===
 +Disabled SELinux with ''selinux=0'' and tried again.\\
 +I removed sendmail and the other services related to S99, like firstboot and local,
 +but I am still getting an error:
 +<​code>​
 +EXT4-fs error (device dm-0): __ext4_get_inode_loc: unable to read inode block - inode=9642, block=827
 +</​code>​
 +
 +There is one warning regarding device dm-0 at boot time. I am not sure whether it is relevant to this error, but I will document it here anyway:
 +<​code>​
 +JBD: barrier-based sync failed on dm-0:8 - disabling barriers
 +</​code>​
 +
 +
 +The following are the logs from the apache2 server that was serving this ISO image.
 +<​code>​
 +$ cat /var/log/apache2/access.log | tail
 +192.168.0.1 - - [06/Jul/2009:23:56:27 +0200] "GET /Fedora-11-i686-Live.iso HTTP/1.1" 206 4096 "-" "-"
 +192.168.0.1 - - [06/Jul/2009:23:56:27 +0200] "GET /Fedora-11-i686-Live.iso HTTP/1.1" 206 16384 "-" "-"
 +192.168.0.1 - - [06/Jul/2009:23:56:27 +0200] "GET /Fedora-11-i686-Live.iso HTTP/1.1" 206 32768 "-" "-"
 +192.168.0.1 - - [06/Jul/2009:23:56:27 +0200] "GET /Fedora-11-i686-Live.iso HTTP/1.1" 206 65536 "-" "-"
 +192.168.0.1 - - [06/Jul/2009:23:56:27 +0200] "GET /Fedora-11-i686-Live.iso HTTP/1.1" 206 131072 "-" "-"
 +192.168.0.1 - - [06/Jul/2009:23:56:27 +0200] "GET /Fedora-11-i686-Live.iso HTTP/1.1" 206 131072 "-" "-"
 +192.168.0.1 - - [06/Jul/2009:23:56:27 +0200] "GET /Fedora-11-i686-Live.iso HTTP/1.1" 206 4096 "-" "-"
 +192.168.0.1 - - [06/Jul/2009:23:56:27 +0200] "GET /Fedora-11-i686-Live.iso HTTP/1.1" 206 16384 "-" "-"
 +192.168.0.1 - - [06/Jul/2009:23:56:27 +0200] "GET /Fedora-11-i686-Live.iso HTTP/1.1" 206 32768 "-" "-"
 +192.168.0.1 - - [06/Jul/2009:23:56:27 +0200] "GET /Fedora-11-i686-Live.iso HTTP/1.1" 206 65536 "-" "-"
 +</​code>​
 +I am not sure whether they are meaningful, but I have included them for reference.
 +
 +===== Attempt-3 =====
 +I wanted to remove ''/etc/rc3.d/S80sendmail'', but I am not able to delete it: it says the filesystem is read-only.
 +But when I run the mount command, it shows the filesystem in ''rw'' mode. Following are the commands I tried.
 +<​code>​
 +# mount
 +/dev/root on / type ext4 (rw,noatime)
 +proc on /proc type proc (rw)
 +/sys on /sys type sysfs (rw)
 +udev on /dev type tmpfs (rw,mode=0755)
 +/dev/pts on /dev/pts type devpts (rw,gid=5,mode=620)
 +
 +# rm /sysroot/etc/rc3.d/S80sendmail
 +rm: cannot remove '/sysroot/etc/rc3.d/S80sendmail' : Read-only file system
 +
 +# mount / -o remount,rw
 +
 +# rm /sysroot/etc/rc3.d/S80sendmail
 +rm: cannot remove '/sysroot/etc/rc3.d/S80sendmail' : Read-only file system
 +
 +</​code>​
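One thing worth checking here, as an untested guess rather than a confirmed diagnosis: the ''rm'' targets ''/sysroot'', while the remount was applied to ''/'', and the ''mount'' listing may not reflect what the kernel actually has mounted. The sketch below reads the kernel's own table (''/proc/mounts'' by default) for a given mount point. The helper names ''mount_opts'' and ''is_readonly'' are hypothetical.

```shell
#!/bin/sh
# Print the mount options reported for a mount point.
# Usage: mount_opts MOUNTPOINT [MOUNTS_FILE]
# The last matching entry wins, since later mounts shadow earlier ones.
mount_opts() {
    awk -v mp="$1" '$2 == mp { opts = $4 } END { if (opts != "") print opts }' \
        "${2:-/proc/mounts}"
}

# Succeed if the option list of the mount point contains "ro".
is_readonly() {
    mount_opts "$1" "${2:-/proc/mounts}" | tr ',' '\n' | grep -qx ro
}
```

For example, ''is_readonly /sysroot && echo "still read-only"'' would show whether ''/sysroot'' itself is mounted read-only, regardless of what is listed for ''/''.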
 +
 +====== What is working? ​ ======
 +Single user mode is working fine, so the user gets a shell where they can do whatever they want.
 +
 +
 +====== Fedora 11 live over NFS ======
 +Trying to see if Fedora 11 live can boot over NFS.\\
 +The reasoning behind this experiment is that if it works over NFS, it may help in locating the problem.
 +
 +
 +==== Testing NFS setup ====
 +Exported ''/var/www'' (because it contains all the ISO images) over NFS. The following is an excerpt from ''/etc/exports'':
 +<​code>​
 +/var/www *(ro,async)
 +</​code>​
 +This NFS volume does get mounted properly on the local machine.
 +<​code>​
 +$ sudo mount 192.168.111.11:/var/www mpoint
 +$ mount
 +192.168.111.11:/var/www on /home/pravin/Etherboot/mpoint type nfs (rw,addr=192.168.111.11)
 +</​code>​
 +
 +==== Testing the NFS mount from virtualization ====
 +The Fedora 11 live CD was booted in VirtualBox. The network was working, and even the host machine was accessible: the URL http://192.168.111.11 correctly brought up the host website.\\
 +But the NFS mount failed with the following error.
 +<​code>​
 +# mount 192.168.111.11:/var/www /home/liveuser/mpoint/
 +mount.nfs: access denied by server while mounting 192.168.111.11:/var/www
 +</​code>​
 +Why would this fail if mounting from localhost is working fine?
 +
 +Got help from rwcr and fixed the problem.\\
 +It seems one more option, ''insecure'', has to be added to the export options. Restating the explanation given by rwcr:
 +<​code>​
 +rwcr: Try making it /var/www *(ro,async,insecure)
 +rwcr: Linux generally requires NFS requests to come from privileged ports, and the Fedora livecd might be using a nonstandard NFS mounter that doesn't do that.
 +</​code>​
 +
 +===== Next step : mount NFS partition from initramfs =====
 +Debian uses a special program called ''nfsmount'' for NFS mounting at boot time. I will try out both the mount command and the nfsmount utility.\\
 +Also, kernel modules will be needed.
 +The following modules and the executable ''/sbin/mount.nfs'' were needed for NFS to work:
 +   - sunrpc.ko
 +   - lockd.ko
 +   - auth_rpcgss.ko
 +   - nfs_acl.ko
 +   - nfs.ko
 +In addition to this, I had to pass the option ''-o nolock'' for mount to work without problems:
 +<​code>​
 +mount "${NFS_PATH}" /iso -o nolock
 +mount /iso/Fedora-11-i686-Live.iso /sysroot -o loop -o ro
 +</​code>​
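A rough sketch of loading the modules listed above from the initramfs. Plain ''insmod'' does not resolve dependencies, so the order matters; the order used here (''sunrpc'' first, ''nfs'' last) is my assumption about the dependency chain. The function only prints the commands (a dry run), and the module directory is a hypothetical path, not the one used in the actual initramfs.

```shell
#!/bin/sh
# Modules from the list above, ordered so that each module's assumed
# dependencies are inserted before it (sunrpc first, nfs last).
NFS_MODULES="sunrpc lockd auth_rpcgss nfs_acl nfs"

# Print the insmod commands instead of running them (dry run).
# The default module directory is a hypothetical initramfs location.
print_nfs_insmod_cmds() {
    module_dir="${1:-/lib/modules}"
    for module in $NFS_MODULES; do
        echo "insmod ${module_dir}/${module}.ko"
    done
}

print_nfs_insmod_cmds
```

Piping the output through ''sh'' inside the initramfs would actually load the modules, provided the ''.ko'' files were copied to that directory.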
 +
 +NFS works fine in runlevel 3. One can also start the GUI with ''startx'' after logging in as ''root'' from the multi-user prompt.\\
 +The only issue with NFS_Fedora is that ''plymouthd'' still causes a problem and is disabled, which somehow stops the GUI from coming up automatically.\\
 +So, the user needs to log in at runlevel 3 and then run ''startx''.
 +
 +
 +==== HTTPFS improvement ====
 +Some progress has been made on the HTTPFS front as well. Until now, all the tests of booting over HTTPFS were done using ''qemu'', which is inherently slow because it is emulation.
 +When the same tests were run on VMware, which is much faster, errors started appearing after the runlevel 3 login prompt.\\
 +Following are the errors, which are much the same as those that used to appear around the sendmail daemon before.
 +
 +<​code>​
 +# startx
 +-bash: /usr/bin/startx: Input/output error
 +
 +# top
 +top : error while loading shared libraries: /lib/libncursesw.so.5: cannot read file data: Input/output error
 +
 +EXT4-fs error (device dm-0): ext4_find_entry: reading directory #10651 offset 0
 +</​code>​
 +This observation suggests that the errors are thrown because of a delay in the response from fuse, and that ext4 gives up because of this delay.\\
 +Now, I need to find a way to increase the tolerance for this delay.
 +
 +
 +===== Removing plymouth =====
 +As marc suggested, remove ''plymouth'' from the original ISO and see if it works without plymouth.
 +If it doesn't, then the blame can surely be put on ''plymouth'' and not on network-related complications.
 +
 +==== Modifying ISO ====
 +Now the question is how to add a new initramfs to the ISO and still keep it bootable.\\
 +From the Knoppix Remastering Howto, the following is the command that works for Knoppix:
 +<​code>​
 +mkisofs -pad -l -r -J -v -V "KNOPPIX" -no-emul-boot -boot-load-size 4 \
 +   -boot-info-table -b boot/isolinux/isolinux.bin -c boot/isolinux/boot.cat \
 +   -hide-rr-moved -o /mnt/hda1/knx/knoppix.iso /mnt/hda1/knx/master
 +</​code>​
 +
 +I need to modify it so that it works for Fedora:
 +
 +<​code>​
 +mkisofs -pad -l -r -J -v -V "Fedora-11-i686-Live" -no-emul-boot -boot-load-size 4 \
 +   -boot-info-table -b isolinux/isolinux.bin -c isolinux/boot.cat \
 +   -hide-rr-moved -o /var/www/iso/fedora_11.iso /home/pravin/Etherboot/git/BKO.git/pxeknife/red_hat/fedora_11_live_cd/newfedora
 +</​code>​
 +===== running startx from single user mode =====
 +Tried an experiment of running ''startx'' from single user mode to see if it works.\\
 +Well, it did not work at all.
 +
 +===== Problem Found =====
 +With the help of andyTim, the cause of the problem has been located.\\
 +The ''network'' and ''NetworkManager'' services restart networking, which breaks the existing HTTPFS mount.
 +
 +
 +===== Solution =====
 +The temporary solution tried is to delete both of the following files:
 +  - ''/​etc/​init.d/​network''​
 +  - ''/​etc/​init.d/​NetworkManager''​
 +So, the user first has to boot into single user mode, delete the above files,
 +and then boot into runlevel 5.
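The manual step above could be sketched as a small helper. The function name ''remove_network_scripts'' and its root-directory argument are hypothetical, introduced only for illustration; from single user mode the root would simply be ''/''.

```shell
#!/bin/sh
# Remove the two init scripts that restart networking.
# ROOT_DIR is a hypothetical parameter: the top of the filesystem tree
# that holds etc/init.d (use "/" when run from single user mode).
remove_network_scripts() {
    root="${1:?usage: remove_network_scripts ROOT_DIR}"
    for name in network NetworkManager; do
        script="${root}/etc/init.d/${name}"
        if [ -e "$script" ]; then
            rm -f "$script" && echo "removed $script"
        else
            echo "not present: $script"
        fi
    done
}
```

Deleting the scripts is a blunt workaround; it only keeps the existing HTTPFS mount alive by making sure nothing restarts the network during boot.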
 +
  
