Differences

This shows you the differences between two versions of the page.

soc:2009:pravin:journal:fedora11bko [2009/07/04 15:53]
less1
soc:2009:pravin:journal:fedora11bko [2009/07/27 05:35] (current)
less1
Line 36: Line 36:
  
 ==== /dev/root ====
-''/dev/root: error opening volume'' is handled, now I need to find out what is that JBD problem
 +''/dev/root: error opening volume'' is handled: I looked for the various places where ''/dev/root'' is used and wrapped each one with ''if [ -z "${HTTPFS}" ]''. \\
 +Now I need to find out what the JBD problem is.
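A minimal sketch of that kind of guard, assuming ''HTTPFS'' is a shell variable that is non-empty only when booting over HTTPFS. The helper name ''unless_httpfs'' is made up for illustration and is not part of the real init script.

```shell
# Hypothetical helper (not in the original init script): run the given
# command only when HTTPFS is unset or empty, i.e. on a normal boot.
unless_httpfs() {
    if [ -z "${HTTPFS}" ]; then
        "$@"
    fi
}

# Example: this echo stands in for a step that touches /dev/root.
HTTPFS=""
unless_httpfs echo "using /dev/root"
```

Note that the spaces inside ''[ ... ]'' are required; ''[-z'' without the space is parsed as a command name and fails.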
 + 
 + 
 +==== JBD problem ==== 
 + 
 +I initially suspected the function ''do_live_overlay'', but it is clean.
 +Now I am concentrating on its parent function, ''do_live_from_base_loop''.
 +<​code>​ 
 +mount -n -o ro,remount /dev/mapper/live-rw /sysroot
 +</​code>​  
 +is giving the JBD-related errors above. These are not fatal, though: the boot continues after them.
 +But it seems there might be problems in the ''run-init'' script.
 + 
 + 
 +===== solution ===== 
 +Finally, the problem is solved: there was one more reference to ''plymouth'' that I had to comment out.
 +Most of the plymouth references were optional, meaning that even if the command fails, execution does not stop:
 +<​code>​ 
 +plymouth --show-splash || : 
 +</​code>​ 
 +but there was one particular reference that was not made optional. I don't know whether that is intentional or an error.
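A small sketch of why the ''|| :'' suffix makes a call optional. Here ''false'' stands in for a failing plymouth call, and the ''set -e'' is only there to show the effect in a strict script; whether the real init script runs with ''set -e'' is not something I have checked.

```shell
# ':' is the shell's no-op builtin and always succeeds, so 'cmd || :'
# never reports failure, even under 'set -e'.
result=$(
    set -e
    false || :            # stands in for an optional plymouth call
    echo "still running"
)
echo "$result"
```

A reference written without the ''|| :'' guard would let the failure propagate, which matches the behaviour seen with the one unguarded call.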
 + 
 +Maybe the problem is in the ''run-init'' script.
 + 
 +===== Next problem ===== 
 +The graphical mode and run level 3 are not booting. There is some problem in the sendmail daemon: it causes a segmentation fault, which freezes execution and ends in a kernel panic. I will try disabling sendmail to see if that works.\\
 +Here is the error:
 +<​code>​ 
 +Starting Bluetooth services: ​                                [ OK ] 
 +EXT-4-fs error (device dm-0): ext4_find_entry:​ reading #13191 offset 0 
 +/​etc/​rc.d/​rc : line 100 : /​etc/​rc3.d/​S80sendmail:​ Input/​output error 
 +</​code>​ 
 +and so on.... 
 + 
 +=== solution === 
 +I disabled sendmail and tried again. It partially worked: there was some error that I could not read because it scrolled past, but then I got a login prompt.
 +The problem is that when I press the Enter key it is taken as ''^M'', so I am not able to log in :-(
 + 
 +=== attempt 2 === 
 +Disabled SELinux with ''selinux=0'' and tried again.\\
 +I removed sendmail and the other services that were related to S99, like firstboot and local,
 +but I am still getting an error:
 +<​code>​ 
 +EXT4-fs error (device dm-0): __ext4_get_inode_loc:​ unable to read inode block - inode=9642, block=827 
 +</​code>​ 
 + 
 +There is one warning regarding device dm-0 at boot time. I am not sure whether it is relevant to this error, but I will document it here anyway:
 +<​code>​ 
 +JBD: barrier-based sync failed on dm-0:8 - disabling barriers
 +</​code>​ 
 + 
 + 
 +The following are the logs from the apache2 server that was serving this ISO image.
 +<​code>​ 
 +$ cat /​var/​log/​apache2/​access.log | tail 
 +192.168.0.1 - - [06/​Jul/​2009:​23:​56:​27 +0200] "GET /​Fedora-11-i686-Live.iso HTTP/​1.1"​ 206 4096 "​-"​ "​-"​ 
 +192.168.0.1 - - [06/​Jul/​2009:​23:​56:​27 +0200] "GET /​Fedora-11-i686-Live.iso HTTP/​1.1"​ 206 16384 "​-"​ "​-"​ 
 +192.168.0.1 - - [06/​Jul/​2009:​23:​56:​27 +0200] "GET /​Fedora-11-i686-Live.iso HTTP/​1.1"​ 206 32768 "​-"​ "​-"​ 
 +192.168.0.1 - - [06/​Jul/​2009:​23:​56:​27 +0200] "GET /​Fedora-11-i686-Live.iso HTTP/​1.1"​ 206 65536 "​-"​ "​-"​ 
 +192.168.0.1 - - [06/​Jul/​2009:​23:​56:​27 +0200] "GET /​Fedora-11-i686-Live.iso HTTP/​1.1"​ 206 131072 "​-"​ "​-"​ 
 +192.168.0.1 - - [06/​Jul/​2009:​23:​56:​27 +0200] "GET /​Fedora-11-i686-Live.iso HTTP/​1.1"​ 206 131072 "​-"​ "​-"​ 
 +192.168.0.1 - - [06/​Jul/​2009:​23:​56:​27 +0200] "GET /​Fedora-11-i686-Live.iso HTTP/​1.1"​ 206 4096 "​-"​ "​-"​ 
 +192.168.0.1 - - [06/​Jul/​2009:​23:​56:​27 +0200] "GET /​Fedora-11-i686-Live.iso HTTP/​1.1"​ 206 16384 "​-"​ "​-"​ 
 +192.168.0.1 - - [06/​Jul/​2009:​23:​56:​27 +0200] "GET /​Fedora-11-i686-Live.iso HTTP/​1.1"​ 206 32768 "​-"​ "​-"​ 
 +192.168.0.1 - - [06/​Jul/​2009:​23:​56:​27 +0200] "GET /​Fedora-11-i686-Live.iso HTTP/​1.1"​ 206 65536 "​-"​ "​-"​ 
 +</​code>​ 
 +I am not sure whether they are meaningful, but I have included them for reference.
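One thing the log does show: status 206 is HTTP "Partial Content", so every request was answered with a byte range rather than the whole ISO, which is what one would expect from a filesystem reading the image piecewise. A rough sketch for tallying such requests follows. The function name ''summarize_log'' is made up, and it assumes the usual Apache common/combined log layout.

```shell
# Summarize an Apache access log read from stdin: for each HTTP status,
# print "<status> <requests> <bytes>". Assumes the common/combined log
# format, where field 9 is the status and field 10 the response size.
summarize_log() {
    awk '{ n[$9]++; bytes[$9] += $10 }
         END { for (s in n) print s, n[s], bytes[s] }'
}
```

Usage would be something like ''tail -n 1000 /var/log/apache2/access.log | summarize_log''.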
 + 
 +===== Attempt-3 ===== 
 +I wanted to remove ''/etc/rc3.d/S80sendmail'', but I am not able to delete it: it says the filesystem is read-only.
 +But when I run the mount command, it shows the filesystem in ''rw'' mode. The following are the commands I tried.
 +<​code>​ 
 +# mount 
 +/dev/root on / type ext4 (rw,​noatime) 
 +proc on /proc type proc (rw) 
 +/sys on /sys type sysfs (rw) 
 +udev on /dev type tmpfs (rw,​mode=0755) 
 +/dev/pts on /dev/pts type devpts (rw,​gid=5,​mode=620) 
 + 
 +# rm /​sysroot/​etc/​rc3.d/​S80sendmail 
 +rm: cannot remove '/​sysroot/​etc/​rc3.d/​S80sendmail'​ : Read-only file system 
 + 
 +# mount / -o remount,​rw 
 + 
 +# rm /​sysroot/​etc/​rc3.d/​S80sendmail 
 +rm: cannot remove '/​sysroot/​etc/​rc3.d/​S80sendmail'​ : Read-only file system 
 + 
 +</​code>​ 
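One thing worth checking here (an assumption, not something confirmed by these logs): ''mount'' without arguments may print a cached table such as ''/etc/mtab'', and the earlier remount was done with ''-n'', which skips updating that table. The kernel's own view is in ''/proc/mounts''. The sketch below prints the options of the filesystem a path actually lives on; ''mount_opts_for'' is a made-up helper name.

```shell
# Print the mount options of the filesystem that contains a path.
# $1: path to look up; $2: mounts table (defaults to /proc/mounts).
# The longest matching mount point wins; on ties the later entry wins,
# since a later mount shadows an earlier one.
mount_opts_for() {
    awk -v p="$1" '
        BEGIN { best = -1 }
        {
            mp = $2
            # match the root, the mount point itself, or a parent of p
            if (mp == "/" || p == mp || index(p, mp "/") == 1) {
                if (length(mp) >= best) { best = length(mp); opts = $4 }
            }
        }
        END { print opts }
    ' "${2:-/proc/mounts}"
}
```

For example, ''mount_opts_for /sysroot/etc/rc3.d/S80sendmail'' would show whether ''/sysroot'' is really mounted ''ro'' even though ''/'' is listed as ''rw''.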
 + 
 +====== What is working? ​ ====== 
 +Single user mode is working fine, so the user gets a shell where they can do whatever they want.
 + 
 + 
 +====== Fedora 11 live over NFS ====== 
 +Trying to see if Fedora 11 live can boot over NFS.\\
 +The reasoning behind this experiment is that if it works over NFS, that may help in locating the problem.
 + 
 + 
 +==== Testing NFS setup ==== 
 +Exported ''/var/www'' (because it contains all the ISO images) over NFS. The following is an excerpt from ''/etc/exports'':
 +<​code>​ 
 +/var/www *(ro,​async) 
 +</​code>​ 
 +This NFS volume does get mounted properly on the local machine.
 +<​code>​ 
 +sudo mount 192.168.111.11:/​var/​www mpoint 
 +$ mount 
 +192.168.111.11:/​var/​www on /​home/​pravin/​Etherboot/​mpoint type nfs (rw,​addr=192.168.111.11) 
 +</​code>​ 
 + 
 +==== Testing the NFS mount from virtualization ==== 
 +The Fedora 11 live CD was booted with VirtualBox. The network was working, and the host machine was accessible: the URL http://192.168.111.11 correctly reached the host's website.\\
 +But the NFS mount failed with the following error.
 +<​code>​ 
 +# mount 192.168.111.11:/​var/​www /​home/​liveuser/​mpoint/​ 
 +mount.nfs: access denied by server while mounting 192.168.111.11:/​var/​www 
 +</​code>​ 
 +Why would this fail if mounting from localhost is working fine? 
 + 
 +Got help from rwcr and fixed the problem.\\
 +It seems one more option, ''insecure'', has to be added to the export options. Restating the explanation given by rwcr:
 +<​code>​ 
 +rwcr: Try making it /var/www *(ro,​async,​insecure) 
 +rwcr: Linux generally requires NFS requests to come from privileged ports, and the Fedora livecd might be using a nonstandard NFS mounter that doesn'​t do that. 
 +</​code>​ 
 + 
 +===== Next step : mount NFS partition from initramfs ===== 
 +Debian uses a special program called ''nfsmount'' for NFS mounting at boot time. I will try out both the mount command and the nfsmount utility.\\
 +The kernel modules are also needed.
 +The following modules and the executable ''/sbin/mount.nfs'' were needed for NFS to work:
 +   - sunrpc.ko 
 +   - lockd.ko 
 +   - auth_rpcgss.ko 
 +   - nfs_acl.ko 
 +   - nfs.ko 
 +In addition to this, I had to pass the option ''-o nolock'' for mount to work without problems:
 +<​code>​ 
 +mount "${NFS_PATH}" /iso -o nolock
 +mount /iso/Fedora-11-i686-Live.iso /sysroot -o loop -o ro
 +</​code>​ 
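The whole sequence can be sketched as one function. This is only an outline under assumptions: it supposes ''modprobe'' is available inside the initramfs (plain ''insmod'' with full module paths may be needed instead), and the names ''run'', ''nfs_boot_mount'' and ''DRY_RUN'' are invented for this example.

```shell
# Print the command instead of running it when DRY_RUN is non-empty.
run() {
    if [ -n "${DRY_RUN}" ]; then echo "$*"; else "$@"; fi
}

# $1: NFS export, e.g. 192.168.111.11:/var/www
nfs_boot_mount() {
    # Same order as the module list above, so dependencies load first.
    for mod in sunrpc lockd auth_rpcgss nfs_acl nfs; do
        run modprobe "$mod" || return 1
    done
    # nolock avoids needing the lock daemon this early in boot.
    run mount -o nolock "$1" /iso || return 1
    run mount -o loop,ro /iso/Fedora-11-i686-Live.iso /sysroot
}
```

Setting ''DRY_RUN=1'' before calling ''nfs_boot_mount 192.168.111.11:/var/www'' just prints the seven commands, which is handy for checking the order without touching the system.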
 + 
 +NFS works fine in run level 3. One can also start the GUI with ''startx'' after logging in as ''root'' from the multiuser prompt.\\
 +The only issue with Fedora over NFS is that ''plymouthd'' still causes a problem and is disabled, which somehow stops the GUI from coming up automatically.\\
 +So the user needs to log in at run level 3 and then run ''startx''.
 + 
 + 
 +==== HTTPFS improvement ==== 
 +Some progress has been made on the HTTPFS front as well. Until now, all the tests of booting over HTTPFS were done using ''qemu'', which is inherently slow because it is emulation.
 +When the same tests were run on VMware, which is much faster, errors started appearing after the run level 3 login prompt.\\
 +The errors are as follows; they are much the same as the errors that used to appear around the sendmail daemon before.
 + 
 +<​code>​ 
 +# startx 
 +-bash: /​usr/​bin/​startx:​ Input/​output error 
 + 
 +# top 
 +top : error while loading shared libraries: /​lib/​libncursesw.so.5:​ cannot read file data: Input/​output error 
 + 
 +EXT4-fs error (device dm-0): ext4_find_entry:​ reading directory #10651 offset 0 
 +</​code>​ 
 +Based on this observation, the errors appear to be thrown because of a delay in the response from fuse: ext4 gives up because of this delay.\\
 +Now I need to find a way to increase the tolerance for this delay.
 + 
 + 
 +===== Removing plymouth ===== 
 +As marc has suggested, remove ''plymouth'' from the original ISO and see if it works without plymouth.
 +If it doesn't, the blame can be put squarely on ''plymouth'' rather than on network-related complications. \\
 + 
 +==== Modifying ISO ==== 
 +Now the question is how to add a new initramfs to the ISO and still keep it bootable.\\
 +From the Knoppix Remastering Howto, the following is the command that works for Knoppix:
 +<​code>​ 
 +mkisofs -pad -l -r -J -v -V "KNOPPIX" -no-emul-boot -boot-load-size 4 \
 +   -boot-info-table -b boot/isolinux/isolinux.bin -c boot/isolinux/boot.cat \
 +   -hide-rr-moved -o /mnt/hda1/knx/knoppix.iso /mnt/hda1/knx/master
 +</​code>​ 
 + 
 +and I need to modify it so that it works for Fedora:
 + 
 +<​code>​ 
 +mkisofs -pad -l -r -J -v -V "Fedora-11-i686-Live" -no-emul-boot -boot-load-size 4 \
 +   -boot-info-table -b isolinux/isolinux.bin -c isolinux/boot.cat \
 +   -hide-rr-moved -o /var/www/iso/fedora_11.iso /home/pravin/Etherboot/git/BKO.git/pxeknife/red_hat/fedora_11_live_cd/newfedora
 +</​code>​ 
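The full remastering round trip might look roughly like the sketch below. It is an outline, not a tested recipe: the initramfs location ''isolinux/initrd0.img'' inside the ISO is an assumption, and ''run'', ''remaster_iso'' and ''DRY_RUN'' are names invented for this example. The mkisofs flags are the ones from the command above.

```shell
# Print the command instead of running it when DRY_RUN is non-empty.
run() {
    if [ -n "${DRY_RUN}" ]; then echo "$*"; else "$@"; fi
}

# $1: source ISO  $2: new initramfs  $3: work dir  $4: output ISO
remaster_iso() {
    src_iso=$1 new_initrd=$2 work=$3 out_iso=$4
    run mkdir -p "${work}/mnt" "${work}/tree"        || return 1
    run mount -o loop,ro "${src_iso}" "${work}/mnt"  || return 1
    run cp -a "${work}/mnt/." "${work}/tree/"        || return 1
    run umount "${work}/mnt"                         || return 1
    # Files copied from an ISO are read-only; make the tree writable.
    run chmod -R u+w "${work}/tree"                  || return 1
    # Assumed location of the initramfs inside the Fedora live ISO.
    run cp "${new_initrd}" "${work}/tree/isolinux/initrd0.img" || return 1
    run mkisofs -pad -l -r -J -v -V "Fedora-11-i686-Live" \
        -no-emul-boot -boot-load-size 4 -boot-info-table \
        -b isolinux/isolinux.bin -c isolinux/boot.cat \
        -hide-rr-moved -o "${out_iso}" "${work}/tree"
}
```

If the boot configuration locates the live image by CD label, the ''-V'' value probably has to stay identical to the label of the original ISO.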
 +===== running startx from single user mode ===== 
 +Tried an experiment: running ''startx'' from single user mode to see if it works.\\
 +Well, it did not work at all.
 + 
 +===== Problem Found ===== 
 +With the help of andyTim, the cause of the problem has been located.\\
 +The ''network'' and ''NetworkManager'' services restart networking, which breaks the existing HTTPFS mount.
 + 
 + 
 +===== Solution ===== 
 +The temporary solution tried is to delete both of the following files:
 +  - ''/etc/init.d/network''
 +  - ''/etc/init.d/NetworkManager''
 +So the user first has to boot into single user mode, delete the files above,
 +and then boot into runlevel 5.
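The deletion step can be sketched as a small function. The name ''remove_net_scripts'' is made up, and the root directory is a required argument so the function cannot be run against the wrong tree by accident.

```shell
# Remove the two init scripts that restart networking.
# $1: root of the target system, e.g. / from single user mode.
remove_net_scripts() {
    root="${1:?usage: remove_net_scripts ROOT_DIR}"
    for svc in network NetworkManager; do
        rm -f "${root}/etc/init.d/${svc}"
    done
}
```

From the single user shell this would be called as ''remove_net_scripts /'' before switching to runlevel 5.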
  
