DISCLAIMER
----------

Use this information at your own risk! I will not take responsibility of any
kind if you trash your system or it becomes unusable.

With this paper I do not want to break any patent/trademark/whatever licence,
but I am not an expert in this, I am "only" a technician. In case something
needs to be added/changed/removed please contact me immediately via email
(fabbione_at_fabbione.net)

All trademarks are the property of their respective owners. Usage of
trademarks does not constitute a challenge to their status or ownership.

MOTIVATIONS
-----------

I wish to underline that with this paper I only want to document the
achievement of a technical challenge in the pure spirit of a pioneer, mainly
due to my curiosity (and since some search engines didn't report any success
story about it).

THANKS
------

SUN (http://www.sun.com/) for building such a nice piece of hardware,
extremely well documented (http://docs.sun.com/) and exciting to play with.

Ericsson Telebit A/S (http://www.tbit.dk/) for allowing me to use their
resources outside working hours (yeah.. long nights without sleeping ;))

HP (http://www.hp.com/) for supporting, not only in words, the Debian
community (http://www.debian.org/partners/) together with all the others.

Ben Collins for assisting me in preparing a decent kernel.

All the teams around the Linux Kernel (http://www.kernel.org/) and the Debian
Project (http://www.debian.org/)

.. and amok .. my favourite irc bot ;)

HOWTO
*****

Here is the description of what I have done with what I had.
I assume you are familiar with Debian, the Linux kernel and SUN hardware.
This kind of procedure is not meant to be used by script kiddies that want to
look cool in front of their bosses.

HARDWARE REQUIREMENTS
---------------------

An e10k and its SSP are kinda mandatory.
I was allowed to use one SB board equipped with 4 CPUs, 4 GB of RAM,
1 x 4 Fast and a QLOGIC SCSI controller connected to a D1000 with a bunch of
disks (2x9GB allocated for playing). The board also has a SOCAL FC controller,
but it was not used for the installation even though it is recognized by the
kernel.

Another sparc64 machine on which to install Linux. I used my own
Netra T1 105 equipped with 2x9GB disks.

GENERAL PROBLEMS TO FACE
------------------------

- The e10k network console is not recognized 100% by Linux, so you will be
  able to see only a small portion of the boot process. Logs are still
  available in $SSPLOGGER//netcon, but they are sometimes unreadable.
  (http://lists.debian.org/debian-sparc/2004/debian-sparc-200402/msg00196.html)

- Kernel. In order to boot this machine easily you need at least kernel
  2.6.4-rc1.

  Note: it seems that at this point in time 2.6.4-rc1 has a memory leak that
  hangs the TCP connections after some hours (whether or not there is load on
  the machine.. on mine there wasn't) and it is regularly reproducible.

  All other kernels have a bug that limits their size, as described below.
  This was the major issue. In my setup any uncompressed kernel bigger than

    2124344 Feb 28 11:32 vmlinuz-2.6.3-sparc64-smp.allin

  will not boot. Config is available here:
  http://people.debian.org/~fabbione/e10k/sparc64

  Even just adding CONFIG_SUN_OPENPROMFS=y, which results in:

    2132536 Feb 28 11:32 vmlinuz-2.6.3-sparc64-smp

  gives a kernel that will not boot (to be more precise, it freezes during
  the boot process).

  As a starter I used Ben Collins' kernel images
  (http://www.phunnypharm.org/pub/for/sparc-folks/kernel-images-2.6/), but the
  location might change. Stay tuned on the debian-sparc mailing list, where
  Ben usually posts news. (http://lists.debian.org/debian-sparc)

- SSP doesn't like Linux reboots (well .. it doesn't know anything about
  Linux, so no one can be blamed for it). Because of this, the procedure for a
  clean reboot involves:

    halt
      (linux side)
    power -off && sleep 30 && power -on && bringup -A off
      (ssp side; note that -A off is not there by accident, and neither is
      the sleep!)

LUCKY WAY (untested but it should work without any problem)
---------

If you are lucky enough that your e10k storage system uses the same disks as
the ones you have in the extra sparc64 box, the only thing you have to worry
about is the kernel.

Install the machine exactly as you would like to have the e10k, and be sure
that you can gain ssh access to that installation.

Compile an appropriate kernel, be sure that it recognizes at least the network
cards and SCSI controllers on the e10k, and install it.

Stick the disks in the e10k storage system, configure the domain/obp, ask the
obp to boot from them and at the silo prompt specify the kernel image with -p
(mandatory to see some output of the boot process on the console) and you
should be ok.. just wait for the boot process to complete and there you go..
ssh in... you are done.

Perhaps you will have to fight a few times with the kernel size, though..
repeat until it works ;) and if you are not lucky keep reading below..

UNLUCKY WAY (tftpboot and nfsroot)
-----------

Do a MINIMAL installation of Debian on the extra machine, exactly as you
would have done if the e10k were a normal sparc station. Be sure you can gain
access to the installation via ssh.

Clone the installation into a top level dir like /e10k and be sure to change
its fstab to reflect an nfs root setup. You probably won't need to touch it
anymore, but for simplicity keep a tar of it handy.

Change the setup of the main sparc64 installation so that it does not conflict
with the e10k one (IP address and hostname, for example).

Compile a minimal kernel as described, adding support for ROOT_NFS (it will
require kernel IP autoconfiguration and the nfs client).
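Before copying a freshly built kernel to the tftp server, it can save a boot
cycle to check it against the two constraints described so far. The following
is only a minimal sketch: the helper names are made up for illustration, the
list of required options (CONFIG_NFS_FS, CONFIG_IP_PNP, CONFIG_ROOT_NFS) is my
assumption of what "ROOT_NFS, kernel auto config and nfs client" map to, and
the size limit is simply the largest image that booted in the setup described
above, so it may not hold for other kernels or machines.

```python
import os

# Size in bytes of the largest image that booted in the setup described in
# this paper (vmlinuz-2.6.3-sparc64-smp.allin). Treat it as a rough guide.
KNOWN_GOOD_SIZE = 2124344

# Assumed set of options for an nfs root; adjust for your kernel version.
REQUIRED_OPTIONS = ("CONFIG_NFS_FS", "CONFIG_IP_PNP", "CONFIG_ROOT_NFS")


def enabled_options(config_text):
    """Return the set of options built in (=y) in a kernel .config text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip comments such as "# CONFIG_FOO is not set" and blank lines.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value == "y":
            enabled.add(name)
    return enabled


def missing_options(config_text, required=REQUIRED_OPTIONS):
    """List the required options that are not built into the kernel."""
    enabled = enabled_options(config_text)
    return [opt for opt in required if opt not in enabled]


def image_fits(path, limit=KNOWN_GOOD_SIZE):
    """True if the kernel image file is no bigger than the given limit."""
    return os.path.getsize(path) <= limit
```

Typical use would be to read the .config of the build tree, print whatever
missing_options() returns, and then call image_fits() on the resulting
image. Options built as modules (=m) are deliberately reported as missing,
since a module cannot help before the root filesystem is mounted.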
Set up rarpd and tftpd to allow the e10k domain to boot using the new shiny
kernel you just created. Export /e10k via nfs.

From the domain/obp, boot from the network with the full syntax as specified
in the kernel documentation. I used something like:

  boot net -p root=/dev/nfs nfsroot=192.168.0.2:/e10k ip=192.168.0.72:192.168.0.2:192.168.0.1:255.255.254.0:netra:eth0

Repeat until you can ssh to the shiny e10k installation.
(I seriously doubt someone will manage to boot at the first kernel attempt ;))

Note: I really suggest running tcpdump while the domain is booting. If the
console goes bananas it's the only way to see whether the boot process is
hanging, unless you want to try to decode the mess inside $SSPLOGGER, since
the same chars that make the console unreadable are sent to $SSPLOGGER as
well.

Once you manage to ssh to the e10k installation everything becomes way too
easy :-)

Partition the disks assigned to the domain, activate the swap (if you don't
have too much memory), mount the partition(s) somewhere and untar the e10k
tarball into this directory (I am sure I wrote to keep it handy, didn't I?)

Copy the booting kernel into it as well.

Mount /proc under the "somewhere" directory and chroot to it.

Set up silo to boot with the kernel, and remember to edit fstab to reflect
your partition layout.

Exit the chroot and reboot (remember the notes above about rebooting!)

Configure the obp to boot from the disk.

At the silo prompt specify the kernel image to use and remember to add -p, at
least for the first few times.. otherwise stick it in silo.conf

You should be rocking now :-)

AND NOW WHAT?
-------------

Now we need to work on getting several things done.
The first absolute priorities are:

- Getting the console working as well. Relying on ssh isn't the best solution.
- Perhaps asking SUN to help us, if they are interested in going further
  (DCS would be really cool!)
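As a footnote to the network boot step in the UNLUCKY WAY section: the long
boot line is easy to mistype at the obp prompt, so here is a small sketch that
assembles it from its parts. The function name is hypothetical. The field
order (client, server, gateway, netmask, hostname, device) follows my reading
of the nfsroot section of the kernel documentation, so double check it against
the documentation of the kernel you are actually booting.

```python
def nfsroot_boot_args(client_ip, server_ip, gateway_ip, netmask,
                      hostname, device, export_path):
    """Build the kernel arguments for an nfs root boot.

    Only joins the given strings; no validation of the addresses is done.
    """
    ip_fields = [client_ip, server_ip, gateway_ip, netmask, hostname, device]
    return "root=/dev/nfs nfsroot={}:{} ip={}".format(
        server_ip, export_path, ":".join(ip_fields))


if __name__ == "__main__":
    # Values taken from the boot line used in this paper.
    args = nfsroot_boot_args(
        client_ip="192.168.0.72",
        server_ip="192.168.0.2",
        gateway_ip="192.168.0.1",
        netmask="255.255.254.0",
        hostname="netra",
        device="eth0",
        export_path="/e10k",
    )
    # "boot net -p" is the obp part of the command; the rest goes to the kernel.
    print("boot net -p " + args)
```

With the values above this reproduces the boot line I used, which makes it a
convenient starting point when only the addresses differ.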