So many of you guys have heard of these ‘buffalo terastation’ nas devices, right? It’s basically a little grey box with four 250 gig hard drives in it that you can raid0 into a 1TB array.
Well, I hacked the shit out of one at work :)
We use(d) these ‘snap appliances’ for backups, which are these silly little 1u boxes that have some hard drives and some weird embedded version of linux.
The one we use for disk to disk backups is basically dead on its feet. Bossman decided that until we got our apple xraid in, we should hit frys and get this device so that we had a backup device of SOME SORT in the interim.
So we go to frys and pick this thing up. Upon plugging it in, we find that it doesn’t have NFS support, only smb and ftp. That was lame. I found a wiki that showed me how to do a little hardware hacking and get console access to the device. It explains how to use a soldering iron to remove three 0-ohm resistors and solder on some jumpers.
Once that was complete, I had console access to the device, but no root. The wiki explained how to hack a file run every minute by cron, so it would create a sudoers file and let the admin user have root access. Once that was done, I followed some more instructions on the wiki describing how to install ‘dropbear’ (an ssh server) and a user-space nfs server.
Once this was complete, I literally squealed with delight. My first hardware hack! And it *WORKS*! Fucking awesome. So the litmus test to see if it worked was to copy a 4.2 gig dvd image across to the device via nfs as a proof of concept. The file was truncated at exactly 2 gigs, and the error ‘file size limit exceeded’ was thrown. Poo - as far as I could tell, the xfs filesystem on the device only has support for files smaller than 2 gigs. This certainly wouldn’t do. We needed to back up nearly 800 gigs of files, some of which are 82 gig full disk images.
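For the curious: a cutoff at exactly 2 gigs is the classic symptom of file offsets being kept in signed 32-bit integers. That cause is an assumption here (nothing above proves it for the terastation), and the sizes below are just the post's round numbers taken as decimal gigabytes. A minimal sketch of the arithmetic:

```python
# Largest byte offset a signed 32-bit integer can hold.
# ASSUMPTION: this is the cause of the 2 gig cutoff; it was not verified on the device.
MAX_SIGNED_32 = 2**31 - 1


def fits_in_32bit_offsets(size_bytes: int) -> bool:
    """Return True if a file of this size is addressable with signed 32-bit offsets."""
    return size_bytes <= MAX_SIGNED_32


dvd_image = 4_200_000_000    # the 4.2 gig dvd image, as decimal bytes
disk_image = 82 * 10**9      # one of the 82 gig full disk images

print(MAX_SIGNED_32)
print(fits_in_32bit_offsets(dvd_image))
print(fits_in_32bit_offsets(disk_image))
```

Neither file comes anywhere near fitting, which is why a box with this limit is a non-starter for disk images.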
In an attempt to hack some more, I tried to recreate the array using ext2 - but made a typo (typed /dev/hda1 instead of /dev/hda3) and ended up murdering the operating system partition on the device.
After that, about two days of attempting to recreate the OS partition ensued. The device itself is very limited as it runs linux kernel 2.4.20 off of flash memory, so it is not a complete operating system. I had to take the drives out of the device and use another OS to try to rebuild the raid.
There are four disks, and each disk has four partitions. Hda1 is the os, hda2 is swap, hda3 is the bulk of the space, and I didn’t bother looking to see what hda4 was. All of the ‘1’ partitions across all four drives are a raid1 with zero spares (so basically four identical copies of the 220 meg os partition), and the ‘3’ partitions are a software raid5. I attempted to rebuild these by plugging a good drive and the murdered drive into a dell gx240 workstation, booting into knoppix, then starting the raid using mdadm and hoping it would start rebuilding itself. It didn’t. I had to manually recreate the raid, but since the gx240 only has primary and secondary IDE channels, and I needed a cdrom for knoppix, I could only get 3 of the 4 drives plugged in. I was able to recreate the raid1, but upon plugging it back into the buffalo device, the boot sequence threw an ‘invalid superblock’ error and refused to mount the raid. It would drop me to this pseudo-os that left me as the user admin, not root, with no authority to try to remount/recreate the raids. Meh, if only I could get to single user mode or something.
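To make that layout a bit more concrete, here is a rough sketch of how much usable space each raid level gives you. These are the textbook formulas for equal-sized members, not anything read off the device, and the 230,000 meg size for the ‘3’ partitions is a made-up placeholder (the real size isn't given above):

```python
from typing import Sequence


def usable_capacity(level: str, member_sizes: Sequence[int]) -> int:
    """Textbook usable capacity of an array, in the same unit as member_sizes."""
    n = len(member_sizes)
    smallest = min(member_sizes)
    if level == "raid0":
        return n * smallest          # striping: every member contributes
    if level == "raid1":
        return smallest              # mirroring: n identical copies of one member
    if level == "raid5":
        return (n - 1) * smallest    # one member's worth of space goes to parity
    raise ValueError("unsupported raid level: " + level)


OS_PARTITION_MB = 220          # from the post: the 220 meg os partition
DATA_PARTITION_MB = 230_000    # HYPOTHETICAL size for the '3' partitions

print(usable_capacity("raid1", [OS_PARTITION_MB] * 4))
print(usable_capacity("raid5", [DATA_PARTITION_MB] * 4))
```

So the four-way raid1 really is just 220 megs of os stored four times over, and the raid5 gives up one drive's worth of data space for parity.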
The next day Zack (one of the other admins) was kind enough to bring in two PCI ide controller cards, each with two channels. I promptly put those into the gx240 and began rebuilding the raid1. Once that was complete, we remembered that the xfs partition that was on the raid5 was the digital equivalent of ass, since it didn’t support anything larger than 2 gigs, so I deleted and recreated the raid5 using different block configurations. Upon doing this, mdadm was rebuilding the raid at 600k per second. The estimated time to completion was something like 6400 minutes (which is over 100 hours). We let this go for a couple of hours, and after talking it over and some googling, concluded that we would be better off simply leaving the drives in the gx240, putting a fifth drive in there for the OS, and creating a 1TB raid0 under debian, so we had *FULL* control over the environment. In addition to that, we were pretty sure that the 2.4.20 kernel on that buffalo device didn’t have KERNEL level support for files over 2 gigs, and we didn’t want to wait 100 hours to find that out.
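Those two numbers hang together, by the way. A rebuild estimate is basically remaining size divided by current speed, so you can work backwards from 600k per second and 6400 minutes to the amount of data per drive. This is a back-of-the-envelope sketch that assumes the rate stays constant and that "k" means 1000 bytes:

```python
def eta_minutes(size_kb: float, rate_kb_per_s: float) -> float:
    """Minutes needed to sync size_kb at a constant rate."""
    return size_kb / rate_kb_per_s / 60


RATE_KB_PER_S = 600     # from the post
ETA_MINUTES = 6400      # from the post

hours = ETA_MINUTES / 60
implied_size_kb = ETA_MINUTES * 60 * RATE_KB_PER_S

print(round(hours, 1))                # a bit under 107 hours
print(implied_size_kb / 1_000_000)    # implied size per drive, in decimal gigs
print(eta_minutes(implied_size_kb, RATE_KB_PER_S))
```

That works out to roughly 230 gigs, which is about what you'd expect for the big partition on a 250 gig drive, so the estimate was not lying.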
So I happily cleaned up the plane wreckage from my desk, the floor, and two chairs and began work on the gx240.
The gx240 is a simple workstation with 512 megs of ram and a p4 2.4ghz processor. It has a gig nic which is nice (when we get a gig switch it’ll rule), but that’s really about its only perk. My objective was simple - make a place for backups.
My first issue was power. The power supply has only 5 plugs. That’s fine for operational use, but I needed a cdrom just long enough to do the installation. My first idea was to take a mem stick, put a debian install iso on it, and boot from it. That didn’t work as the gx240 doesn’t support booting from USB. My next idea was to take one of those 40 dollar external ide hard drive enclosures and plug a cdrom into it. I wasn’t sure if this would work, but I tried it anyway. My idea was to see if the gx240 could see the external macgyvered cdrom as a real cdrom and boot from it. That didn’t work - it made the workstation REALLY confused and it didn’t boot at all. (got past the POST, got to where it SHOULD have been booting, and just sat there going >:B )
Failing that, I tried powering the cdrom from the external chassis while plugging it into an internal ide channel. For those of you familiar with dell workstations, this one opens up kind of like a clamshell, so this thing was half open and had all sorts of crap running in and out of it. It’s too bad I didn’t have my camera on me, it would have been hilarious to post some pics. That actually worked! I managed to get it up and booting, but I think I made a bit of a booboo. I got into the debian installer, but it only saw three drives. Upon checking the dmesg log in another terminal, I saw that it was having some trouble with IRQs. I figured the other drives weren’t being seen because of the IRQ issues, but in retrospect, I should have continued into the installer to make absolutely sure. So I ended up taking the four drives offline and having ONLY the installation hard drive and the cdrom in the gx240.
When I finally got to this phase, the 120 gig western digital hard drive that was in the workstation crapped out. It started making ticking noises (which are head resets). Luckily I had just cannibalized a couple of netra x1’s, so I had two 40 gig seagate drives. I quickly put one of the 40 gig drives in and continued the installation.
Once complete, I did a full system upgrade to the unstable tree and installed kernel 2.6.16-1-686 and its matching headers. I was happy.. FINALLY.. forward progress! I plugged the other four drives in, and rebooted.
It puked during boot and dropped me to ‘busybox’, which is a limited shell environment for fixing things. Upon running dmesg, I saw that hda had become hdi (and I have NO IDEA why it did this). The primary hard drive on the primary IDE chain off the motherboard was being identified as HDI. For some reason the new kernel put the four drives on the PCI ide cards in front of the motherboard ide controllers. That was WEIRD. So I changed /etc/fstab to reflect the new ‘drive letters’, changed /boot/grub/menu.lst to match, and rebooted.
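If you ever have to do this kind of renaming across more than a couple of lines, the fiddly part is swapping names in a single pass so one rename doesn't trample another. Here is a small sketch of that idea. The fstab line is invented for illustration, and the only mapping actually known from the story is hda becoming hdi:

```python
import re


def remap_devices(text: str, mapping: dict) -> str:
    """Rewrite /dev/hdX names according to mapping, leaving partition numbers alone."""
    def swap(match):
        disk, partition = match.group(1), match.group(2)
        return "/dev/" + mapping.get(disk, disk) + partition

    # One pass over the text, so swapping two names cannot clobber either one.
    return re.sub(r"/dev/(hd[a-z])(\d*)", swap, text)


# HYPOTHETICAL fstab line; the real file's contents are not shown in the post.
old_line = "/dev/hda1 / ext3 defaults 0 1"
new_line = remap_devices(old_line, {"hda": "hdi"})
print(new_line)
```

The same function can be pointed at the root= entry in the grub config, since it only touches things that look like /dev/hdX names.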
It came back up okay, and I was able to start creating the raid0. After creating the raid and running mke2fs on it, the raid was only 650 gig. Four drives, each 250 gig, raid0.. 650 gig?! Something’s wrong.
I enlisted the help of a friend of mine, who was also stumped. I dug around for about two hours until I found in the dmesg log that it only thought hda1 had something like 170 gig and not 250.
dd if=/dev/zero of=/dev/hda count=5000 fixed that nice n quick. I dd’ed, then rebooted and it saw the full size. I created the raid0 which ended up being 917 gigs, and put the workstation in the server room. The gx240 series dells don’t have case fans, so those 5 drives in there are going to heat up pretty good. I figured putting the thing in the server room would help at least a little.
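Two bits of arithmetic sit behind those numbers. dd with no bs= argument works in 512-byte blocks, so count=5000 only zeroes the first couple of megs of the disk. And a "1TB" array showing up as 900-odd gigs is mostly drive makers counting in powers of ten while the tools count in powers of two. The sketch below assumes each drive holds exactly 250 * 10^9 bytes, which is a nominal figure and not a measured one:

```python
DD_DEFAULT_BLOCK = 512          # bytes per block when dd is given no bs=
DD_COUNT = 5000                 # from the command in the post

bytes_zeroed = DD_DEFAULT_BLOCK * DD_COUNT

NOMINAL_DRIVE_BYTES = 250 * 10**9    # ASSUMPTION: marketing "250 gig", decimal
DRIVES = 4

raid0_bytes = DRIVES * NOMINAL_DRIVE_BYTES
raid0_gib = raid0_bytes / 2**30      # what binary-minded tools would report

print(bytes_zeroed)
print(round(raid0_gib, 1))
```

That gives about 931 binary gigs before any filesystem is put on top. The rest of the gap down to 917 is presumably filesystem overhead, though that part is a guess.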
I installed NFS, and after a little troubleshooting managed to get all of our servers nfs mounted to it and writing backups at 70-90 megabit across the network. There were existing scripts to do this, but after running the one for our mail server (which has roughly 80-100 gigs of data to back up) I was only getting 10 gigs of data. So I used tar pipes instead, compressing the data with bzip2 on the fly, then uncompressing it on the destination machine. I started them at approximately 10:30pm on friday night, and now it is saturday, 6pm. I’ve copied 467 gigs of data from three of our main servers, with a few smaller servers to go.
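The idea of a tar pipe is that one end packs files into a compressed stream while the other end unpacks it as it arrives, so nothing ever gets staged as one giant archive on disk. The exact commands used aren't shown above, so this is only a stand-in: a Python sketch using the standard tarfile module with bzip2 streaming, and an in-process pipe playing the part of the network. The directory and file names are invented for the demo:

```python
import os
import tarfile
import tempfile
import threading


def pack(src_dir: str, write_fd: int) -> None:
    """Stream everything under src_dir into the pipe as a bzip2-compressed tar."""
    with os.fdopen(write_fd, "wb") as sink:
        with tarfile.open(fileobj=sink, mode="w|bz2") as tar:
            for name in os.listdir(src_dir):
                tar.add(os.path.join(src_dir, name), arcname=name)


def tar_pipe(src_dir: str, dest_dir: str) -> None:
    """Copy src_dir into dest_dir through a compressed tar stream."""
    read_fd, write_fd = os.pipe()
    packer = threading.Thread(target=pack, args=(src_dir, write_fd))
    packer.start()
    # Unpack while the packer thread is still writing, like the far end of a pipe.
    with os.fdopen(read_fd, "rb") as source:
        with tarfile.open(fileobj=source, mode="r|bz2") as tar:
            tar.extractall(dest_dir)
    packer.join()


# Tiny demo with throwaway directories standing in for a server and the backup box.
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    with open(os.path.join(src, "mail.txt"), "wb") as f:
        f.write(b"hello backup\n")
    tar_pipe(src, dst)
    with open(os.path.join(dst, "mail.txt"), "rb") as f:
        copied = f.read()

print(copied)
```

The compression costs cpu on both ends, so whether it wins depends on how compressible the data is and how slow the link is.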
This whole ordeal spanned 7 business days in total and here is what I learned:
1) Buffalo nas 1tb appliances are great for people who want to store movies, mp3s, or other stuff that is smaller than 2 gigs per file. They are NOT good for backups.
2) The dell gx240 workstation may have irq issues with more than 5 IDE devices in it.
3) Dell didn’t support booting from usb until AFTER the gx240
4) Plugging a cdrom into an external hard drive chassis and trying to boot from it doesn’t work
5) Tar pipes appear to be faster than rsync at moving data from one place to another
6) The people at my work are very very patient with me :) I owe a couple people some beers.
7) and on a COMPLETELY unrelated subject, I think you can put an IOS on the cisco 2900xl series that supports layer 3 switching!
*PHEW*
Next I get to see how fast I can get the data from that gx240 off of it, and onto our brand new apple xraid! :)

