Synopsis: Need to move and resize an EXT3 partition.
System Configuration: Gentoo Linux x86_64 2.6.20
RAID-1 mirror with 60GB of unallocated space at the start of the drive and a 100GB EXT3 partition.
Brief Backstory: After setting up my media PC with Windows Vista but before my extended business trip to the Philippines, I decided to convert my desktop to a Linux server. When I first purchased this machine over 3 years ago (see historical blog postings for details), I had an extra 40GB hard drive installed, on which I installed 64-bit Gentoo Linux. Well, 3 years later, I decided to ditch Windows and use Linux on this machine exclusively. This required a somewhat painful upgrade process that I have yet to finish; I managed to update the kernel and some system utilities before stopping.
Anyhow, one of the changes I made to my system after switching to Linux exclusively was picking up a second 160GB SATA hard drive so that I could have some redundancy for my data. Now that I have returned from the Philippines and have also accumulated another 200GB of TV content, I am feeling the need to A) add a lot more capacity to my server and B) add a backup medium (aka large external hard drive) that I can rsync to. But in the meantime, I wanted to test the server to see how well it would hold up and also allow me access to my data while I was gone. The Linux server proved to be a success; it ran for over 215 days without a crash. Eventually I had to reboot it to prepare for the partition move and resize, but the Linux system is the most stable of the 3 OSs I currently use daily (Windows XP/Vista, Mac OS X, and Linux).
So, having proven to myself that Linux will be a fine replacement for my desktop OS and will function very well for my future server (and being only 1GB from filling the 100GB data/music partition), I decided to bite the bullet and try to move and resize my EXT3 data partition. I rarely had problems moving NTFS and FAT partitions in the past, so this should be a snap! I have much of my music backed up at mp3tunes.com and a burned CD of my data folder from the past year or so which should be good enough. (In retrospect, I should have done one final backup of everything before proceeding, but more on that later.)
I referenced this guide to start. The guide is most useful if you need to expand an EXT partition to the right (or shrink a partition), and I would definitely come back to it for that; it was a good reference here nonetheless. I used the following commands (from Page 2) to convert my EXT3 partition to EXT2 (i.e. remove the journal), running fsck/e2fsck before and after to check the filesystem before proceeding.
fsck -n /dev/sda1
tune2fs -O ^has_journal /dev/sda1
e2fsck -n /dev/sda1
I rebooted my computer and loaded Paragon Partition Manager v5, since it was able to recognize my RAID and the EXT3 (now EXT2) partition. Paragon reported no warnings or errors while moving the partition to the beginning of the drive and resizing it. However, after booting into Linux again and trying to perform a file system check, e2fsck reported the following warning messages repeatedly (acting on different Groups and Blocks):
Pass 1: Checking inodes, blocks, and sizes
Group 0's inode table at 13 conflicts with some other fs block.
Illegal block #13310 (4294967295) in inode 7. IGNORED.
I used the following command to tell e2fsck to relocate these filesystem structures, providing more console output and status updates. Without the -C 0 and -v, I found that e2fsck would run for days without ending; only when I added "-C 0" (which displays the progress bar) did I notice that e2fsck would loop forever (due to the failure described in the next paragraph).
e2fsck -C 0 -y -v /dev/sda1
After the above warning messages disappear, once e2fsck appears to finally be performing the relocations, the following error message appears repeatedly (acting on different Groups and Blocks):
Error allocating 512 contiguous block(s) in block group 729 for inode table: Could not allocate block in ext2 filesystem
After displaying the last Error, e2fsck will start over again, asking to relocate the same filesystem structures again.
At this point, I found myself fairly frustrated; if only I had broken the RAID mirror and/or rsync'd my partition before this had happened, I could have either copied from one drive to the other and rebuilt the RAID or created a new filesystem and copied the data back. But at the same time, this was a good chance for me to learn more about EXT2 filesystems. I did enjoy that 500-level OS class I took at UW-Madison a few years back, and I am using Linux...
(I apologize for missing any links that I have visited while researching this problem; I will do my best to reference links at the end of this article.)
It was time to update e2fsprogs. Gentoo makes this really easy. I'm not sure what version I had been using, but for the remainder of my work I have been using v1.40.8. Regardless of whether you are using Gentoo or not, I advise downloading the tarball of e2fsprogs, because manually compiling and modifying the source code was ultimately how I was able to restore my filesystem.
I started by using the tool dumpe2fs, which provided me with the superblock information along with detected groups, etc.
If that command completed its dump successfully, you'll know that either your primary or one of your backup superblocks is not corrupted. It will also provide you with useful information such as block size and blocks/group, which will be important when you need to calculate the location on disk to perform dumps of your raw disk.
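As a rough sketch of that calculation, the helpers below turn a block number into a byte offset and dump a single block without writing anything. The function names are made up for illustration, and the example values (a 4096-byte block size, 32768 blocks per group, first data block 0) are assumptions; substitute whatever dumpe2fs reports as "Block size", "Blocks per group" and "First block" for your filesystem.

```shell
# Illustrative helpers; names and example values are assumptions,
# not taken from e2fsprogs.

# Byte offset of a filesystem block, counted from the start of the partition.
block_offset() {        # usage: block_offset BLOCK BLOCK_SIZE
    echo $(( $1 * $2 ))
}

# First block of a block group.
group_first_block() {   # usage: group_first_block GROUP BLOCKS_PER_GROUP FIRST_DATA_BLOCK
    echo $(( $3 + $1 * $2 ))
}

# Hex-dump exactly one block. dd only reads from the device here.
dump_block() {          # usage: dump_block DEVICE BLOCK BLOCK_SIZE
    dd if="$1" bs="$3" skip="$2" count=1 2>/dev/null | od -A x -t x1
}
```

For example, `block_offset 1539 4096` gives the byte offset of block 1539 under the assumed block size, `group_first_block 729 32768 0` gives the first block of the group named in the e2fsck error above, and `dump_block /dev/sda1 1539 4096` prints that block as hex.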
Next, I used the tool debugfs, which comes with e2fsprogs. This is a very useful tool because it allows you to perform normal filesystem access commands against your raw drive without mounting. I exclusively used 'ls' (and 'cd' when I could), but you could even create a directory if you wanted. Run the tool using the following command:
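A minimal sketch of a read-only session, assuming the same /dev/sda1 device used throughout this post. Without the -w flag debugfs opens the filesystem read-only, and -R runs a single request instead of starting the console. The wrapper name fs_ls and the DEBUGFS override variable are hypothetical and only here for illustration.

```shell
# Hypothetical helper: list a directory with debugfs without mounting.
# DEBUGFS can be overridden, e.g. to point at a locally built ./debugfs.
DEBUGFS="${DEBUGFS:-debugfs}"

fs_ls() {               # usage: fs_ls DEVICE [DIRECTORY]
    dev=$1
    dir=${2:-/}
    # No -w flag: the device is opened read-only.
    # -R runs one request and exits instead of starting the console.
    "$DEBUGFS" -R "ls -l $dir" "$dev"
}

# Interactive equivalent: debugfs /dev/sda1   (then type 'ls' at the prompt)
```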
Once the debugfs console becomes available, type '?' or 'help' for a command list or just jump to 'ls'. When I tried 'ls', I received the following error:
EXT2 directory corrupted
I could not 'cd' or 'ls' other directories because apparently the root inode was corrupt! So debugfs wasn't much help at this point.
The next tool I stumbled upon was lde, or Linux Disk Editor. This tool needs to be installed separately; it allows for raw access of the partition so that you can look up Blocks, Inodes, etc. LDE has a basic graphical interface, which also helps. Use the following command to run lde:
The primary superblock is located 1024 bytes from the start of the partition, which puts it in Group 0, Block 0 (if block size > 1024) or Block 1 (if block size = 1024). Type 'B' to enter the block view mode, then scroll down until you reach 0x400. To change blocks, use '#0', replacing 0 with the block number you desire. Check out this site for the superblock structure definition; based on the dumpe2fs output, you should be able to compare values and verify that you are looking at the superblock in lde.
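A quick way to confirm you are looking at the right place, sketched under the assumption that you want a non-interactive check: the ext2 superblock carries the magic number 0xEF53 at byte 56 of the structure, so it should show up at offset 0x438 (0x400 + 0x38) as the bytes 53 ef. The function name sb_magic is made up for this example.

```shell
# Print the two magic bytes of the primary superblock exactly as stored on disk.
# The superblock starts 1024 bytes into the partition and s_magic sits
# 56 bytes into the superblock, hence the skip of 1024 + 56.
sb_magic() {            # usage: sb_magic DEVICE_OR_IMAGE
    dd if="$1" bs=1 skip=$(( 1024 + 56 )) count=2 2>/dev/null \
        | od -A n -t x1 | tr -d ' \n'
    echo
}
```

On a healthy little-endian filesystem `sb_magic /dev/sda1` should print 53ef; seeing ef53 instead would suggest that the superblock itself was written byte-swapped.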
Although the root inode block number can be extracted by following the superblock information, I decided to go ahead and start adding debug printf statements to debugfs.c. Once I found the block number for the root inode, I went to the block in lde and was able to view the directory listings without any trouble. But debugfs continued to have problems.
I also tried the following command to force e2fsck to byte-swap the data before trying to access the partition.
e2fsck -S /dev/sda1
This produced the same "EXT2 directory corrupted" error message that I received when performing an 'ls' with debugfs. I started searching Google for information about byte-swapping an EXT2 filesystem. I discovered in a forum and by reading e2fsprogs change logs that e2fsprogs for PowerPC actually supported both little-endian and big-endian filesystems and had a way to distinguish between the two. However, at some point the ext2 on-disk format was standardized as little-endian on every architecture, and the PowerPC community followed by disabling big-endian support. The x86-compiled e2fsprogs has only ever supported little-endian, but provided the "-s" and "-S" flags to byte-swap the partitions.
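To see why a directory block in the wrong byte order looks corrupt, here is a small illustration of what byte-swapping does to the fixed-width fields at the start of a directory entry (a 32-bit inode number followed by a 16-bit record length). This only shows the general idea; it is not the routine e2fsprogs uses, and the names swap32 and swap16 are made up.

```shell
# Reverse the byte order of a 32-bit value and print it as hex.
swap32() {              # usage: swap32 VALUE   (decimal or 0x-prefixed hex)
    v=$(( $1 ))
    b0=$(( v & 0xff ))
    b1=$(( (v >> 8) & 0xff ))
    b2=$(( (v >> 16) & 0xff ))
    b3=$(( (v >> 24) & 0xff ))
    printf '0x%08x\n' "$(( (b0 << 24) | (b1 << 16) | (b2 << 8) | b3 ))"
}

# Reverse the byte order of a 16-bit value and print it as hex.
swap16() {              # usage: swap16 VALUE
    v=$(( $1 ))
    printf '0x%04x\n' "$(( ((v & 0xff) << 8) | ((v >> 8) & 0xff) ))"
}
```

The root directory's inode number in ext2 is 2. `swap32 2` prints 0x02000000, which is 33554432 in decimal and far beyond the inode count of a 100GB filesystem, so a tool reading such an entry has good reason to call the directory corrupted. Swapping a value twice returns the original, which is why a swap must be written back to disk exactly once.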
So I decided to try enabling the byte-swap in debugfs by temporarily defining do_swap=1 in the ext2fs_read_dir_block2 function found in lib/ext2fs/dirblock.c. I initially enabled it without any logical conditions, which meant that every directory block that needed to be read was byte-swapped. Anyways, I found that I was able to perform an 'ls' using debugfs after byte-swapping the root inode. However, every other directory inode was not byte-swapped, so I used the following line of code to only byte-swap the root inode:
if(block == 1539) do_swap=1;
Where "1539" was the root inode block number.
I was then able to access all directories with debugfs :) The final step was to save the root inode after byte-swapping the inode structure elements. To do this, I decided to use an internal write function to perform this operation when I typed 'ls' once in debugfs. There may be a better, safer way to do this, but I knew that I only needed to run this once and quit, so this is the way I accomplished this. After the while loop but before the return in ext2fs_read_dir_block2, I added the following lines of code:
if (block == 1539) {  /* only write back the byte-swapped root block */
    printf("\nwriting to disk\n");
    retval = io_channel_write_blk(fs->io, block, 1, buf);
}
Where "1539" was my root inode block number.
After compiling this in, run debugfs with the '-w' flag to allow debugfs to modify the filesystem.
sudo ./debugfs -w /dev/sda1
Run 'ls' once and quit. Don't run 'ls' more than once or you'll byte-swap again. Remove the above lines, remove the do_swap=1 line, recompile and run debugfs. 'ls' should now be able to work with all directory inodes.
If you made it this far, great! Go ahead and mount the filesystem, immediately performing an rsync dump or copy to an extra drive!
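A cautious way to do that copy, sketched with hypothetical mount points and a made-up helper name: mounting read-only is my own assumption here, chosen so that nothing else gets written to the freshly repaired filesystem until the data is safe somewhere else. Setting RUN=echo prints the commands instead of running them.

```shell
# Hypothetical helper. Set RUN=echo for a dry run that only prints the commands.
RUN="${RUN:-}"

rescue_copy() {         # usage: rescue_copy DEVICE MOUNTPOINT DESTINATION
    # Mount read-only so the repaired filesystem is not modified further.
    $RUN mount -o ro "$1" "$2" || return 1
    # -a preserves permissions, ownership and timestamps; -H keeps hard links.
    # The trailing slashes copy the contents of the mount point, not the directory itself.
    $RUN rsync -aH "$2"/ "$3"/
}
```

For example, `RUN=echo; rescue_copy /dev/sda1 /mnt/recovered /mnt/backup` shows what would be executed; unset RUN (and run as root) to perform the copy for real.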
I hope this blog proves useful for people with corrupted filesystems as a result of EXT2/3 partition moving with a commercial partition manager. Feel free to leave comments and I will try to help you out as much as I can. This was a great learning experience, but next time I hope to do it with data that can safely be scrapped :/
Links (in no particular order):
For those of you unable to recover your filesystem, try the free Windows tool below: