Replacing the fans on a Poweredge 2800

Fans of a PowerEdge 2800

The fans of my PowerEdge 2800. Front the cpu fans (stock Delta left, replacement everflow right); at the back the 120mm case fans (left the Artic, right the stock Nidec)

My “new” PowerEdge was way too loud. Like, jumbo-loud. I’ve chosen to replace all the fans with quieter models, but oh boy, this turned out to be a major pain! As always, I didn’t follow the most important rule when tinkering: do stuff incrementally, and test after each step. Well, maybe I’ll learn someday.

For reference, here the fan models:

Funtion Dell Replacement
4x Memory/Disk, 120x120x32mm, PWM Nidec Beta V TA350DC, M34789-35 Arctic F9 PWM
2x PSU, 60x60x25mm, Tacho Nidec Beta V TA225DC, B34605-33 Akasa AK-192BKT-B
2x CPU, 60x60x35mm PWM Delta AFB0612EHE Everflow F126025BU

I’ve changed all the fans in one go and ended up with a system that didn’t boot anymore. It didn’t even get to the end of the BIOS initialization, it shut down directly. I gathered that it must be the power supply’s fans (else I would have had a warning message on the POST). I’ve checked and I realized that I bought some fans with a build-in thermal control; a sensor that slows down the fan rotation. Well, as I had them already I hacked them up to ignore the sensor (cracked open sensor casing, soldered two pins together; no resistance acts as if the sensor measures a very high temperature).

Connector for PowerEdge 2800 fans

The connector with the soldered cables of the everflow fan. Here, the yellow and blue cable on the connector has been switched.

Next problem was that all fans spun at full speed. Impressively noisy, even for a seasoned cluster admin as me (I could hear the vibration 2 floors below). The reason were swapped PWM and Tacho pins (as I’ve found out after trying for a night to install Dell OpenManage 6.4 on Ubuntu 11.04 64bit). I don’t know why, but I needed different PIN configurations that every other source I’ve found on the Internet. Here’s what I’ve used in the end (This is looking on the bottom of the connector):

| 2 | 1 |-+
+---+---+ |
| 4 | 3 |-+
Pin Funtion Dell Everflow Arctic
1 VCC red yellow red
2 Control/PWM blue blue blue
3 Sense/Tacho yellow green yellow
4 Ground black black black

So everything is exactly as in the datasheets, except the swapped Control and Sense pins. After I’ve swapped them, the server became quite quiet and I quite happy. :D Actually, my desktop is now more noisy. Argh! BTW, thanks a lot to Brent Ozar for his blog entry on making a power edge quieter.


Now I seem to have another problem: as reported by Brent Ozar and others

One problem shown above is that sometimes fans spin slow enough that they trigger Dell’s thresholds for slow-moving fans. Gotta figure out how to fix that for good one of these days.

Damn. Now I either have to find a way to make the fans a little bit faster or to change the threshold. See the follow-up post for more info on the threshold problem.

Update 2

I finally managed to adjust the critical fan thresholds by patching the BMC firmware! Here’s the howto. Additionally, I created a project page for my server.

Installing Dell OpenManage on Ubuntu 11.04 64bit

Installing Dell OpenManage 6.50 on Ubuntu 11.04 64bit (Meerkat? Olifant? damn names) was a pain. Mostly because Debian-based systems are only supported since recently and thus there is too much outdated info out there (mostly grisly hacks how to force the rpm-based install to work on your system).

Thus, here a very quick summary:

  1. Follow instructions on this page:
  2. Afterwards, in order to be able to authenticate, edit /etc/pam.d/omauth as follows:
    -auth required /lib32/security/ nullok
    -auth required /lib32/security/
    -account required /lib32/security/ nullok
    +auth required /lib/x86_64-linux-gnu/security/ nullok
    +auth required /lib/x86_64-linux-gnu/security/
    +account required /lib/x86_64-linux-gnu/security/ nullok

Surprisingly painless (if I would have had this information 5 hours earlier). Ah yes, don’t forget to open your firewall for port tcp 1311… :)

Update: I created a project page for my server.

Reflashing a semi-bricked Hitachi/LG XBox 360 dvd drive

This app will flash single sectors in the Hitachi/LG firmware using debug flash erase/program routines that exist in the 3120L firmware. The difference to SeventhSon’s or the usual flashers is that it utilizes code that isn’t located
in one of the sectors that are changed when applying a hacked firmware. So if something went wrong during flashing, there might be a wee chance that the debug code works and can still be used to fix your drive.

All of this is based on Kev’s/SeventhSon’s work. I simply put together two of his programs to create a flasher that uses the Hitachi debug commands to flash sectors in the drive’s flash. All credit goes to SeventhSon. I even copied this text from his site. :) Unfortunately, his site vanished, so there are none of his notes left.

Use at your own risk, this may break your 360, 360 DVD drive and/or PC if done improperly (or if I happen to have made any mistakes).

Note: This does not work in recovery mode. This tool only helps if your checksum was patched (reports ok), but flashing does not work due to overwritten upload/execute handlers. If you don’t know what I’m talking about, leave it be.

Warning: This app is an interim solution intended only for hackers who know what they are doing. It is very easy to kill your drive with this program. If you fail to update the firmware checksum before you power down the drive, you will break your drive. If you overwrite any of the upload and execution command handlers with broken code, you will break your drive. If you overwrite your flash entry point code, you will break your drive. If you overwrite the sector containing your AES key without backing up your key first, you will break your drive and will not be able to repair it. If you do not understand everything that I just said, then this app isn’t for you.


Flashing sectors in the Hitachi/LG drive’s firmware from a PC

In the following examples encrypted_fw.bin is a full firmware dump. It must be encrypted. Encrypt it after you make your changes and before you run the flasher. There are tools somewhere on the internet that allows you to de/encrypt firmware images. So, encrypted_fw.bin would be created like so (in Windows),

C:\> memdump_win e 12200 8 8000 firmware.bin

(then modify your target sector within firmware.bin)

C:\> FirmCrypt e firmware.bin encrypted_fw.bin

(then flash the modified sector as per the examples below)

You will almost certainly break your drive if you do not encrypt the firmware image before flashing.

The next argument is the address of the target sector in the MN103’s address space in hexadecimal. Just add 0x90000000 to the sector’s offset into the firmware dump to get this value. The final argument is the sector size. The 3120L erase/program routines support a few devices, some appear to have 8KB sectors (0x2000 bytes), others 4KB sectors (0x1000 bytes). My drive has a SST39SF020A flash device (with 4KB sectors). I’m not sure if any Hitachi-LG drives exist with a different chip, but my app supports them just in case. Make sure you specifiy the sector size in hexadecimal.

Please. Don’t use this app if you don’t understand all of the above.

Linux example:

$ ./debug_flashsec /dev/sdb ./encrypted_fw.bin 9003F000 1000

Windows example:

C:\> debug_flashsec_win e encrypted_fw.bin 9003F000 1000

Note that the drive does not restart after each sector flash, nor does it need to be restarted. So you can change multiple sectors in one sitting. For example, a typical session might look like this (in Windows)

C:\> debug_flashsec_win e encrypted_fw.bin 90006000 1000
C:\> debug_flashsec_win e encrypted_fw.bin 90010000 1000
C:\> debug_flashsec_win e encrypted_fw.bin 90027000 1000
C:\> debug_flashsec_win e encrypted_fw.bin 9003E000 1000

This example shows 3 sector changes followed by a final sector change to update the firmware with a new correct checksum value. This final checksum update flash must always be performed unless you actually want the checksum to fail the next time your drive powers up (which will leave you stranded in recovery mode, not a great place to be).