How to adjust the fan thresholds of a Dell PowerEdge

Adjusted lower critical thresholds for the fans of a PowerEdge 2800

Note: you MUST replace your fans with slower, quieter ones to reduce the noise. The threshold adjustment discussed in this article only makes that swap possible; without new fans, it’s useless!

Intro

In order to swap the fans on a Dell PowerEdge for slower, quieter ones, you have to adjust the lower critical threshold (LCR). If you don’t, the server’s firmware lowers the fans’ speed below its own LCR, panics, spins them back up to 100%, lowers them again, and so on. Very noisy, very annoying.

This behavior is controlled by the BMC, an embedded management controller. You can configure many parameters of the BMC using the IPMI protocol. Unfortunately, the BMC firmware of a Dell PowerEdge does not allow you to change the thresholds mentioned above. I contacted Dell support, and they refused to change the thresholds for such an old server.

So I had no choice but to change them myself. It took me quite a while to isolate the proper settings in the BMC’s firmware, work out the checksums, etc. But I managed, and the server now runs very quietly with adjusted thresholds.

Below, I explain how to adjust these thresholds with a Python script I wrote. Note that you’ll need Python 2.6 in order to run the script. If someone is interested, I can also write up how I did it, but that is for another post.

Update: I created a project page for my server.

The result

First, here’s the result: my PowerEdge 2800 with swapped fans and patched fan thresholds.

This was recorded with my laptop, 10cm/4in in front of the server. The system is now quieter than my desktop!

Prerequisites

I assume that you have a sufficiently recent Linux distribution up and running, with Python installed and IPMI set up. If you don’t, have a look at this article that explains how to get a recent Ubuntu version running (without installing anything on your hard disk!).

Adjusting the fan thresholds

I assume that you now have FreeIPMI installed and the BMC configured, and that you can query the BMC using IPMI.

  1. Query the sensors
    First, you have to query the sensors of your server using IPMI. The output should look a bit like this:

    you@server$ ipmi-sensors
    1: Temp (Temperature): NA (NA/125.00): [NA]
    2: Temp (Temperature): NA (NA/125.00): [NA]
    3: Ambient Temp (Temperature): NA (3.00/47.00): [NA]
    4: Planar Temp (Temperature): NA (3.00/72.00): [NA]
    5: Riser Temp (Temperature): NA (3.00/62.00): [NA]
    6: Temp (Temperature): NA (NA/NA): [NA]
    7: Temp (Temperature): NA (NA/NA): [NA]
    8: Temp (Temperature): 71.00 C (NA/125.00): [OK]
    9: Temp (Temperature): NA (NA/125.00): [NA]
    10: Ambient Temp (Temperature): 27.00 C (3.00/47.00): [OK]
    11: Planar Temp (Temperature): 46.00 C (3.00/72.00): [OK]
    12: Riser Temp (Temperature): 50.00 C (3.00/62.00): [OK]
    13: Temp (Temperature): NA (NA/NA): [NA]
    14: Temp (Temperature): NA (NA/NA): [NA]
    15: CMOS Battery (Voltage): NA (2.64/NA): [NA]
    16: ROMB Battery (Voltage): [NA]
    17: VCORE (Voltage): [State Deasserted]
    18: VCORE (Voltage): [NA]
    19: PROC VTT (Voltage): [State Deasserted]
    20: 1.5V PG (Voltage): [State Deasserted]
    21: 1.8V PG (Voltage): [State Deasserted]
    22: 3.3V PG (Voltage): [State Deasserted]
    23: 5V PG (Voltage): [State Deasserted]
    24: 5V Riser PG (Voltage): [State Deasserted]
    25: Riser PG (Voltage): [State Deasserted]
    26: CMOS Battery (Voltage): 3.11 V (2.64/NA): [OK]
    27: Presence  (Entity Presence): [Entity Present]
    28: Presence  (Entity Presence): [Entity Absent]
    29: Presence  (Entity Presence): [Entity Present]
    30: Presence  (Entity Presence): [Entity Absent]
    31: ROMB Presence (Entity Presence): [Entity Present]
    32: FAN 1 RPM (Fan): NA (1575.00/NA): [NA]
    33: FAN 2 RPM (Fan): NA (1575.00/NA): [NA]
    34: FAN 3 RPM (Fan): NA (1575.00/NA): [NA]
    35: FAN 4 RPM (Fan): NA (1575.00/NA): [NA]
    36: FAN 5 RPM (Fan): NA (1575.00/NA): [NA]
    37: FAN 6 RPM (Fan): NA (1575.00/NA): [NA]
    38: FAN 1 RPM (Fan): NA (2025.00/NA): [NA]
    39: FAN 2 RPM (Fan): NA (2025.00/NA): [NA]
    40: FAN 3 RPM (Fan): 4875.00 RPM (2025.00/NA): [OK]
    41: FAN 4 RPM (Fan): 4800.00 RPM (2025.00/NA): [OK]
    42: FAN 5 RPM (Fan): 1800.00 RPM (900.00/NA): [OK]
    43: FAN 6 RPM (Fan): 1950.00 RPM (900.00/NA): [OK]
    44: FAN 7 RPM (Fan): 1875.00 RPM (900.00/NA): [OK]
    45: FAN 8 RPM (Fan): 1875.00 RPM (900.00/NA): [OK]
    46: Status  (Processor): [Processor Presence detected]
    47: Status  (Processor): [NA]
    48: Status  (Power Supply): [Presence detected]
    49: Status  (Power Supply): [NA]
    50: VRM  (Power Supply): [Presence detected]
    51: VRM  (Power Supply): [Presence detected]
    52: OS Watchdog (Watchdog 2): [OK]
    53: SEL (Event Logging Disabled): [Unknown]
    54: Intrusion (Physical Security): [OK]
    55: PS Redundancy (Power Supply): [NA]
    56: Fan Redundancy (Fan): [Fully Redundant]
    73: SCSI Connector A (Cable/Interconnect): [NA]
    74: SCSI Connector B (Cable/Interconnect): [NA]
    75: SCSI Connector A (Cable/Interconnect): [NA]
    76: Drive (Slot/Connector): [NA]
    77: Drive (Slot/Connector): [NA]
    78: 1x2 Drive (Slot/Connector): [NA]
    79: Secondary (Module/Board): [NA]
    80: ECC Corr Err (Memory): [Unknown]
    81: ECC Uncorr Err (Memory): [Unknown]
    82: I/O Channel Chk (Critical Interrupt): [Unknown]
    83: PCI Parity Err (Critical Interrupt): [Unknown]
    84: PCI System Err (Critical Interrupt): [Unknown]
    85: SBE Log Disabled (Event Logging Disabled): [Unknown]
    86: Logging Disabled (Event Logging Disabled): [Unknown]
    87: Unknown (System Event): [Unknown]
    88: CPU Protocol Err (Processor): [Unknown]
    89: CPU Bus PERR (Processor): [Unknown]
    90: CPU Init Err (Processor): [Unknown]
    91: CPU Machine Chk (Processor): [Unknown]
    92: Memory Spared (Memory): [Unknown]
    93: Memory Mirrored (Memory): [Unknown]
    94: Memory RAID (Memory): [Unknown]
    95: Memory Added (Memory): [Unknown]
    96: Memory Removed (Memory): [Unknown]
    97: PCIE Fatal Err (Critical Interrupt): [Unknown]
    98: Chipset Err (Critical Interrupt): [Unknown]
    99: Err Reg Pointer (OEM Reserved): [Unknown]

    Note the part about the fans (d’oh). Record the sensor numbers, fan names and thresholds (the values in parentheses, e.g. 1575.00/NA). You’ll need them later to identify your system.
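To avoid copying the fan rows by hand, the listing can be filtered with a few lines of Python 3. This is only a sketch: it assumes the output format shown above (other FreeIPMI versions may print a different layout), and `parse_fans` and `SAMPLE` are names made up for this illustration.

```python
import re

# Matches rows such as "40: FAN 3 RPM (Fan): 4875.00 RPM (2025.00/NA): [OK]".
# Rows without a threshold pair, like "Fan Redundancy", do not match.
FAN_LINE = re.compile(
    r"^\s*(?P<id>\d+): (?P<name>.+?) \(Fan\): "
    r"(?P<reading>\S+)(?: RPM)? "
    r"\((?P<lower>[^/]+)/(?P<upper>[^)]+)\): \[(?P<status>[^\]]+)\]"
)


def parse_fans(text):
    """Return one dict per fan sensor row found in ipmi-sensors style output."""
    fans = []
    for line in text.splitlines():
        match = FAN_LINE.match(line)
        if match:
            fans.append({
                "id": int(match.group("id")),
                "name": match.group("name"),
                "lower": match.group("lower"),
                "status": match.group("status"),
            })
    return fans


# A few rows copied from the listing above.
SAMPLE = """\
    32: FAN 1 RPM (Fan): NA (1575.00/NA): [NA]
    40: FAN 3 RPM (Fan): 4875.00 RPM (2025.00/NA): [OK]
    42: FAN 5 RPM (Fan): 1800.00 RPM (900.00/NA): [OK]
    56: Fan Redundancy (Fan): [Fully Redundant]
"""

for fan in parse_fans(SAMPLE):
    print(fan["id"], fan["name"], fan["lower"], fan["status"])
```

On a real system you would feed it the text captured from `ipmi-sensors` instead of `SAMPLE`.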

  2. Download the latest BMC firmware
    Go to http://support.dell.com/support/downloads/ and get the latest BMC firmware for your system. Select any Linux OS; the BMC firmware should be listed under something like Embedded Server Management. On the download page, select the .BIN package. In my case the file was called BMC_FRMW_LX_R223079.BIN. Download it!

  3. Fix and extract .BIN package
    In my case the .BIN package did not work properly. I had to fix it first, and then extract it. For this, open a terminal and go to the folder you’ve downloaded the package to.

    Then execute:

    you@server$ sed -i 's/#!\/bin\/sh/#!\/bin\/bash/' BMC_FRMW_LX_R223079.BIN  # fix interpreter bug
    you@server$ chmod 755 BMC_FRMW_LX_R223079.BIN                              # make executable
    you@server$ sudo mkdir bmc_firmware                                        # create dir as root
    you@server$ sudo ./BMC_FRMW_LX_R223079.BIN --extract bmc_firmware          # yes, you have to do this as root! :(
    you@server$ cd bmc_firmware

    This should extract your firmware. Check that you have a file called extracted/payload/bmcflsh.dat. If not, game over, your system isn’t compatible. If yes, yay!

  4. Patch firmware
    Next, download the program I wrote for patching the firmware. Then, use the program on the firmware as shown below:

    you@server$ wget https://raw.github.com/arnuschky/dell-bmc-firmware/master/adjust-fan-thresholds/dell-adjust-fan-thresholds.py
    you@server$ chmod 755 dell-adjust-fan-thresholds.py
    you@server$ ./dell-adjust-fan-thresholds.py payload/bmcflsh.dat

    The program is a Python (version >= 2.6) script that first lets you choose a system from the ones available in the firmware and then adjust the fan thresholds of this system. Yes, there can be support for multiple systems in a single firmware. You recorded the fan values before? Now you know why: you have to use them to identify your system among the ones the script shows you. Just use the number of fans, their names and thresholds to identify your system. Maybe you’re lucky and the system name has already been found and is directly displayed.

    In the next step you can select fans and change their thresholds. Just remember that the resulting threshold is always a multiple of 75. Half the usual speed has proven to be a good value. I’ve never tested what happens if you set it to 0, but this would be quite stupid as you could no longer detect broken fans.

    If the program displays a code at the end and asks you to report back, please do so! That way we can identify the other systems using their code (for example, the code of a PowerEdge 2800 is “K_C”).
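The "half the usual speed, as a multiple of 75" rule of thumb from this step can be worked out beforehand. The Python 3 sketch below assumes that "usual speed" means the RPM your fans normally report, and it rounds down to stay on the safe side; `suggest_threshold` is a hypothetical helper, not part of the patch script.

```python
STEP_RPM = 75  # threshold granularity mentioned in this step


def suggest_threshold(usual_rpm, fraction=0.5, step=STEP_RPM):
    """Return the largest multiple of `step` that is not above usual_rpm * fraction."""
    target = usual_rpm * fraction
    return int(target // step) * step


# Readings taken from the sensor listing in step 1.
for rpm in (1800, 1950, 4875):
    print(rpm, "->", suggest_threshold(rpm))
```

For a fan that normally runs at 1800 RPM this gives 900, which matches the stock threshold of the slow fans in the listing above.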

  5. Flash firmware
    Finally, flash the firmware as shown below.

    Disclaimer: I am not responsible for any damage you do to your system! If you flash this firmware, you might render your PowerEdge server unusable. It might even be unrecoverable. Additionally, badly set thresholds might cause overheating.

    Additionally, use the usual caution when flashing (do not interrupt power, do not flash over a network link, do not be stupid).

    you@server$ LD_LIBRARY_PATH=./hapi/opt/dell/dup/lib:$LD_LIBRARY_PATH ./bmcfl32l -i=payload/bmcflsh.dat -f

    Cross your fingers. The flasher should accept the firmware. If it doesn’t and complains about the CRC, something went wrong. Don’t worry if the fans spin up fully and then stop during the flash; that’s normal. The system should stabilize afterwards. There is no need to reboot.

  6. Check the sensors
    Check that everything is in order:

    you@server$ ipmi-sensors

    That’s it. Enjoy your silent PowerEdge!

Trivia

Some things that I learned while messing with the firmware:

  • There can be multiple systems per firmware
  • Generally it’s quite well engineered
  • I’ve found Dell’s default password root/calvin. What is the 444444 for?
  • Dell server systems seem to be named internally after cities. BER, LOND, OSLO etc are easy enough to guess. But what the hell is K_C??? (my system)
  • The firmware package is probably the most horrible over-engineered script I’ve ever met on Linux
  • Dell uses CRC-16 for checksum – two different algorithms in the same firmware!

Update 1: I created a project page for my server.

Update 2: I wrote this article that explains how to get a recent Ubuntu version running (without installing anything on your harddisk!). This is for all the Windows users out there!

Update 3: I moved the code of this project into a GitHub repository: http://projects.nuschkys.net/2012/04/06/how-to-get-ubuntu-live-running/ GitHub is great because people can easily collaborate, fork, submit issues and patches and so on.

Please don’t ask me basic Linux questions! Google is your friend. If you don’t know what you are doing, you shouldn’t be doing it as you might damage your server!

Python/psycopg2/PostgreSQL: script for bulk inserts using COPY with progress indicator

Woah, what a title. :) I needed a script for inserting bulk data into a PostgreSQL database. Actually, I had a script already, written in Perl, and it was so slow that I needed a better and faster replacement. As I am slowly replacing all my Bash/Perl scripts with Python counterparts, I aimed at doing the same here.

I decided to use psycopg2 as the Python-PostgreSQL binding. The copy_from method proved to be very fast; exactly what I needed. BUT I also needed a progress indicator. And while I found some people out there looking for exactly the same thing, I couldn’t find a solution. So here’s my script for doing this:

#!/usr/bin/python
import psycopg2
import sys
import os

class ReadFileProgress:

  def __init__(self, filename):
    self.datafile = open(filename)
    self.totalRecords = 0
    self.totalBytes = os.stat(filename).st_size
    self.readBytes = 0

    # skip header line
    self.datafile.readline()
    # count records
    for i, l in enumerate(self.datafile):
      pass
    self.totalRecords = i + 1
    sys.stderr.write("Number of records: %d\n" % (self.totalRecords))
    # rewind
    self.datafile.seek(0)
    # skip header line
    self.datafile.readline()
    # start progress
    self.perc5 = self.totalBytes / 20.0
    self.perc5count = 0
    self.lastPerc5 = 0
    sys.stderr.write("Writing records: 0%")

  # count bytes and display progress while doing so
  def countBytes(self, size=0):
    self.readBytes += size
    if (self.readBytes - self.lastPerc5 >= self.perc5):
      self.lastPerc5 = self.readBytes

      if (int(self.readBytes / self.perc5) == 5):
        sys.stderr.write("25%")
      elif (int(self.readBytes / self.perc5) == 10):
        sys.stderr.write("50%")
      elif (int(self.readBytes / self.perc5) == 15):
        sys.stderr.write("75%")
      else:
        sys.stderr.write(".")

      sys.stderr.flush()

  def readline(self, size=-1):
    # count the bytes actually returned, not the bytes requested
    data = self.datafile.readline(size)
    self.countBytes(len(data))
    return data

  def read(self, size=-1):
    data = self.datafile.read(size)
    self.countBytes(len(data))
    return data

  def close(self):
    sys.stderr.write("100%\n")
    self.datafile.close()

def main():
  config = dict()
  config['tablename']="tablename"
  config['filename']="filename"
  config['connstring'] = "host='?' dbname='?' user='?' password='?'"
  config['droptable'] = False
  config['createtable'] = False
  config['rowsdef'] = "id serial PRIMARY KEY, number integer NOT NULL"
  config['filecolumns'] = ['id','number']

  try:
    # get a connection, if a connect cannot be made an exception will be raised here
    conn = psycopg2.connect(config['connstring'])
    # conn.cursor will return a cursor object, you can use this cursor to perform queries
    cursor = conn.cursor()

    # drop table if requested (and it exists)
    cursor.execute("SELECT * FROM information_schema.tables WHERE table_name=%s", (config['tablename'],))
    if (config['droptable'] and bool(cursor.rowcount)):
      cursor.execute("DROP TABLE "+config['tablename']+";")

    # create the table if requested
    if (config['createtable'] and not bool(cursor.rowcount)):
      cursor.execute("CREATE TABLE "+config['tablename']+" ("+config['rowsdef']+");")

    # create a fileprogress object and copy the data to the database
    datafile=ReadFileProgress(config['filename'])
    cursor.copy_from(file=datafile, table=config['tablename'], sep='\t', null='\\N', size=8192, columns=config['filecolumns'])
    datafile.close()

    # close the cursor and commit
    cursor.close()
    conn.commit()

    sys.stdout.write("Transaction finished successfully.\n")

  except Exception:
    # any failure ends up here: connecting, creating the table or copying the data
    exceptionType, exceptionValue, exceptionTraceback = sys.exc_info()
    sys.exit("Database operation failed!\n ->%s" % (exceptionValue))


if __name__ == "__main__":
  sys.exit(main())

Basically, the progress counter is a wrapper around the file object. It simply outputs the percentage of bytes read. The main program is very straightforward. The script was written with experiment-specific config file parsing, which I do not include here. Thus, you have to find your own way to set the config variables at the beginning of the main program.
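One way to fill in the config variables, sketched in Python 3 under the assumption that a small JSON file is acceptable: it is not the config parsing the script was written with. `load_config`, the table name `measurements` and the file name `data.tsv` are made up for this illustration; the keys and defaults mirror the `config` dict in `main()`.

```python
import json

# Defaults copied from the config dict in main() above.
DEFAULTS = {
    "tablename": "tablename",
    "filename": "filename",
    "connstring": "host='?' dbname='?' user='?' password='?'",
    "droptable": False,
    "createtable": False,
    "rowsdef": "id serial PRIMARY KEY, number integer NOT NULL",
    "filecolumns": ["id", "number"],
}


def load_config(text, defaults=DEFAULTS):
    """Merge a JSON object over the defaults and reject unknown keys."""
    overrides = json.loads(text)
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError("unknown config keys: %s" % ", ".join(sorted(unknown)))
    config = dict(defaults)
    config.update(overrides)
    return config


# In practice the text would come from a file, e.g. open("bulk.json").read().
example = '{"tablename": "measurements", "filename": "data.tsv", "createtable": true}'
config = load_config(example)
print(config["tablename"], config["filename"], config["createtable"])
```

Rejecting unknown keys catches typos such as `"tablname"` before anything touches the database.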

SATA cable hack for the XBox 360

This guide describes an easy way to make your XBox360 DVD drive accessible from the outside, so that you can reflash it whenever you want without going through the pain of disassembling your XBox each time. Ah, and you’ll lose your warranty.

I guess others have done this before, but I haven’t found anything on the web, so I figured I’d describe how to do it.

You need:

  • tools/knowledge to open your XBox360 – I assume that you can do that, if not, search on the web
  • 1x SATA cable 90° angle, ~20cm
  • 1x SATA extension cable, ~20cm
  • tools to cut metal (e.g., a Dremel)
  • 1 zip-tie
Step

Disconnect the existing SATA cable from the DVD and the mainboard. Remove the DVD drive. It might be hard to find a longer cable with the proper angled connector, so here’s a closeup of it. It took me 3 tries to get the right one, so here are the DeLOCK model numbers:

Step

Search for a good position to cut a hole into the casing and mark it. Protect the rest of your XBox with paper that you tape into position. In the end, only the soon-to-be-hole should be visible; all the rest should be under a thick layer of paper.

Step

Take your Dremel and cut a hole into the XBox as marked, taking care not to damage any of the components. Make the hole big enough that the two SATA cables can fit through.

Or, in my case: take your el-cheapo Dremel, start it up, notice the smell, realize the thing is going into meltdown, curse, burn yourself, unplug the thing, wait for the smoke to clear, open it up, yep, it actually melted, order a new one, be too impatient to wait, take out the drill and pipe wrench, and go ahead and mistreat your XBox.

Step

Carefully remove all the metal dust with a vacuum cleaner; remove the paper. End up with a nice and clean hole. Or, in my case, the worst executed casemod ever.

Step

Connect the angled SATA connector to the mainboard (just under the DVD drive, see first picture). Then, fiddle the cable through the hole. Fiddle the SATA extension cable through the hole as well, with the extension connector on the outside (d’oh). Plug the SATA connector into the DVD drive and put the drive back into your XBox. Connect the two SATA cables on the outside. It should now look like the image above.

Step

Attach the SATA cables to the case using a zip-tie. This prevents you from pulling the cable off the DVD drive when messing with the connectors on the outside. Use a strong wirecutter to cut a hole in the outer plastic casing. Reassemble your XBox.

You’re done! If you want to flash your drive, just connect a normal SATA cable from the extension plug to your PC (remember to have your XBox switched on and the video connector plugged in!)

How-to repair a pizza oven

Our pizza oven broke yesterday. Well, it broke the last time we used it, but we didn’t realize until last night (when we had the pizzas already prepared). I started to repair the oven right away last night (it’s actually quite simple and cheap to do), and as our guests were quite interested in the process I figured I’d post the description here.

Pizza oven "Gala Pizza Pronto"

Oven description: it’s a very cheap “Gala Pizza Pronto” oven with a stone bed. It’s genius for making pizzas yourself; the pizzas turn out much like a “real” pizza (neither dry nor overly soft as in a normal oven, and with a proper bottom).

Note for German speakers: This post is about repairing a pizza oven. I describe the repair in English. If you have any questions, I am of course happy to answer them in German as well!

Note for French speakers: This is a description of how to repair a pizza oven. The description below is in English. Nevertheless, I am happy to answer any questions that come up in French!

Step

Open the casing

Cheap bitset with lots of varieties of bits

Most suppliers use special screws for safety (to keep people like us from messing with the interior). While we’re at it: DISCONNECT THE POWER CORD before opening the casing. If you don’t, well, that might earn you a Darwin award, but I felt obliged to put this warning here anyway. In my case, the oven had tri-wing screws (see picture).

For repair work, two things are really important to have:

  1. a multimeter (I assume that you know how to use one; if not, check for a guide on the internet, e.g., at Sparkfun)
  2. a bit set with many different types (see picture; mine cost 10 bucks and has about 8 different bit types)
Step

Trace circuit

Try to trace the circuit of your oven. The circuit of the cheap ovens (like mine) is very simple: it consists of a temperature regulator (marked “poti” in the drawing), the 2 heating elements and a thermal fuse. See high-tech drawing above. Pizza goes between the two heating elements (I forgot to draw that).

Follow all the cables and check that they are not broken, disconnected or suffering from similar (simple) problems. Check that you have continuity between the wall plug and the oven’s interior; somebody might have broken the cord (e.g., by tripping over it).

All ok? I thought so. Next step!

Step

Check the heating elements

Next, we verify that the heating elements didn’t burn out. Heating elements are super simple: they have a certain (small) resistance and can sustain high currents without burning. This means you can apply some voltage to them and they heat up. Thus, we can check them: if they have an infinite or zero resistance, they are broken. In that case you can trash the oven (unless you manage to find a replacement, of course). If they work, they should show a small resistance as shown in the picture (26 Ohms sounds about right).
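As a rough plausibility check, a measured resistance can be converted into the power the element would draw, using P = V²/R. The Python 3 sketch below makes two assumptions that are not stated in this post: a mains voltage of 230 V, and the wiring of the elements (it shows both a single 26 Ohm element across the mains and two of them in series). The cold resistance you measure is also somewhat lower than the resistance at operating temperature, so compare the result only loosely with the rating printed on your oven.

```python
MAINS_VOLTAGE = 230.0  # assumption: European mains; use 120.0 in North America


def element_power(resistance_ohm, voltage=MAINS_VOLTAGE):
    """Power in watts dissipated by a purely resistive element: P = V^2 / R."""
    if resistance_ohm <= 0:
        raise ValueError("a working element has a finite, non-zero resistance")
    return voltage ** 2 / resistance_ohm


# One 26 Ohm element directly across the mains.
print(round(element_power(26.0)), "W")
# Two 26 Ohm elements in series (52 Ohm in total).
print(round(element_power(2 * 26.0)), "W")
```

If the number comes out wildly different from the rated power in either case, double-check the measurement before ordering parts.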

Step

Check the fuse

If the heating elements are ok, it’s time to check the fuse. It’s usually attached close to the inner casing of the oven, and hidden in a layer of non-conductive, heat-resistant tubing. The picture shows what it looks like. Test the fuse; if there’s no continuity, it’s broken.

Step

Replace the fuse

To replace the fuse, read the model number of the fuse. In my case, it said:

MICROTEMP
STGAHP
G4A00
TF 240C

Checking on the web, I’ve found the producer and the datasheet. With the datasheet, we can decipher the model number: It’s a 10A, 250V fuse for 240° Celsius.

These are quite common, I’ve found this part at the local electronics store for 2 bucks. Replace the fuse and voila, your oven should work again!

A note on replacing the fuse: as you can’t solder or use anything with plastic, the connections in such an oven are usually crimped. If you can’t manage to open the existing crimp connector or don’t have a replacement available, you can use the metal part from a screw terminal instead (see picture). Just remember to strip off the plastic!

Long time no see

I have been away for a while as I attended a Permaculture Design Course (PDC). For those among you that do not know, Permaculture is basically the art of hacking natural ecosystems in order to make them run without our constant intervention while still being productive.

Although many talks were related to gardening and agriculture, a substantial amount of time was devoted to sustainable construction: talks about alternative energy, eco-construction, water treatment and similar topics. If I ever get the time, I will post some of the things I saw/tinkered with.

For now, check out Hack a Day’s latest feature run, sustainable hacks!

The battle against the BMC – Part 2

Update 4/11/2011: I managed to find out some more about the packaging scheme Dell uses for its BMC firmware files, and I deciphered most of the container format. I am in the process of testing modifications right now; for the moment, I have replaced the tool below with a new version. You can also download the program directly here: dell-extract-bmc-firmware.tar.gz.

Firmware header

Deciphering the firmware header.

As mentioned earlier, I started to look into hacking the BMC firmware in order to solve my problem of the hard-coded failure thresholds of my PowerEdge 2800.

I had a look into the firmware flash file, and noticed that it seems to consist of several files (as usual for BIOS/firmwares). As this might increase my chances of not bricking my BMC, I decided to separate the individual files for starters. I couldn’t find a program that does that (the firmware tools of the Dell Linux community are closed-source, unfortunately), so I grabbed a hex editor and deciphered (more or less) the firmware’s header. Here’s the corresponding C program:

// vim: ts=4 ai noexpandtab nopaste
/**
 * This program can extract and check the different files contained in a firmware file
 * for a Dell PowerEdge BMC.
 */

#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>

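// Both structs mirror the on-disk layout byte for byte, so they have to be
// packed (GCC/Clang syntax); otherwise the compiler pads the wider fields.
typedef struct __attribute__((packed))
{
    uint8_t     hex02;
    uint8_t     numBlocks;  // number of subfiles in system
    uint32_t    filesize;
    uint16_t    zero;
    char        dellHeaderStr[9];
} header_t;
// 1+1+4+2+9=17

typedef struct __attribute__((packed))
{
    uint8_t     zero1;
    uint8_t     type;       // 0x0b -> SD_${system}.FLC
    uint8_t     zero2;
    uint8_t     system;     // 0, 1, 2
    uint8_t     zeros[3];
    uint16_t    unknownFixedData;
    uint16_t    crc16;
    uint32_t    length;
    uint32_t    offset;
    char        filename[32];
} flc_block_t;
// 4x1+3x1+2+2+4+4+32=51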

uint16_t endian_swap16(uint16_t x)
{
    return (x>>8) |
           (x<<8);
}

uint32_t endian_swap32(uint32_t x)
{
    return (x>>24) |
            ((x<<8) & 0x00FF0000) |
            ((x>>8) & 0x0000FF00) |
            (x<<24);
}

/** CRC table for the CRC-16. The poly is 0x8005 (x^16 + x^15 + x^2 + 1) */
uint16_t const crc16_table[256] = {
        0x0000, 0xC0C1, 0xC181, 0x0140, 0xC301, 0x03C0, 0x0280, 0xC241,
        0xC601, 0x06C0, 0x0780, 0xC741, 0x0500, 0xC5C1, 0xC481, 0x0440,
        0xCC01, 0x0CC0, 0x0D80, 0xCD41, 0x0F00, 0xCFC1, 0xCE81, 0x0E40,
        0x0A00, 0xCAC1, 0xCB81, 0x0B40, 0xC901, 0x09C0, 0x0880, 0xC841,
        0xD801, 0x18C0, 0x1980, 0xD941, 0x1B00, 0xDBC1, 0xDA81, 0x1A40,
        0x1E00, 0xDEC1, 0xDF81, 0x1F40, 0xDD01, 0x1DC0, 0x1C80, 0xDC41,
        0x1400, 0xD4C1, 0xD581, 0x1540, 0xD701, 0x17C0, 0x1680, 0xD641,
        0xD201, 0x12C0, 0x1380, 0xD341, 0x1100, 0xD1C1, 0xD081, 0x1040,
        0xF001, 0x30C0, 0x3180, 0xF141, 0x3300, 0xF3C1, 0xF281, 0x3240,
        0x3600, 0xF6C1, 0xF781, 0x3740, 0xF501, 0x35C0, 0x3480, 0xF441,
        0x3C00, 0xFCC1, 0xFD81, 0x3D40, 0xFF01, 0x3FC0, 0x3E80, 0xFE41,
        0xFA01, 0x3AC0, 0x3B80, 0xFB41, 0x3900, 0xF9C1, 0xF881, 0x3840,
        0x2800, 0xE8C1, 0xE981, 0x2940, 0xEB01, 0x2BC0, 0x2A80, 0xEA41,
        0xEE01, 0x2EC0, 0x2F80, 0xEF41, 0x2D00, 0xEDC1, 0xEC81, 0x2C40,
        0xE401, 0x24C0, 0x2580, 0xE541, 0x2700, 0xE7C1, 0xE681, 0x2640,
        0x2200, 0xE2C1, 0xE381, 0x2340, 0xE101, 0x21C0, 0x2080, 0xE041,
        0xA001, 0x60C0, 0x6180, 0xA141, 0x6300, 0xA3C1, 0xA281, 0x6240,
        0x6600, 0xA6C1, 0xA781, 0x6740, 0xA501, 0x65C0, 0x6480, 0xA441,
        0x6C00, 0xACC1, 0xAD81, 0x6D40, 0xAF01, 0x6FC0, 0x6E80, 0xAE41,
        0xAA01, 0x6AC0, 0x6B80, 0xAB41, 0x6900, 0xA9C1, 0xA881, 0x6840,
        0x7800, 0xB8C1, 0xB981, 0x7940, 0xBB01, 0x7BC0, 0x7A80, 0xBA41,
        0xBE01, 0x7EC0, 0x7F80, 0xBF41, 0x7D00, 0xBDC1, 0xBC81, 0x7C40,
        0xB401, 0x74C0, 0x7580, 0xB541, 0x7700, 0xB7C1, 0xB681, 0x7640,
        0x7200, 0xB2C1, 0xB381, 0x7340, 0xB101, 0x71C0, 0x7080, 0xB041,
        0x5000, 0x90C1, 0x9181, 0x5140, 0x9301, 0x53C0, 0x5280, 0x9241,
        0x9601, 0x56C0, 0x5780, 0x9741, 0x5500, 0x95C1, 0x9481, 0x5440,
        0x9C01, 0x5CC0, 0x5D80, 0x9D41, 0x5F00, 0x9FC1, 0x9E81, 0x5E40,
        0x5A00, 0x9AC1, 0x9B81, 0x5B40, 0x9901, 0x59C0, 0x5880, 0x9841,
        0x8801, 0x48C0, 0x4980, 0x8941, 0x4B00, 0x8BC1, 0x8A81, 0x4A40,
        0x4E00, 0x8EC1, 0x8F81, 0x4F40, 0x8D01, 0x4DC0, 0x4C80, 0x8C41,
        0x4400, 0x84C1, 0x8581, 0x4540, 0x8701, 0x47C0, 0x4680, 0x8641,
        0x8201, 0x42C0, 0x4380, 0x8341, 0x4100, 0x81C1, 0x8081, 0x4040
};

static inline uint16_t crc16_byte(uint16_t crc, const uint8_t data)
{
    return (crc >> 8) ^ crc16_table[(crc ^ data) & 0xff];
}

uint16_t calccrc16(uint8_t const *buffer, size_t len)
{
    uint16_t crc = 0x0000;

    while (len--)
        crc = crc16_byte(crc, *buffer++);
    return crc;
}

int main(int argc, char *argv[])
{
    if (argc != 2)
    {
        fprintf(stderr, "Usage: %s <firmware>\n", argv[0]);
        exit(1);
    }
    FILE* flashFile = fopen(argv[1], "rb");
    if (flashFile == NULL)
    {
        fprintf(stderr, "Error: Can't open %s.\n", argv[1]);
        exit(1);
    }

    // get filesize
    fseek(flashFile, 0, SEEK_END);
    uint32_t filesize = ftell(flashFile);
    fseek(flashFile, 0, SEEK_SET);

    // read the header
    header_t header;
    if (fread(&header, sizeof(header_t), 1, flashFile) == 0)
    {
        fprintf(stderr, "Error: Can't read header.\n");
        exit(1);
    }

    // check that it's a valid header as far as we know
    if (header.hex02 != 0x02 ||
        header.zero != 0 ||
        filesize != header.filesize ||
        strncmp(header.dellHeaderStr, "DELL_INC", 8) != 0)
    {
        fprintf(stderr, "Error: Header not valid.\n");
        exit(1);
    }

    // calculate header crc
    fseek(flashFile, 0, 0);
    uint16_t totalHeaderSize = sizeof(flc_block_t) * header.numBlocks + sizeof(header_t);
    uint8_t headerBuf[totalHeaderSize];
    fread(&headerBuf, totalHeaderSize, 1, flashFile);
    uint16_t headerCRC16 = calccrc16(headerBuf, totalHeaderSize);

    // calculate total file crc
    fseek(flashFile, 0, 0);
    uint8_t fileBuf[header.filesize-2];
    fread(&fileBuf, header.filesize-2, 1, flashFile);
    uint16_t fileCRC16 = calccrc16(fileBuf, header.filesize-2);
    uint16_t fileCRC16Dell;
    fread(&fileCRC16Dell, 2, 1, flashFile);
    printf("\n\n");
    printf("Valid Dell PowerEdge BMC firmware header found:\n\n");
    printf("  - number of blocks : %d\n",   header.numBlocks);
    printf("  - oemstr (fixed)   : %s\n",   header.dellHeaderStr);
    printf("  - total file size  : %d\n",   header.filesize);
    printf("  - total header size: %d\n",   totalHeaderSize);
    printf("  - header CRC16     : 0x%04x\n",   headerCRC16);
    printf("  - total file CRC16 : 0x%04x\n\n", fileCRC16);
    if (fileCRC16 == fileCRC16Dell)
       printf("  * CRC16 check OK\n");
    else
       printf("  * CRC16 check FAILED, actual CRC16 is 0x%04x instead of 0x%04x\n", fileCRC16, fileCRC16Dell);

    printf("\n\n");

    // read all blocks
    fseek(flashFile, sizeof(header_t), 0);
    flc_block_t flcBlock[header.numBlocks];
    fread(&flcBlock, sizeof(flc_block_t), header.numBlocks, flashFile);

    uint8_t i;
    for (i = 0; i < header.numBlocks; i++)
    {
        // check if our understanding of format is correct
        if (flcBlock[i].zero1 != 0 || flcBlock[i].zero2 != 0 || flcBlock[i].zeros[0] != 0 ||
            flcBlock[i].zeros[1] != 0 || flcBlock[i].zeros[2] != 0)
        {
            fprintf(stderr, "Error: Block %d not valid.\n", i);
            exit(1);
        }

        printf("Block %d:\n\n", i);
        printf("  - type     : %d/0x%02x (defines block type, 0x0b is sensor data table)\n", flcBlock[i].type, flcBlock[i].type);
        printf("  - system # : %d/0x%02x (running number for systems in this firmware file)\n", flcBlock[i].system, flcBlock[i].system);
        printf("  - unknown  : %d/0x%04x (always same for all blocks in a single firmware file)\n", flcBlock[i].unknownFixedData, flcBlock[i].unknownFixedData);
        printf("  - offset   : %d\n", flcBlock[i].offset);
        printf("  - length   : %d\n", flcBlock[i].length);
        printf("  - filename : %s\n\n", flcBlock[i].filename);

        // extract the block according to the offset and length given in the block desc.
        printf("  * extracting block...");
        uint8_t* blockData = malloc(flcBlock[i].length);

        fseek(flashFile, flcBlock[i].offset, SEEK_SET);
        fread(blockData, flcBlock[i].length, 1, flashFile);

        // write the block in binary mode and checksum it
        FILE* blockFile = fopen(flcBlock[i].filename, "wb");
        fwrite(blockData, flcBlock[i].length, 1, blockFile);
        uint16_t blockCRC16 = calccrc16(blockData, flcBlock[i].length);
        fclose(blockFile);
        free(blockData);
        printf("done.\n");

        if (blockCRC16 == flcBlock[i].crc16)
          printf("  * CRC16 check OK\n");
        else
          printf("  * CRC16 check FAILED, actual CRC16 is 0x%04x instead of 0x%04x\n", blockCRC16, flcBlock[i].crc16);

        printf("\n\n");
    }

    fclose(flashFile);
    exit(0);
}

The names of the individual files are listed below. They are organized in blocks (that’s what I call them), and apparently by function. Get the latest BMC firmware (30/6/2009, v1.83, A10) and apply my program to retrieve the individual files.

  • block 0 (code, big files):
    • BB.FLC
    • OB.FLC
    • ID.FLC
    • OEM_DEF.FLC
  • block 1 (*_BB files):
    • SD_BB.FLC
    • FI_BB.FLC
    • TOC_BB.FLC
    • IO_BB.FLC
    • IS_BB.FLC
    • OEM_BB.FLC
  • block 2 (*_K_C files):
    • SD_K_C.FLC
    • FI_K_C.FLC
    • TOC_K_C.FLC
    • IO_K_C.FLC
    • IS_K_C.FLC
    • OEM_K_C.FLC

The BMC seems to be little-endian (which only makes sense, I guess). I’ve scanned the different files for occurrences of the threshold values (900 and 2025, i.e. 0x0384 and 0x07e9 as integers, and 0x6144000 as float). To no avail. Darn. Either I am doing something wrong or the thresholds are not hard-coded in the firmware (I had my hopes up when I saw the OEM_DEF.FLC file, which actually contains the default BMC password and the like). Maybe the thresholds are stored in the configuration flash after all – only how can we access it?
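A search like this can be sketched in a few lines of C. This is only an illustration, not the original tooling: the helper names find_bytes and count_u16_le are made up for this example, and it assumes the thresholds would be stored as plain 16-bit little-endian integers (low byte first), which may well not be how the firmware encodes them.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the offset of the first occurrence of
 * `needle` in `hay` at or after `start`, or -1 if there is none. */
long find_bytes(const uint8_t *hay, size_t hayLen,
                const uint8_t *needle, size_t needleLen, size_t start)
{
    if (needleLen == 0 || hayLen < needleLen)
        return -1;

    for (size_t i = start; i + needleLen <= hayLen; i++)
    {
        if (memcmp(hay + i, needle, needleLen) == 0)
            return (long) i;
    }
    return -1;
}

/* Hypothetical helper: counts how often `value` appears in `data`
 * when encoded as a 16-bit little-endian integer (assumed encoding). */
size_t count_u16_le(const uint8_t *data, size_t len, uint16_t value)
{
    const uint8_t pattern[2] = { (uint8_t) (value & 0xff), (uint8_t) (value >> 8) };
    size_t hits = 0;

    long pos = find_bytes(data, len, pattern, sizeof(pattern), 0);
    while (pos >= 0)
    {
        hits++;
        /* continue one byte further so overlapping matches are counted too */
        pos = find_bytes(data, len, pattern, sizeof(pattern), (size_t) pos + 1);
    }
    return hits;
}
```

To use it, read one of the extracted .FLC files into a buffer and call count_u16_le(buffer, length, 900) and count_u16_le(buffer, length, 2025). Keep in mind that a two-byte pattern will produce plenty of false positives in a large code file, so a hit is only a hint.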

Update

I finally managed to adjust the critical fan thresholds by patching the BMC firmware! Here’s the howto. Additionally, I created a project page for my server.

The battle against the BMC – Part 1

Earlier, I wrote about the problem of the noisy fans in my Dell PowerEdge 2800. Since then, I investigated a bit more. Just as a reminder: I can’t run silent fans because they have a lower RPM than a hardcoded panic-threshold of the PowerEdge. *grrr*

Brainstorm:

  1. make the fans faster/buy faster fans
  2. make the system hotter
  3. hack the fans into reporting more RPM than they actually do
    A. hack the fans themselves
    B. alter their tacho signal
  4. hack the BMC
  5. find out how the OEM sets these thresholds

Well, as I mentioned earlier, 1. and 2. are unsatisfactory for obvious reasons.

A couple of fans taken apart. Notice the blob of brown paint on the ring magnet of the fan on the right.

Concerning 3A, I took a couple of fans apart to look at how they create the tacho signal. Almost all fans I opened (luckily I have a whole stack of noisy, throw-away fans lying around) have a sensor sitting just under the ring magnet, which is part of the rotor, as can be seen in the photo on the right. I had no idea what this sensor might be, but I noticed one or more blobs of brownish-red paint on all of the fans. I figured that this paint might be used to create a signal for the sensor – and I did some tests with other paint in order to replicate the effect (left fan in the photo, with magnetic paint applied). Well, nice idea, but total bullshit as it turns out. The sensor is a Hall sensor that senses the change in the magnetic field of the ring magnet, so its output inevitably changes 2 times per revolution. I figure that the paint applied to the rotor is used for calibrating the fans… Well, it was a nice idea.
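To put numbers on the tacho signal, here is a small sketch of the arithmetic. It assumes exactly two tacho pulses per revolution, in line with the two changes per revolution seen by the Hall sensor; the helper names tacho_rpm and tacho_hz are made up for this example, and real fans may deviate from this.

```c
#include <stdint.h>

/* Assumption: two tacho pulses per revolution. */
#define PULSES_PER_REV 2

/* Hypothetical helper: RPM computed from the number of pulses counted
 * within a measuring window of windowMs milliseconds. */
uint32_t tacho_rpm(uint32_t pulses, uint32_t windowMs)
{
    if (windowMs == 0)
        return 0;

    /* pulses / PULSES_PER_REV gives revolutions, 60000 / windowMs scales to one minute */
    return (uint32_t) (((uint64_t) pulses * 60000u) /
                       ((uint64_t) windowMs * PULSES_PER_REV));
}

/* Hypothetical helper: tacho pulse frequency in Hz (rounded down)
 * that a fan running at the given RPM produces. */
uint32_t tacho_hz(uint32_t rpm)
{
    return rpm * PULSES_PER_REV / 60;
}
```

Under this assumption, a fan sitting right at a 900 RPM threshold delivers 30 pulses per second, so anything that fakes a higher RPM would have to feed the BMC a tacho signal faster than that.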

3B might be an option, but it would require either a microcontroller or some analogue circuit – not really what I want to fiddle into the fan trays of 6 fans.

Concerning option 5, I thought that there may be hidden IPMI OEM commands for configuring the thresholds. I dug around the Dell OEM extensions for ipmitool (can be retrieved from the Dell Linux Community Repositories). This code officially earned worst code of the year – I completely understand why the ipmitool maintainers flat-out refuse to integrate that piece of crap. It’s a hacked-up collection of extensions, seemingly done on the fly to fix customer problems. Horrible. Even more so as it does not seem to be able to set the thresholds either. After a few hours of digging in the code I managed to query the BMC sensors with Dell’s OEM commands, and the returned capability flags do indicate that the thresholds cannot be changed. Darn.
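For reference, this is roughly how such a capability flag can be decoded. The sketch assumes the standard "Sensor Capabilities" byte of a full sensor record as defined in the IPMI specification, where bits 3:2 describe the threshold access support. Whether Dell's OEM commands return exactly this byte is an assumption, and the type and function names are made up for this example.

```c
#include <stdint.h>

/* Threshold access support, bits 3:2 of the sensor capabilities byte
 * (standard IPMI full sensor record layout, assumed to apply here). */
typedef enum
{
    THR_NONE     = 0,   /* sensor has no thresholds */
    THR_READABLE = 1,   /* thresholds can be read, but not set */
    THR_SETTABLE = 2,   /* thresholds can be read and set */
    THR_FIXED    = 3    /* fixed, unreadable thresholds */
} threshold_access_t;

/* Hypothetical helper: extracts the threshold access field. */
threshold_access_t threshold_access(uint8_t sensorCapabilities)
{
    return (threshold_access_t) ((sensorCapabilities >> 2) & 0x03);
}

/* Hypothetical helper: human-readable name for the field value. */
const char* threshold_access_name(threshold_access_t access)
{
    switch (access)
    {
        case THR_NONE:     return "no thresholds";
        case THR_READABLE: return "readable only";
        case THR_SETTABLE: return "readable and settable";
        default:           return "fixed, unreadable";
    }
}
```

A sensor whose thresholds could be adjusted with a plain IPMI command would have to report THR_SETTABLE here; any other value means the BMC refuses to change them.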

Now I am back to hacking the BMC firmware – but that’s for another post…

Update

I finally managed to adjust the critical fan thresholds by patching the BMC firmware! Here’s the howto. Additionally, I created a project page for my server.

Latex+Beamer+PDF+embedded movies

Finally I found a way to embed videos into LaTeX-generated PDFs in a platform-independent way!

Download and install flashmovie.sty:

wget http://mirror.ctan.org/macros/latex/contrib/flashmovie.zip
unzip flashmovie.zip
cp flashmovie/flashmovie.sty $YOUR_TEX_DOCUMENT_DIR

Additionally, you need a flash-based video player. Free option:

cp flashmovie/flashmovie/player_flv_maxi.swf $YOUR_TEX_DOCUMENT_DIR

I preferred the Longtail Video Player which is free for personal use:

wget http://www.longtailvideo.com/jw/upload/mediaplayer.zip
unzip mediaplayer.zip
cp mediaplayer-5.7/player.swf $YOUR_TEX_DOCUMENT_DIR

This is the mencoder command I use for converting my videos:

mencoder -nosound -forceidx -of lavf -ovc lavc -lavcopts vcodec=flv:vbitrate=2500:mbd=2:mv0:trell:v4mv:cbp:last_pred=3 -o video.flv video.avi

And finally, a bare example showing how to create an embedded, full-screen movie in a presentation using the beamer package:

\RequirePackage{flashmovie}
\documentclass[utf8x]{beamer}
\usepackage[absolute,overlay]{textpos}
\setlength{\TPHorizModule}{1mm}
\setlength{\TPVertModule}{1mm}

\begin{document}
  \begin{frame}[plain]
    \begin{textblock}{12.8}(0,0)
      \flashmovie[auto=1,loop=1,controlbar=0,engine=jw-player,width=12.8cm,height=9.6cm]{taros_talk.flv}
    \end{textblock}
  \end{frame}
\end{document}

The options are pretty much self-explanatory. Ah yes, the whole thing needs an Adobe Reader >=9.0.

Update: The latest version of Adobe Reader for Linux (9.4-2) gives a “3D parsing error” upon opening a page with an embedded flash video. Downgrade to 9.4-1 (Ubuntu package name is acroread_9.4-1, you can get it here) and everything’s fine.