Checking Node CPU and Memory Speed

How to use a live Ubuntu USB and sysbench to measure memory bandwidth on a node, plus the firmware and BIOS fixes for known underperforming configurations.

A node that is running but failing block-maker checks is sometimes suffering from a hardware-level performance problem — a buggy firmware revision, a marginal power supply, or a power-redundancy configuration that throttles the CPU. The fastest way to confirm or rule that out is to boot a live Ubuntu USB on the affected machine and run a memory-bandwidth benchmark.

[!WARNING] Use a live Ubuntu USB image — do not run the installer and do not let it touch the disks. Wiping the IC-OS install would force a full redeployment.

Run the test

  1. Power the node off through the BMC (see Powering down via the BMC below).

  2. Insert a live Ubuntu USB stick and boot from it — choose Try Ubuntu, not Install.

  3. Open a terminal and install sysbench:

    sudo apt update
    sudo apt install sysbench
    
  4. Run the memory benchmark:

    sysbench memory run
    
  5. Find the transferred line near the end of the output; the figure in parentheses is the throughput. Healthy nodes should report at least 5.6 GB/s of memory throughput.

A result well below that threshold — say, 2.6 GB/s — points at a hardware or firmware problem rather than software.
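
The pass/fail check can be scripted so the result does not have to be read by eye. The following is a minimal sketch, assuming sysbench 1.0.x, which prints throughput on a line of the form `N MiB transferred (X MiB/sec)`. The function names are hypothetical, and the default cutoff of 5600 MiB/sec is only a rough stand-in for the 5.6 GB/s figure above, since sysbench reports MiB/sec rather than GB/s.

```shell
# Hypothetical helpers; the names and the default cutoff are assumptions
# made for illustration, not part of sysbench or of any node tooling.

# Read sysbench memory output on stdin and print the MiB/sec figure taken
# from the "... MiB transferred (X MiB/sec)" line.
extract_mib_per_sec() {
    sed -n 's/.*MiB transferred (\([0-9.]*\) MiB\/sec).*/\1/p' | head -n 1
}

# Return success if the throughput in $1 is at or above the cutoff in $2
# (default 5600 MiB/sec, an approximation of the 5.6 GB/s threshold).
meets_threshold() {
    awk -v got="$1" -v min="${2:-5600}" 'BEGIN { exit !(got + 0 >= min + 0) }'
}
```

A possible use from the live session: `rate=$(sysbench memory run | extract_mib_per_sec); meets_threshold "$rate" || echo "below threshold: $rate MiB/sec"`.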

Known issues and fixes

Dell PowerEdge

Some Dell PowerEdge nodes have shipped with a CPLD firmware revision that caps memory bandwidth at around 2.6 GB/s. Updating the CPLD firmware restores full performance. See Updating Firmware for how to obtain and apply Dell firmware packages.

Supermicro

Two fixes have helped on Supermicro nodes:

  • Power-cycle the chassis. A clean cold start sometimes clears the throttle.
  • Disable a NUMA-related BIOS option. Enter the BIOS setup and navigate to Advanced > ACPI Settings > ACPI SRAT L3 Cache As NUMA Domain, then set it to Disabled.
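
To see whether the BIOS change took effect, the NUMA layout the OS reports can be compared before and after from the same live session. This is a sketch, assuming `lscpu` from util-linux and its `NUMA node(s):` summary line; the helper name is hypothetical. With the L3-as-NUMA option enabled, the OS would typically see more NUMA nodes than there are CPU sockets, so a lower count after the change suggests the setting was applied.

```shell
# Hypothetical helper: read lscpu output on stdin and print the NUMA node
# count from the "NUMA node(s):" line (which may be indented on newer lscpu).
numa_node_count() {
    awk -F: '/NUMA node\(s\)/ { gsub(/[[:space:]]/, "", $2); print $2 }'
}
```

Run it as `lscpu | numa_node_count`, then re-run the memory benchmark to confirm that throughput has recovered.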

Powering down via the BMC

[!TIP] When servicing a PDU or running power-side diagnostics, power the attached servers down through the BMC first. Yanking power on a live IC-OS install can leave the node in a state that needs recovery.

Use the BMC web UI or ipmitool from a separate workstation to issue a graceful shutdown.
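
For the ipmitool route, a graceful shutdown corresponds to the `chassis power soft` action, which asks the OS to shut down via ACPI. The wrapper below is a minimal sketch: the function name and the `BMC_HOST`, `BMC_USER` and `DRY_RUN` variables are placeholders invented here, and it assumes the BMC accepts IPMI over LAN (`-I lanplus`). The `-E` flag tells ipmitool to read the password from the `IPMI_PASSWORD` environment variable, which keeps it out of the command line.

```shell
# Hypothetical wrapper; BMC_HOST, BMC_USER and DRY_RUN are placeholder
# variables chosen for this sketch. Export IPMI_PASSWORD before real use.
bmc_power() {
    # $1 is an ipmitool chassis power action, e.g. status, soft, on, cycle.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of contacting the BMC.
        echo ipmitool -I lanplus -H "$BMC_HOST" -U "$BMC_USER" -E chassis power "$1"
    else
        ipmitool -I lanplus -H "$BMC_HOST" -U "$BMC_USER" -E chassis power "$1"
    fi
}
```

After `bmc_power soft`, calling `bmc_power status` shows whether the chassis has actually powered off before any power-side work begins.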