Blog

Hyundai Ioniq Electric (28 kWh) battery aging

Henry — Wed, 08 Jul 2026 05:45:35 +0000

One of the challenges of buying a used electric vehicle is knowing the health of the battery, which is usually the single most expensive component of the car. What is the typical battery degradation with age, and is this particular car any worse than typical? I know of no quick way to test it before buying, when you typically don’t have the opportunity to drive it from full to empty.

I recently bought a used 2017 Hyundai Ioniq Electric (28 kWh), and wanted to know how much battery capacity remained.

This is my attempt at measuring the battery capacity, by measuring the amount of energy delivered by the battery when driving from fully-charged to near-empty.

Methodology

Drive from 100% to a very low state of charge
Using an OBD reader, look at the “Cumulative energy charged” and “Cumulative energy discharged” numbers at the beginning and end of the drive.

Multiplying the efficiency reading on the dashboard (kWh/100 km) by the trip distance doesn’t seem particularly accurate, even when the counter is configured to reset after a recharge. I suspect the efficiency counter is some kind of weighted average (preferring more recent observations) rather than strictly an average since the last reset.

Results

	Charged (kWh)	Discharged (kWh)	Cell voltage
100% (95% BMS)	32735.2	31585.0	4.12 V
4% (5.5% BMS)	32736.4	31612.2	3.18 V – 3.24 V
Difference	1.2	27.2	–

Regenerative braking is considered charging, which increases the “cumulative energy charged” counter during the trip. This complicates the calculation somewhat because I need to subtract the extra discharge energy available due to being charged by regenerative braking to get the total energy available solely from the battery. The efficiency of the battery (lifetime energy discharged divided by lifetime energy charged) is apparently around 96.6% (31612/32736 = 0.966).

The total energy discharged from the battery minus the energy gained from recharging gives the total energy delivered by the battery on my trip: 27.2 kWh − (0.966 × 1.2 kWh) = 26.0 kWh.

If I trust the state of charge numbers (100% start, 4% end), then scaling to the full capacity of the battery (from 100% to 0%) would be 26.0 kWh / 0.96 = 27.1 kWh.

It appears my nine-year-old battery can still deliver 27.1 kWh, or 97% of the rated capacity. The car manual says the battery has a capacity of 28 kWh, but I don’t have a measurement of the actual capacity when it was new (which could have been higher).

Although the Hyundai Ioniq Electric is known to have low battery degradation, losing only 3% in 9 years seems unreasonably good. Unfortunately, without a measurement from when the battery was new, I can’t distinguish between very low degradation vs. the battery originally exceeding its rated capacity. I will have to repeat the experiment to see how it degrades in future years.

Battery stats (July 2026)

Age: ~9 years (Manufactured June 2017), 150,000 km
State of health: 96.9%
Fully charged (100% display, 95% BMS): 398V total, 4.14V per cell, no deviation (20mV resolution)
Fully discharged (0% display, 2% BMS): 295V total, 3.02V – 3.10V per cell

More Measurements

A summary of results from repeating this experiment over time.

Date	State of charge	Energy Discharged	Energy Charged	Cell Voltages (end of test)	Capacity (100%→0%)
2026-06-20	100% → 4% (5.5% BMS)	27.2 kWh	1.2 kWh	3.18 – 3.24 V	27.14 kWh
2026-07-07	99.5% → 0% (2% BMS)	31.0 kWh	4.1 kWh	3.02 – 3.10 V	27.18 kWh

After doing this twice, the experiment seems repeatable even with different driving styles. The first experiment was mostly 100 km/h over several hours. The second experiment had mainly city driving (and much more AC use) over three days. The calculated battery capacity differed by under 1%.

Tyan GT62D-B7106 + S7106GMR-CGN: Why are CPUs slower than expected?

Henry — Sun, 28 Sep 2025 01:08:16 +0000

I recently got a used 1U server:

Tyan GT62D-B7106
S7106GMR-CGN motherboard (Dual LGA 3647)

I noticed that the CPUs were running abnormally slowly, not even reaching base clock frequency when all cores are active, even when the temperature isn’t particularly high. Why?

There seems to be a limit of around 180 W for both CPUs (or 90 W per CPU if both are active), and this limit is independent of the TDP rating of the CPU. So if I have two 125 W TDP CPUs, they will only use around 90 W each and not reach base clocks. This gap is even bigger with 165 W and 205 W CPUs, which also will run at ~90 W. But if I have only one 165 W CPU installed, it will happily run at 165 W and run at the expected frequencies, which suggests that CPU power delivery (which isn’t shared between sockets) isn’t the limit.

So where is this power limit coming from?

It turns out the limit is due to the 500 W power supply that comes with the Tyan GT62D-B7106 system. The power supply communicates with the motherboard via PMBus, which (somehow) limits the system power consumption, leaving the CPUs with a rather conservative limit of 180 W. But this is a system power limit, not a per-CPU power limit.

This power limit can be bypassed by disabling PMBus in the BIOS settings. (Disabling PMBus means I also lose voltage/current/power/temperature/fan monitoring of the power supply.)

But of course, limiting the CPU power to fit the power supply does have a purpose. With PMBus turned off, there is no longer a reasonable limit on CPU power usage, so it’s very easy for the CPUs to consume enough power to reset the machine by overloading it. For example, with two Xeon 8173M CPUs with a PL2 of 363 W each, I can’t successfully boot into the OS without the system resetting itself.

Now that I know where the power limit is coming from, there are two ways to deal with this:

Replace power supply with one of higher power
Limit power consumption

On this system, I can manually set the PL1 and PL2 CPU power limits to limit power consumption. For a system with two Xeon 8173M CPUs (28-core 165 W TDP) the maximum setting that does not cause spontaneous reboots is 145W per CPU (for both PL1 and PL2). I’m giving up a little bit of sustained performance (145 W PL1 vs. 165 W) and some more burst speed (145W PL2 vs. 363 W), but this is the best I can do without replacing the power supply. This is still running the CPUs slower than the capability of the CPUs, but 145 W is a lot better than 90 W.

Shell Scripting in C++

Henry — Fri, 28 Feb 2025 09:26:14 +0000

Do you like C++?
Do you like the convenience of shell scripts where you just execute the source code without needing a separate build/compile step?
Do you like both so much that you wish you could write shell scripts in C++?

If you’re still nodding, here’s one possible way to do it:

#!/bin/bash
CXX=g++ ; CXXFLAGS="-O2"
fname=$(mktemp --tmpdir cscript.XXXX) ; exec 999<"$fname" ; rm "$fname"
sed '1c#if 0' "$0" | $CXX $CXXFLAGS -xc++ - -o /dev/fd/999 && exec -a "$0" /dev/fd/999 "$@" ; exit 1
#endif

#include <stdio.h>
int main(int argc, char *argv[])
{
    printf ("%d arguments:\n", argc);
    for (int i=0; i < argc; i++) {
        printf ("%d: %s\n", i, argv[i]);
    }
    return 42;
}

This is a shell script (Bash, given the shebang and the use of exec -a) with embedded C++ source code (starting at line 6). The shell script runs the C++ compiler on itself to produce an executable, then runs the executable. (This isn’t a new idea, though it wasn’t easy to find examples of it. Here are a few examples: 1, 2, 3. The same technique is also used to make self-extracting archives, with shell commands at the beginning to untar the data contained later in the same file.)

Line 2 sets two variables to choose the compiler and compiler flags.
Line 3 creates a temporary file, opens it as file descriptor 999 (that hopefully doesn’t collide with a file descriptor already in use), then unlinks (rm) the file. Unlinking the file here is one way to ensure the temporary file gets deleted even if compilation or execution fails. The file gets deleted once there are no more open file handles, but remains accessible by file descriptor (and /dev/fd/999 and /proc/self/fd/999) while the file is still open.
Line 4 uses sed to replace the first line of the current script with “#if 0”, which matches the “#endif” on line 5, hiding the shell code from the C++ compiler. This is then piped to g++, which writes the executable to the temporary file. The -xc++ flag is needed because the input file name does not have a .cc/.cpp extension, so we need to explicitly declare the input language as C++ source code. Then we exec the temporary file (the temporary file is accessed via /dev/fd/999 rather than file name, since we already unlinked the file).

Using sed to replace the first line has the advantage of not changing the number of lines of input (compared to just cutting off the first few lines using tail -n +6), so that any C++ compiler warning/error messages still have line numbers that match the source file.

The C++ code begins on line 6.

If we put the above into a file named test.sh and run it, we get this:

$ ./test.sh hello world
3 arguments:
0: ./test.sh
1: hello
2: world
$ echo $?
42

A C++ shell script that receives command-line arguments and produces an exit status.

Compile time overhead

One of the disadvantages of a compiled language is that the compiler itself takes a non-negligible amount of time to run. For the example above when run with no arguments, it takes about 52ms for the C++ version (~50ms compile + 2ms run), but only 10ms for an equivalent Bash version. The Bash version is below:

#!/bin/bash
# Print the argument count (including $0) and each argument, like the C++ version.
echo "$(($# + 1)) arguments:"
echo "0: $0"
I=1
for a in "$@"; do
    echo "$I: $a"
    I=$((I+1))
done
exit 42

One way to mitigate the compile time is to use a compiler cache (ccache) so that repeated execution of the same script does not need a full recompile. The above script needs to be modified for ccache, because ccache can only cache the compile step (not link), and ccache requires the compiler input to be from a file (not pipe).

#!/bin/bash
CXX="g++" ; CXXFLAGS="-O2" ; CCACHE=`which ccache 2>/dev/null`
fname=$(mktemp --tmpdir cscriptXXXX) ; exec 998<"$fname" ; rm "$fname"
fname=$(mktemp --tmpdir cscriptXXXX) ; exec 999<"$fname" ; rm "$fname"
sed '1c#if 0' "$0" > /dev/fd/998
$CCACHE $CXX $CXXFLAGS -xc++ -c /dev/fd/998 -o /dev/fd/999 && $CXX $CXXFLAGS /dev/fd/999 -o /dev/fd/998 && exec -a "$0" /dev/fd/998 "$@" ; exit 1
#endif

#include <stdio.h>
int main(int argc, char *argv[])
{
    printf ("%d arguments:\n", argc);
    for (int i=0; i < argc; i++) {
        printf ("%d: %s\n", i, argv[i]);
    }
    return 42;
}

This time, we use two temporary files (because both the compiler input and output must be files), and we use ccache for the compile phase (ccache cannot cache the link phase). This cuts the total (ccache hit + link + execution) time to around 40ms. This is still slower than the Bash version. However, I expect the savings to be greater with longer programs that take longer to compile.

For this trivial program, the C++ version outperforms the Bash version when the number of input arguments exceeds around 12000. The faster execution time of the C++ code eventually pays for its overhead.

But this script isn’t C++ anymore…

That’s true. If you tried to compile this script with a C++ compiler, it would complain that the #!/bin/bash on the first line isn’t a valid preprocessor directive. I don’t see a way to make the shebang (#!) legal C++ code, but it is possible to remove it entirely and replace it with #if 0, and hope that the shell used to execute the script is compatible. This works because #if 0 is a comment in POSIX shell script, turning this file into both legal C++ code and shell script. Removing the shebang is an option if being able to feed the script unmodified through a C++ compiler is more important to you than not being completely certain which shell executes the script.

#if 0
CXX="g++" ; CXXFLAGS="-O2" ; CCACHE=`which ccache 2>/dev/null`
fname=$(mktemp --tmpdir cscriptXXXX) ; exec 998<"$fname" ; rm "$fname"
fname=$(mktemp --tmpdir cscriptXXXX) ; exec 999<"$fname" ; rm "$fname"
sed '1c#if 0' "$0" > /dev/fd/998
$CCACHE $CXX $CXXFLAGS -xc++ -c /dev/fd/998 -o /dev/fd/999 && $CXX $CXXFLAGS /dev/fd/999 -o /dev/fd/998 && exec -a "$0" /dev/fd/998 "$@" ; exit 1
#endif

// ... C++ code

Microwave Oven Failure: Spontaneously turned on… by its LED display

Henry — Sat, 29 Jun 2024 06:58:13 +0000

My microwave oven started to malfunction at around five years old. It started to randomly power on the lamp, fan, and turntable. It progressively got worse over several weeks until it was mostly stuck on. The microwave oven is not usable when this happens: It behaves as if the door were open, causing the control panel to ignore button input and to stop cooking if it was cooking.

The obvious suspect is a failing door switch, which is a common cause of failure. There are three switches in the door, and a failure of one or more of them can cause strange behaviour when not all three switches agree on whether the door is open or closed. However, all three switches were tested to be in good working condition, so the most obvious reason is not the cause of this failure.

My microwave is an Insignia NS-MW09SS8 (a Best Buy brand), which is manufactured by Midea (FCC ID: RSFXM925AYY), with model number EMXAUXX-05-K marked on the circuit board inside. There are many other brands/models that use the same internal components. Unlike most similar models, mine has a blue LED display.

Spontaneously turning on is apparently not an uncommon failure. The one-star reviews on the Best Buy website for this model have almost 40 reports of this exact symptom. Many people (unnecessarily) worried that spontaneously turning on was a fire hazard. None of them seem to have found the root cause.

What went wrong?

The microwave “turning on by itself” was caused by an aging/failing LED display. Yes, really.

This unexpected conclusion is worth a blog post explaining exactly what went wrong, why it causes the observed symptoms, and how I repaired it.

This is a summary:

If the lamp is on and the door is closed, the turntable and fan also turn on. The magnetron stays off, so there is no fire hazard. This is expected behaviour.
The control board thinks the door is open even when it is actually closed. This causes the lamp to turn on.
This control board uses the same microcontroller GPIO pin to both drive segment A of the LED display and sense the door switch.
Due to aging of the display’s LEDs, there is enough reverse-biased leakage through the LEDs to cause the door switch to be incorrectly sensed as open, causing the microcontroller to incorrectly think the door was open.

Microwave Oven Internals

Control Board

This is a photo of both sides of the control board. The burnt discolouration on the back side underneath the LED display is due to desoldering the display with hot air for repair.

Schematic

Here is a schematic for most of my microwave oven, reverse-engineered by following traces and cables. The schematic excludes the power supply section (switched-mode +5V and +12V supplies).

Lamp, Fan, and Turntable

Normally, when the door is open, the lamp turns on. When cooking, the lamp, fan, and turntable all turn on. What mechanism implements this behaviour, and how does it explain the malfunctioning behaviour where the lamp, fan, and turntable are all on?

Above is the portion of the circuit that controls the lamp, fan, turntable, and magnetron. Notice how there is only one relay (RLY2, turned on by Q2) that controls the lamp, fan and turntable. The microcontroller actually cannot independently control these. When the microcontroller turns on the lamp relay (RLY2), the upper door switch (not the microcontroller) decides whether only the lamp turns on, or all three.

In the failed state, the microcontroller thought the door was open (according to the lower door switch) and turned on the lamp (RLY2) in response. But the door was closed according to the upper door switch, so the fan and turntable also turn on when the lamp is turned on. This behaviour is expected, but what’s unexpected is the microcontroller detecting an open door when the door is actually closed.

An interesting side effect of this design is that if the door-open lever is slightly pressed, just enough to toggle only the bottom door switch, the lamp, fan, and turntable all turn on. This can occasionally cause surprise and lead the user to wonder if the microwave oven is unexpectedly cooking (It is not).

Door Switch Sensing

Why does the microcontroller think the door is open when it isn’t? How does the microcontroller sense whether the door is open?

This is the portion of the circuit that senses whether the lower door switch is closed (shorted to ground) or open. Although there are three door switches, the lower switch is the only one that informs the microcontroller about the state of the door.

The microcontroller pin “Port H.0” senses the door switch state. When the door switch is open (door open), R34 pulls the input pin (H.0) to +5V. When the door switch is closed (door closed), the input pin is pulled low (to about 0.6V, the forward voltage drop of the diode) through diode D1 and the door switch to ground. When microcontroller port H.0 is used as an input pin, we expect to see either +5V if the door is open, or around +0.6V if the door is closed.

D1 prevents any current from the magnetron relay (RLY1) from flowing through the microcontroller pin and accidentally turning on RLY1.

R33 exists because the same microcontroller input pin is shared with the LED display’s segment A (Notice how the Door_LEDA_H.0 signal is also connected to the LED display’s segment A). Sharing a pin between multiple functions reduces cost by reducing the number of I/O pins needed on the microcontroller. The microcontroller pin is set to drive either high or low when driving the LED display (8ms out of every 10ms), and is set to be a high-impedance input pin when it wants to sense the door switch. It is this pin sharing that causes the door switch to be incorrectly detected as open (high input voltage) despite the switch being closed. Sharing a pin allows a malfunctioning LED segment to disrupt the voltage when trying to sense the door switch.

When sensing the door switch, all four LED cathodes are pulled high (Ports D.0, D.1, A.2 and A.3), turning off all of the LEDs and putting them into reverse bias. When the door is closed, we expect Port H.0 to be a logic 0 (low voltage). But any reverse leakage current through any of the four segment A LEDs appears as a pull-up current applied directly to the input pin (Port H.0) that must drain through the 2k-ohm R33. It doesn’t take much leakage current to raise the voltage on the input pin high enough to be detected as logic high.

In the oscilloscope traces below, the door switch sensing happens in the 2ms when the LEDs are off (labelled “Off”). Notice that before the repair (top two plots), when the door is closed, the Port H.0 pin voltage (yellow trace) only drops to around 2.2V, even though the expected voltage would be around 0.6V if there were no LED display leakage current. 2.2V is more than enough to be sensed as a logic high. (When the door is open, the Port H.0 input pin voltage is +5V, as expected.)

LED Display

The LED display is common cathode and has four digits and two independently-controlled dots (part of digits 3 and 4, respectively) that form a colon. Each digit is turned on in sequence (left to right) for two milliseconds each, followed by two milliseconds with all digits off. The cycle repeats every 10 milliseconds.

The LED display shares a pin with the door switch sense circuit and also shares six pins with the control panel buttons. The door switch sensing seems to happen during the two milliseconds when all LEDs are off.

[Four oscilloscope plots in a 2 x 2 grid. Columns: door open, door closed. Rows: before repair, after repair.]

The four oscilloscope plots above capture 14 milliseconds (1 ms per grid square) showing the LED display being scanned, both with the door switch open and closed, before and after my repair. The yellow trace is microcontroller pin Port H.0, which drives LED segment A and senses the door switch. The other three traces show the cathode pin of some of the display digits (cyan = digit 1, magenta = digit 3, green = digit 4. I only have a 4-channel oscilloscope, so I omitted digit 2). Each LED digit is on when the cathode pin is low, so you can see each digit being turned on for two milliseconds each, followed by two milliseconds when all four digits are off. During the two milliseconds with the display off, Port H.0 (yellow trace) is set to be an input pin to sense the door switch.

Looking at the “Off” region, the two plots with the door open (left two plots, both before and after the repair) show +5V on the input pin, as expected. However, when the door is closed (upper-right plot), the voltage on pin Port H.0 only drops to +2.2V due to reverse leakage current through the display. This is high enough to be detected as a logic 1, so the microcontroller becomes unable to sense a closed door. After the repair (lower-right plot), the voltage on pin Port H.0 drops to around 1V when the door is closed (low enough to be sensed as logic 0), restoring normal functionality.

(When these traces were collected, the display was displaying 0:00, so segment A, the top horizontal LED of each digit (yellow oscilloscope trace), is high for digits 2, 3, and 4, but not digit 1).

Repair

Simple repair: Add a diode to block reverse leakage current on the pin that matters.

Repairing the door sense circuit is relatively easy (compared to finding the problem), requiring desoldering the display module, modifying some surface-mount components, and reinstalling the display. Since the main problem is reverse leakage on the segment A pin, I added a diode in series with the segment A LEDs (next to, and in series with R14) to prevent the leakage. I chose a Schottky diode to minimize forward voltage drop and the reduction in brightness of segment A (Schottky diodes have lower forward voltage drop than silicon PN junction diodes). I could not visually notice any decrease in brightness, so a silicon diode would have probably worked fine too.

However, this only prevents the failing LED display from interfering with the door switch sensing circuit, and doesn’t actually fix the aging display. On my display, segment A on digit 3 no longer lights up most of the time.

This failure could have been prevented had the original circuit added this one diode. If the circuit added diodes to all 12 pins of the LED display, display aging might also be significantly reduced, because I suspect that applying 5V of reverse voltage across the LEDs contributes to their aging, and extra diodes in series prevent this. Most diodes can withstand higher reverse voltages than light-emitting diodes (LEDs).

Replacement (2024 December)

The LED display has continued to degrade. Reverse leakage current through other segments of the display interfered with sensing of the keypad keys, making the microwave oven unusable.

The display is a four-digit clock style (with a colon) LED seven-segment display, but the pin arrangement is so unusual that I was not able to find a replacement. The unusual arrangement probably comes from needing independent control over the two dots in the colon. The bottom dot is sometimes used as a decimal point, for example, showing 1.75/3.0/3.5 oz when “Popcorn” is pressed.

Since I couldn’t find a replacement display, I designed a module with equivalent functionality out of individual seven-segment digits and two discrete LEDs for the colon. For extra paranoia, I used red (not blue!) seven-segment displays and added silicon diodes in series with every LED to protect the LEDs from seeing the full 5V of reverse voltage. I used BAW56 dual diodes, which have two diodes in a three-pin package (common anode). Compared to separate two-pin diodes, dual diode packages save a little bit of board space and make it impossible to accidentally install the diodes backwards.

Since the replacement module is larger than the original display, some minor modifications to the foam gasket were needed to make it fit.

Above is what it looks like after being installed. It doesn’t quite look right because the two LEDs forming the colon illuminate the entire space between the second and third digits, which is quite visible. The solution is to cover that up with something black. Conveniently, the SOT23 tape (packaging for the diodes) is black, and even has perfectly sized and spaced holes to go over the two LEDs. Not only does it block out the light in the gap, the holes also partially cover the two LEDs to make the dots smaller, which I think looks better:

In case this is useful, here are my schematic and circuit board layout files, created with KiCad 5.1: microwave_display_kicad.zip

More observations

Magnetron Failsafe

I’m quite impressed by how many independent mechanisms there are to prevent the magnetron from accidentally turning on. Not only are there three door switches to ensure the door is closed, there’s also a mechanism to guard against a faulty microcontroller or software.

The magnetron high voltage transformer is turned on by relay RLY1, which also goes through two mechanical switches (upper and middle door switches). If the upper switch detects an open door (open switch), the circuit is opened and there is no power to the magnetron. If the middle switch detects an open door (closed switch), the transformer is shorted out and you would get a short circuit and trip a circuit breaker (or possibly even melt a switch), but still no magnetron power. These two switches are in the 120V AC path, so they can protect against a faulty RLY1 that is stuck on.

The path that controls the magnetron relay RLY1 has three separate controls in series (Q1, Q3, and lower door switch). If the lower door switch detects an open door (open switch), RLY1 doesn’t turn on. Once the door is closed, two transistors (Q1 and Q3) both need to be on to enable RLY1. Q3 is directly controlled by the microcontroller to turn on the magnetron relay RLY1. But there is also Q1, which only allows RLY1 to turn on if the lamp relay (RLY2) has also been commanded to power on (by Q2).

Q2 is the transistor that responds to the microcontroller command to turn on the lamp relay (RLY2). But unlike Q3, Q2 is not directly connected to a microcontroller pin. It goes through Q5 and several passive components first. I believe these components are designed to prevent a malfunctioning microcontroller from causing the microwave oven to be stuck on. C16 blocks any DC signal, so to turn on Q2, the microcontroller must output a continuous pulse train to periodically turn on Q5 to charge up capacitor E1, which turns on Q2. If the microcontroller stops toggling the output pin (Port C.2) for any reason (hardware failure, software crash), Q2 turns off once E1 discharges (~20 ms?), making it impossible for the microcontroller to freeze with the lamp and magnetron stuck on.

Buttons (Updated 2024 December)

There are 24 buttons on the control panel, arranged (electrically) as a 6 x 4 matrix, with 6 + 4 (row and column) wires leading to the control panel. Six of these wires are shared with the LED display anode wires. This probably means the buttons are scanned while the LED display is off (to allow using the shared wires without showing anything on the display), but I haven’t attempted to figure out exactly which part of the 10ms cycle this happens.

Since six of the button wires are shared with the LED display, and presumably because those six are used as inputs, reverse leakage current in the LED display can also affect the buttons. This could explain the reports of malfunctioning buttons in some of the reviews.

Speculation: Control panel failure could have been avoided by using the six shared wires as output and four dedicated wires as input, instead of the other way around. When scanning a keypad matrix, each of the row wires is driven low (or high) one at a time, and the high-impedance column wires sense whether each of the column wires has been pulled low (or high) to know which keys in that row are pressed. (Pressing a key connects one row wire to one column wire). High-impedance input column wires are sensitive to external disturbances (like a leaky LED), while the low-impedance row wires being driven are much less sensitive. If the six shared wires were used as rows instead, it would take 50% longer to scan the keypad array (6 rows instead of 4), but it would have been immune to disturbance from the LED display, since the wires that are sensitive to disturbances would be the four dedicated wires.

The following table shows how the buttons are arranged in the 6 x 4 matrix (arranged visually in a 3 x 8 grid):

	Button1 E.2	Button2 F.3	Button3 F.2	Button4 F.1
ButtonA LedC D.2	Time Cook	Popcorn	0	6
ButtonB LedDot D.3	Time Defrost	Potato	1	7
ButtonC LedD B.0	Weight Defrost	Pizza	2	8
ButtonD LedE B.1	Power	Frozen Vegetable	3	9
ButtonE LedF B.2	Clock	Beverage	4	Stop Cancel
ButtonF LedG B.3	Kitchen Timer	Dinner Plate	5	Start +30sec

Clock…?

Port E.3 (Pin 1 of the SH69P26K microcontroller) connects to an optocoupler (IC102) that’s driven by the live wire of the AC input. This circuit looks like it is intended to generate a pulse once per AC cycle (60 Hz) going to the microcontroller. I suspect this signal is used as a 60 Hz clock to run the timer and clock. I haven’t attempted to test this hypothesis.

According to the datasheet, this microcontroller supports both internal and external oscillators, but this board seems to leave the OSCO and OSCI pins unconnected, which means the microcontroller is using the internal oscillator option. The internal oscillator is rated for up to 50% relative frequency error, nowhere near accurate enough for a clock. I’m surprised that they chose to add a bunch of components to feed the AC line frequency to the microcontroller instead of just using a 32.768 kHz crystal. A single crystal oscillator seems like both the cheaper and more accurate option, especially if someone were to take the microwave to a place that does not use 60 Hz mains frequency.

Model selection “switches”

There are six switches to select between similar models of microwave without needing a different board or software program. The six “switches” are actually small tabs in the circuit board that are snapped off to open the switch. For my microwave oven, SWA is snapped off. Four of these switches (SWA to SWD) share pins with the LEDs.

Conclusions

The “microwave turns itself on” failure is caused by a failing LED display causing the microcontroller to detect the door as open even when it is closed. This failure is possible due to sharing microcontroller I/O pins between different functions to cut cost, allowing malfunctions in one area to affect another. Sharing pins is probably not an uncommon thing to do, but the design should be more tolerant of LED reverse leakage current. This design vulnerability is compounded by the use of blue LEDs for the display, which fail more quickly than other colours (most commonly green).

When designing an appliance, avoid sharing GPIO pins between LEDs and anything functionally important, or at least design it to tolerate a significant amount of reverse leakage current. LEDs can fail with age.
If you’re buying a microwave oven, it’s fairly hard to know which models have this vulnerability: Midea makes a fairly large fraction of all microwave ovens, multiple Midea models have the same problem, but not all models do.
But you can greatly improve your chances by avoiding blue LED displays.

Does display colour really matter?

Anecdotally, I’ve noticed that blue indicator LEDs tend to fail much more frequently than other colours. But is this observation also true here?

To find out whether display colour matters, I compiled a list of microwave models similar to mine that I’m fairly certain either use the same mainboard, or have a similar mainboard that most likely has the same vulnerability. Then I read through the one-star reviews (mainly people complaining about failures) and counted how many of those reviews describe the microwave randomly turning on when the door is closed. I then checked whether the colour of the display affects how often this failure mode shows up in one-star reviews. If blue displays really are less reliable, I should see more failures of this type for models with blue displays than for similar models with green displays. The following table summarizes the results.

It’s quite clear that all three models with blue displays have a much higher rate of failing by randomly turning on and off with the door closed. For one-star reviews, it is ~25 times more likely to get a complaint about spontaneously turning on when the LED display is blue compared to when it is green.

Brand/Model	FCC ID	Mainboard Part Number	Display Colour	Total Reviews	One-star reviews	One star and randomly turning on	Review Source
Insignia NS-MW09SS8 (mine)	RSFXM925AYY	17170000006520	Blue	6789	244	37	Best Buy
Sharp SMC0912BS			Blue	291	54	29	Lowe’s
Black & Decker EM925AAK-P		17170000006520	Green	425	122	1	Target
Black & Decker EM925AB9		17170000006520		981	210	0	Amazon
Black & Decker EM925AZE-P				129	29	0	Amazon
Emerson MW9325SL
Farberware FM09SSE				128	17	0	Amazon
Frigidaire FFCM0934LS
Hamilton Beach EM925A2CE-P1				690	81	2	Walmart
High Pointe EM925ACW
Insignia NS-MW09BK0				478	23	1	Best Buy
Insignia NS-MW09RD7
Kenmore 405.73093310, 405.73099310, 405.73092310
Pelonis EM925AFO-P2
Salton MW2079
Toshiba ML2-EM25PA(BS)				(~824/2) *	(~75/2) *	1	Lowe’s
Westbend EM925AJW-P2 **
Westbend EM925AJW-P2 **	VG8EM925AYY
Insignia NS-MW11BK0	VG8XM031MYY		Blue	2630	89	18	Best Buy
Black & Decker EM031MB11		17170000000832	Green	763	165	1	Amazon
GE JES1145SH1SS				7659	280	5	Home depot
Hamilton Beach EM031M2ZC-X3				2739	238	0	Walmart
Toshiba ML2-EM31PA(SS)				(~824/2) *	(~75/2) *	0	Lowe’s
Black & Decker EM036AB14	VG8XM036AYY	17170000006520		220	45	1	Amazon
Toshiba EM925A5A-SS	VG8EM025FXXXV2	17170000006520		2470	431	3	Amazon
Toshiba ML2-EM09PA(BS)	VG8EM025FXXXV2		White?	317	55	1	Amazon
Black & Decker EM720CB7	VG8XM720CYY-PM	17170000000832	Green	1316	355	3	Amazon
Magic Chef HMM770B2				705	59	0	Home depot
Midea MMC07S1ABB				198	17	0	Lowe’s
Sharp SMC0710BB				395	11	0	Lowe’s

* The Toshiba ML2-EM25PA(BS) and ML2-EM31PA(SS) reviews are combined and there was no easy way to get a count for each model separately.
** The Westbend EM925AJW-P2 seems to be available with two different FCC IDs. The only difference between them appears to be the manufacturer of the magnetron (Witol vs. LG).
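As a rough cross-check of the blue-versus-green comparison, the rows with a known display colour can be pooled and their rates compared. This is only a sketch: the pooling method is my assumption (the exact calculation behind the "~25 times" figure isn't stated), and the counts are copied from the table above. Pooled this way the ratio comes out at about 28, the same order as the figure quoted.

```python
# (one-star reviews, one-star reviews describing random turn-on),
# copied from the rows of the table that list a display colour.
BLUE = {
    "Insignia NS-MW09SS8": (244, 37),
    "Sharp SMC0912BS": (54, 29),
    "Insignia NS-MW11BK0": (89, 18),
}
GREEN = {
    "Black & Decker EM925AAK-P": (122, 1),
    "Black & Decker EM031MB11": (165, 1),
    "Black & Decker EM720CB7": (355, 3),
}


def pooled_rate(models):
    """Fraction of one-star reviews that mention random turn-on (pooled)."""
    one_star = sum(n for n, _ in models.values())
    turn_on = sum(k for _, k in models.values())
    return turn_on / one_star


blue_rate = pooled_rate(BLUE)    # 84 / 387
green_rate = pooled_rate(GREEN)  # 5 / 642
ratio = blue_rate / green_rate

print(f"blue {blue_rate:.1%}, green {green_rate:.1%}, ratio {ratio:.1f}x")
```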

The FCC ID is useful for identifying models that should be substantially the same, even if they have cosmetically different styling. I’ve grouped microwave models by FCC ID in the table above. Models with the same FCC ID do have some mainboard variations over the years. The changes are usually small (otherwise the FCC ID would change), and I don’t know which mainboard revision was used for each customer review, so I don’t try to distinguish differences within the same model, if any.

Another source of information is the mainboard part number: some websites list which microwave models used that particular board, which tells me a set of models that share the same mainboard despite having different FCC ID, size, and power. I think it’s reasonable to assume that all models with the same FCC ID share the same mainboard (or revisions of it), but I only filled in the Mainboard Part Number in the table if I found some evidence of that mainboard being used in that particular model.

Calculator

Henry — Sun, 13 Aug 2023 12:30:39 +0000

I wrote a calculator that can do decimal and hexadecimal arithmetic conveniently.
https://calc.stuffedcow.net/

As a computer engineer, I often do calculations with a mix of decimal and hexadecimal numbers. But I haven’t been able to find a calculator that can work in both bases conveniently. Many calculators don’t support hexadecimal at all (e.g., HP 30b). Some calculators do support it, but it’s buried in layers of menus with no easy way to enter A through F, so it’s much slower than computing in decimal (e.g., HP 42S).

Some software calculators in Programmer mode (e.g., Windows 10, Gnome calculator, KCalc) are better, but have some weird constraints. For example, both Windows calculator and KCalc only allow integers in Programmer mode, and Gnome calculator does not allow switching bases within the same calculation. And, oddly, all three calculators’ programmer modes are missing the typical scientific calculator functions. Programmers don’t science?

So I wrote yet another calculator.

The primary goal is to be able to switch between hexadecimal and decimal quickly while using keyboard (numeric keypad) input. There’s also a variant for touch devices that has more buttons on the screen and relies less on Shift and Ctrl modifier keys.

The calculator has both RPN and infix modes. I prefer RPN, but I suspect most people are more familiar with infix. Not surprisingly, writing the infix mode is many times harder than RPN. RPN mode simply performs each operation on the first two numbers on the stack immediately when an operation button is pressed. Infix notation needs to deal with operator precedence, operator associativity (right-associative vs. left-associative), unary vs. binary operators, and incrementally building and evaluating a syntax tree, along with a way to edit this tree to implement a backspace key. Some calculators even let you edit in the middle of an expression, but mine only allows deleting from the end.
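To illustrate why RPN is the easy case, here is a minimal Python sketch of an RPN evaluator. It is not the calculator's actual C++ code: the `0x` prefix for hexadecimal input and the token format are assumptions made for the example. Each operator pops the top two stack entries and pushes the result, with no precedence or syntax tree involved.

```python
import operator

# Binary operators supported by this sketch.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def parse_number(token):
    """Parse a decimal token, or a hexadecimal one marked with 0x (assumed convention)."""
    if token.lower().startswith("0x"):
        return float(int(token, 16))
    return float(token)


def eval_rpn(tokens):
    """Evaluate RPN tokens and return the final stack."""
    stack = []
    for token in tokens:
        if token in OPS:
            # Operate immediately on the top two entries of the stack.
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(parse_number(token))
    return stack


print(eval_rpn("0x10 2 * 8 +".split()))  # [40.0]
```

An infix evaluator would need all of the precedence and associativity handling described above before it could even decide which operation to run first.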

For the GPU architects out there who stare at IEEE 754 numbers all day, there are buttons to convert the internal floating-point value to and from 32-bit or 64-bit IEEE 754 numbers.
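For readers who haven't done this conversion by hand, the idea can be sketched in Python with the standard `struct` module. This only illustrates reinterpreting a value as its IEEE 754 bit pattern and back; the function names are made up for the example, and the calculator itself does this in C++ on MPFR values.

```python
import struct


def float_to_bits32(x):
    """Round x to binary32 and return its bit pattern as an integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits32_to_float(bits):
    """Interpret a 32-bit integer as an IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def float_to_bits64(x):
    """Return the binary64 bit pattern of x as an integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def bits64_to_float(bits):
    """Interpret a 64-bit integer as an IEEE 754 binary64 value."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


print(hex(float_to_bits32(1.0)))    # 0x3f800000
print(hex(float_to_bits64(1.0)))    # 0x3ff0000000000000
print(bits32_to_float(0xC0000000))  # -2.0
```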

The user interface is HTML, CSS, and JavaScript. The calculator itself is written in C++ and compiled to WebAssembly, with the GNU MPFR (multiple-precision floating point with correct rounding) library actually doing most of the arithmetic operations. All calculations are done client-side in the web browser.

Lenovo Thunderbolt Dock Type 40AC

Henry — Mon, 07 Aug 2023 12:20:21 +0000

The Lenovo Type 40AC dock is a Thunderbolt 3 dock that is capable of USB Power Delivery charging.

However, getting the 40AC dock to charge over USB PD requires at minimum a 135W AC adapter, regardless of how much USB charging power is actually being used. When using a smaller AC adapter (90W in my case), the dock will function but refuse to charge. The red LED on the Thinkpad logo will continuously flash if the AC adapter is too small.

The simplest solution would be to get a 135W or higher AC adapter. Instead, I chose to buy some short (15 cm) AC adapter extension cables and modify the resistor to make my 90W adapter appear as 135W. The cable I got came with a resistor indicating 90W, so it overrides the power capacity identification resistor of the connected AC adapter. But I needed to replace the resistor because I needed something different than 90W.

The laptop I’m charging won’t come close to exceeding 90W, so I should have plenty of margin left even with a smaller-than-required AC adapter. The laptop now happily charges through the dock.

Depstech DW49 Webcam — Teardown and Measurements

Henry — Wed, 06 Jul 2022 19:39:21 +0000

The Depstech DW49 is a low-cost webcam that claims to record 4K 30 fps video. I recently got one “used” (but seemed unused) from eBay for $30.

“4K” cameras at this price point often misrepresent their capabilities. One common scam is to use a lower-resolution image sensor and then scale up the image to 4K, resulting in a video with the right number of pixels, but a blurry picture. Despite the abundance of online reviews for this camera, I haven’t found any that verified the accuracy of the specifications claimed by Depstech. Since other reviews confirmed that this webcam is at least capable of a reasonable image, and $30 wouldn’t be overpriced even if the specifications weren’t accurate, I decided to buy one to see what’s inside.

I will try to find out what’s inside the webcam by disassembling it and making some measurements, but I’ll leave the more subjective image quality and “is it good?” opinions to the many reviews online.

TL;DR: Depstech does make exaggerated claims about the DW49 specifications, but the difference from reality is less than some other “4K” cameras I’ve seen.

Disassembly

The webcam itself is held together by five hooks (labelled with green arrows in the third photo) without external screws. It seemed slightly easier to start prying along the bottom edge because there are only two hooks, rather than three along the top edge.

Here are some photos of the circuit board and close-ups of some of the major components:

Front and back of the circuit board

SPCA2688A Image processor and flash memory

Camera module

Two microphones

Components

I wasn’t able to directly confirm the identity of the camera module or image sensor because there were no visible markings, and I didn’t want to remove the lens assembly to see whether there were any markings inside. I also couldn’t identify the “C013” microphone.

However, because the circuit board is labelled with “SH_4K_AF_V1”, I found the manufacturer of the webcam module circuit board (Sinoseen). It seems like the electronics are Sinoseen’s design, while Depstech designed a webcam around the circuit board. Sinoseen publishes specifications about the webcam module, which differ slightly from Depstech’s specifications about the camera. I find Sinoseen’s specifications credible, and I don’t have a reason to doubt them, but I have no way to directly verify their accuracy.

Component	Observed	Specifications from Sinoseen	Claimed by Depstech
Specifications link		Sinoseen (archived)	Depstech (archived)
Camera module	Sinoseen SH-4K-AF-V1
Image sensor		Sony IMX219 “Exmor R” 1/4″ (4.60mm diagonal, 1.12μm × 1.12μm pixel size) 3280×2464 (8.08 megapixel) 30 fps	Sony image sensor 1/3″ 8 megapixel 30 fps
Image processor	Sunplus SPCA2688A
Flash memory	Puya P25D40H 4 Mbit
Microphones	Unknown, marked “C013”
Field of view	65° – 85° (varies with resolution)	80° ±3°	80°
Focusing	Auto-focus: 5.2 cm – infinity Manual control: 4.3 cm – infinity	Auto-focus: 10 cm – infinity	Auto-focus

While Depstech’s claim of an 8-megapixel image sensor is probably correct, it omits mentioning that the sensor’s aspect ratio is different than 16:9 video, and that once cropped to 16:9, the sensor has fewer pixels than the 3840×2160 (8.29 MP) video it delivers. My guess is that 4K video is scaled up from a roughly 3048×1715 (5.2 MP) region of the image sensor. Depstech also claims that the image sensor is larger than the IMX219 (larger image sensors tend to have less noise, which helps in low-light conditions).

Image Sensor Resolution

It’s not difficult to measure the resolution of the image sensor and detect upscaling. Although the method isn’t highly precise, it’s good enough to confirm that the resolution is consistent with the claimed 3280×2464 resolution.

I created a variation of a Siemens star pattern, which has finely-spaced lines (technically, small sectors). Taking a photo of this pattern and observing the spacing of the most finely-spaced lines that can be seen in the final image reveals the highest spatial frequency that the camera (including lens, image sensor, image processing, and scaling) can capture. This gives a lower bound on the resolution of the worst-performing component in the system, which is often the image sensor itself if the picture is upscaled from a lower-resolution image sensor.

Image sensor resolution is lower than the image’s resolution, by about 27% in both directions. The white blob near the center is because the lines are so dense that my printer doesn’t print them properly.
(Cropped from 3840×3104 YUYV image with averaging across frames to reduce noise)

My Siemens star pattern is split into two halves: One with 1° per period (0.5° black, 0.5° white), and the other at twice the density of 0.5° per period, so I have two different patterns to test. One interesting property of this pattern is that the Nyquist frequency (of the final image) always occurs at a fixed number of pixels from the center (229.2 pixels for 0.5°, 114.6 pixels for 1°), because the line spacing is proportional to the radius from the center point. At this distance, the lines are at the highest density that could be represented by the image, where each black and white region is one pixel wide. I look for the distance that has the highest density of lines (the Moiré patterns help in more precisely guessing the distance). When the line spacing is denser than the camera is able to capture, some lines start blurring together (so there are actually fewer visible lines), until the lines can’t be distinguished at all.

In my 3840×3104 image, the highest density lines were located about 296 pixels away from the center, which is 1.29 times as far as 229.2 pixels, which would be the distance if the maximum spatial frequency were limited only by the 3840×3104 image itself. Thus, this 3840×3104 image was likely captured by a (region of the) image sensor that had roughly 3000×2400 pixels ± a few percent. I suspect the 3840×3104 image does not use the full image sensor because the aspect ratios are different. If the full 2464-pixel height of the image sensor were used for this image size, the image sensor would need to be cropped horizontally from 3280×2464 down to around 3048×2464 before being upscaled to 3840×3104.
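The arithmetic behind these numbers can be sketched in a few lines of Python. The function names are made up for the example; the only assumption is the one stated above, that at the Nyquist limit one black-and-white period spans two pixels along the circumference.

```python
import math


def nyquist_radius_px(period_deg):
    """Radius (in pixels) at which one period of the star spans two pixels."""
    periods_per_turn = 360.0 / period_deg
    circumference_px = 2.0 * periods_per_turn  # two pixels per period
    return circumference_px / (2.0 * math.pi)


def effective_resolution(image_w, image_h, observed_radius_px, period_deg):
    """Scale the image size down by how far past the Nyquist radius the finest lines sit."""
    scale = observed_radius_px / nyquist_radius_px(period_deg)
    return image_w / scale, image_h / scale


print(round(nyquist_radius_px(0.5), 1))  # 229.2
print(round(nyquist_radius_px(1.0), 1))  # 114.6

# Finest visible lines at about 296 pixels from the center in the 3840x3104 image.
w, h = effective_resolution(3840, 3104, 296, 0.5)
print(round(w), round(h))
```

The result lands within a few percent of 3000×2400, matching the estimate in the text.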

Although this method is somewhat approximate, it shows that the image sensor resolution used in this camera is quite close to the resolution of the Sony IMX219 sensor, even though I can’t verify whether the camera uses this exact sensor model. It also shows that the lens is capable of focusing well enough that a blurry lens isn’t the limiting factor in the camera resolution.

(I’ve also repeated this experiment with some of the scaled-down resolutions, and I get to within 1% of the expected 229 pixels as the distance with the finest lines, which gives confidence that this method is measuring what I think it should be measuring. When images are scaled down, the maximum spatial resolution becomes limited by the lower resolution of the image, rather than the image sensor.)

Recording Resolutions and Frame Rates

The camera’s USB descriptors advertise support for 11 resolutions and two image formats (YUYV and MJPEG). However, some of the less commonly-used resolutions don’t work at all. They can be selected, but return no video frames, which can confuse the Windows Camera app enough that it stops allowing resolution changes for the camera, leaving the camera stuck in an unusable resolution setting. Some of the resolutions also have a different frame rate than claimed.

I also measured the image bitrate to get a sense of how much image compression there is.

Resolution	MJPEG				YUYV
	Claimed	Actual			Claimed	Actual
	fps	fps	Bitrate (Mbps)	Bits per pixel	fps	fps	Bitrate (Mbps)	Bits per pixel
3840×3104	15	15	152	0.85	1	0.75	143	16
3840×2160	30	29.97	162	0.65	1	1.25	166	16
3264×2448	30	— Failed —			1	— Failed —
2592×1944	30	30	166	1.10	1	— Failed —
2048×1536	30	30	115	1.22	1	— Failed —
1920×1080	30	29.97	75	1.20	5	— Failed —
1600×1200	30	30	72	1.24	10	— Failed —
1280×720	30	29.97	36	1.30	10	7.5	110	16
1024×768	30	30	30	1.27	10	— Failed —
640×480	30	30	12	1.30	30	30	147	16
320×240	30	30	3.4	1.50	30	30	37	16

This camera uses a USB 2.0 High Speed interface, which can achieve about 400 Mbps in bulk transfer mode. However, UVC cameras use isochronous transfers, which are about half the throughput of bulk transfers, but have reserved bandwidth to avoid dropping frames when there is other USB traffic competing for bandwidth. USB High Speed isochronous transfers have a theoretical limit of 196.6 Mbps. The table above shows that every resolution chooses frame rate and/or compression to keep the video bitrate somewhat below this limit.

For uncompressed YUYV format (fixed at 16 bits per pixel), the bandwidth limit explains why frame rate drops very quickly with increasing resolution.
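A quick sketch of that limit in Python: the 196.6 Mbps figure corresponds to three 1024-byte transactions in each of the 8000 microframes per second of a USB High Speed isochronous endpoint. Dividing it by 16 bits per pixel gives an upper bound on the YUYV frame rate at each resolution (the function name is made up for the example, and real cameras pick a lower rate from a fixed list).

```python
# USB 2.0 High Speed isochronous limit:
# 3 transactions x 1024 bytes x 8 bits, 8000 microframes per second.
ISO_LIMIT_BPS = 3 * 1024 * 8 * 8000  # 196,608,000 bits per second

YUYV_BITS_PER_PIXEL = 16


def max_yuyv_fps(width, height, limit_bps=ISO_LIMIT_BPS):
    """Upper bound on uncompressed YUYV frame rate for a given resolution."""
    return limit_bps / (width * height * YUYV_BITS_PER_PIXEL)


for width, height in [(3840, 3104), (3840, 2160), (1280, 720), (640, 480)]:
    print(f"{width}x{height}: at most {max_yuyv_fps(width, height):.2f} fps")
```

The measured YUYV frame rates in the table all sit below these bounds.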

MJPEG uses lossy JPEG compression of each video frame to reduce size and allow higher frame rates. The amount of compression (and quality loss) can be estimated by computing the average number of bits of information used per pixel. Most of the lower resolutions use around 1.2 – 1.3 bits per pixel (around 13x smaller than YUYV), but the highest three resolutions have much higher compression (lower bits/pixel) to keep the bitrate within USB 2.0 limits (only 0.65 bits per pixel at 3840×2160). With MJPEG compression, the limited bandwidth of USB 2.0 requires substantial compromises in image quality at 4K 30 fps resolutions.
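The bits-per-pixel column is just the measured bitrate divided by the pixel rate. A minimal Python sketch, using two MJPEG rows from the table above (the function name is made up for the example):

```python
def bits_per_pixel(bitrate_mbps, width, height, fps):
    """Average number of bits spent on each pixel of each frame."""
    return bitrate_mbps * 1e6 / (width * height * fps)


# MJPEG measurements taken from the table.
print(round(bits_per_pixel(162, 3840, 2160, 29.97), 2))  # 0.65
print(round(bits_per_pixel(152, 3840, 3104, 15), 2))     # 0.85

# Compression relative to uncompressed YUYV at 16 bits per pixel.
print(round(16 / bits_per_pixel(36, 1280, 720, 29.97), 1))
```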

Sadly, the 3264×2448 resolution (the resolution that is closest to the native sensor resolution) does not work in either MJPEG or YUYV modes.

Field of View

Both the webcam manufacturer (Depstech) and webcam module manufacturer (Sinoseen) specify a field of view of approximately 80° (though, oddly, the manual that comes in the box says 55°). I tested the field of view and found that it varied between 65° and 85° (diagonal) depending on resolution, indicating that some resolutions use a cropped region of the image sensor.

I pointed the webcam directly at a wall 1.5 m away, then marked the locations on the wall that corresponded to the left, right, and bottom edge of the image frame at each resolution. The locations of these marks are measured with a tape measure. The field of view can then be calculated using some simple trigonometry. The field of view varies slightly depending on focus (narrower when focusing nearer).
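The trigonometry is sketched below in Python. It assumes the camera points squarely at the wall, the marks are symmetric about the image center, and the lens has no significant distortion. The 2.14 m width is a made-up illustration, not one of my tape-measure readings.

```python
import math


def fov_deg(extent_m, distance_m):
    """Angle subtended by a span of extent_m centered on the optical axis."""
    return math.degrees(2.0 * math.atan((extent_m / 2.0) / distance_m))


def diagonal_fov_deg(horizontal_deg, vertical_deg):
    """Diagonal field of view from horizontal and vertical fields of view."""
    th = math.tan(math.radians(horizontal_deg / 2.0))
    tv = math.tan(math.radians(vertical_deg / 2.0))
    return math.degrees(2.0 * math.atan(math.hypot(th, tv)))


# Hypothetical example: marks 2.14 m apart on a wall 1.5 m away.
print(round(fov_deg(2.14, 1.5)))       # 71

# Horizontal 71 degrees and vertical 60 degrees combine to about 85 diagonal.
print(round(diagonal_fov_deg(71, 60)))  # 85
```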

Measuring field of view
(Yes, I had the camera upside-down when doing this experiment)

The largest 3840×3104 resolution has the widest field of view at 85° diagonal. The two 16:9 resolutions (except 1920×1080) have a smaller field of view because they’re just cropped versions of 3840×3104. The 4:3 resolutions are, unsurprisingly, cropped differently than the 16:9 resolutions.

The 1920×1080 resolution is strange. Despite being the same 16:9 aspect ratio as both 3840×2160 and 1280×720, it uses an even smaller cropped region of the image sensor, which results in a smaller field of view of 65° for this resolution. Even more strange is that, unlike all of the other resolutions, 1920×1080 is vertically stretched by about 4%: The image size in pixels is 16:9, but the physical area captured in the image is a little bit shorter, at 16:8.62.

Resolutions	Horizontal FoV	Vertical FoV	Diagonal FoV	Pixel aspect ratio
3840×3104	71°	60°	85°	1.0
3840×2160 1280×720	71°	44°	79°	1.0
All 4:3	62°	49°	74°	1.0
1920×1080	58°	33°	65°	1.044

The claimed 80° field of view is quite accurate for the 3840×2160 and 1280×720 resolutions. Having a different field of view for 4:3 aspect ratio images isn’t surprising due to cropping, but the 1920×1080 resolution’s being cropped differently (and unevenly) is puzzling.

Focus

The camera supports auto-focus, but it is also possible to turn off auto-focus and manually set the focus (in software). Manually setting the focus to the nearest setting results in focusing at 43 mm from the camera. However, the auto-focus algorithm seems to be unwilling to focus any closer than 52 mm. This likely isn’t much of an issue in practice, because webcams aren’t usually used at such short distances. Notably, the focusing capability exceeds the specified minimum of 100 mm.

Subjectively, auto-focus works, but is somewhat slow, and sometimes hunts around repeatedly trying to adjust the focus. In my very limited use so far, I have not found this to be a problem, but auto-focus can be disabled if it does become a problem.

Below is a cropped image of the webcam looking at its own box in focus at a distance of 43 mm.

Cropped image in focus at 43 mm distance

Microphones

There are two microphones, and the webcam records in stereo, but the two microphones are not directly mapped to the left and right channels of the stereo recording. Playing a sound into one microphone results in the sound appearing in both recorded channels. However, there is some stereo imaging. There is some audio processing in between the microphones and the recorded stereo audio data, but I haven’t investigated what this processing does.

Summary

Does the Depstech DW49 deliver video with full 4K (8.3 MP) resolution? No, it does not. But the real image sensor resolution of approximately 3048×1715 (5.2 MP) when recording 4K video is still a substantial improvement over Full HD’s 1920×1080 (2.1 MP), at a price point that is not that different from Full HD cameras. While Depstech isn’t entirely honest with its specifications (overstating the sensor size and omitting that the “4K video” only uses 5.2 million sensor pixels), it’s less of a scam than some other low-cost “4K” cameras out there (I’ve seen one upscaling from a 4 MP sensor…)

If you’re happy with 5.2 MP, this is still a budget product. The USB 2.0 High Speed interface does not have enough bandwidth for the highest resolutions, which results in higher compression (lower image quality) to compensate. There are also odd quirks, such as claiming to support resolutions that do not actually work, and the cropped and slightly stretched 1920×1080 resolution. As long as you understand these limitations and know how to work around them, I don’t think they cause major problems in typical use.

Overall, not quite as advertised, but not bad at this price point.

Capacitors in storage can get the plague too

Henry — Tue, 22 Feb 2022 09:30:12 +0000

Five Nichicon HM capacitors from 2005 (one swollen) on an MS-5184 motherboard from 1999.

Capacitor plague. Swollen and leaking electrolytic capacitors. A well-known phenomenon since the early 2000s.

Caused by heat, ripple current, unknown brands, or Taiwanese origin? Not necessarily.

Back in 2006 (16 years ago), I bought some Nichicon HM series electrolytic capacitors to replace swollen capacitors on an old PC motherboard. I recently noticed that the replacement capacitors were swollen again, yet the motherboard had been almost unused since the last capacitor replacement. I still have most of the bag of 200, so I decided to investigate. Although they sat in a box unused, 70% of them were swollen…

Background: Capacitor Plague

Electrolytic capacitors sometimes fail by swelling. This was particularly common in the early 2000s and is often referred to as capacitor plague. These capacitors would generate gas and swell as they aged, and eventually rupture and cause the failure of whatever electronic device was using them (often a power supply of some sort).

I had assumed that this only happened while in use, either holding a charge, or something to do with ripple current. But this assumption is incorrect, because I now have a large number of never-used (stored at room temperature) capacitors that swelled, including some that swelled enough to leak.

One of the well-known aging properties of electrolytic capacitors is that the oxide dielectric can become thinner if the capacitor is stored uncharged. This results in increased leakage current and increased capacitance. This is supposed to be reversible within some minutes by charging up the capacitor, which causes the oxide layer to grow back, in a process called reforming. But swelling and rupturing goes well beyond just a reversible increase in leakage current.

Starting in the late 1990s, a new generation of low-ESR capacitors began to use water-based electrolyte. Water-based electrolytes have the benefit of low ESR, which allows higher ripple currents, but come with a risk of corrosion, because water can react with aluminum and produce hydrogen gas if corrosion inhibitors are not used correctly. This corrosion is likely the cause of the capacitor plague. The Nichicon HM is a low-ESR capacitor from the affected time period (early 2000s) and seems to be failing with similar symptoms, so it’s likely that these failures are also due to the use of a water-based electrolyte without sufficient corrosion inhibitors.

Similar symptoms have been observed before (increased capacitance and leakage before rupture), and Hillman and Helmold performed a chemical analysis of the failing capacitors, finding dissolved aluminum in the electrolyte. In addition to the known symptoms, I observed that my corroding capacitors appear to be turning into batteries and self-charging with a negative voltage, which does not appear to have been previously reported.

Capacitors tested

Nichicon HM

Nichicon HM 6.3V 1500µF, manufactured 2005 week 11.

Nichicon HM 6.3V 1500µF (2005 week 11). Made in Japan.

Nichicon is a well-known Japanese manufacturer of capacitors. Their HM series is an aluminum liquid electrolytic capacitor designed with low impedance (low-ESR) for use in PC motherboards. The HM and HN series were known for early failure. Although there were some claims that the problems were fixed in 2005, week 11 was likely still bad given how many failures I have.

Control group: Nichicon HZ and Kemet 759

I did measurements on two other capacitor models (that had no problems) to use as a control group.

Nichicon HZ 10V 1000µF (2012 week 6). Advertised as a lower-impedance version of the HN series, which is a lower-impedance version of the HM.
Kemet 759 10V 1500µF (2020 week 44). This is a solid polymer capacitor.

First attempt at basic measurements: Failed

I started by measuring the capacitance with my multimeter. This failed on all the capacitors, because the leakage was too high. A capacitance meter measures how long it takes to charge and discharge the capacitor, but if the leakage is too high, it will never charge enough to get a measurement.

The next thing to try was to charge the capacitor up to 5V through a current meter. Bad idea. The leakage current reached at least 1.5A (at least for a short moment), and the capacitor heated up and swelled very quickly. It’s good that I didn’t try powering on the motherboard, because this much leakage current could have caused some real damage.

In the remainder of the article, I visually sort the capacitors by whether they are swollen, then measure open-circuit voltage, charge up the capacitors in an attempt to reform the dielectric oxide layer, then measure capacitance and open-circuit voltage again.

Visually sorting by swelling

I sorted the capacitors visually by whether they appeared swollen. This allows later measurements to compare the differences between capacitors that have visibly swollen with those that have not.

The capacitors were sorted into three groups: “bloated”, “slightly bloated”, and “not bloated”. There were 104, 12, and 53 capacitors in the three groups, respectively. The “slightly bloated” group is for borderline cases where the capacitors don’t look swollen, but also don’t look exactly the same as the “not bloated” ones. Although 31% of the capacitors show no bloating, the later tests show that these are unusable because they do not survive being powered on without swelling and rupturing after a short time.

The photo below shows three examples from each group.

Nichicon HM capacitors sorted by bloating.
Left: Bloated. Middle: Slightly bloated. Right: Not bloated.

Open-circuit voltage

When a capacitor is left unused for many years, the expectation is that it will not be charged and have zero open-circuit voltage.

An interesting observation is that these Nichicon HM capacitors will charge themselves with tens to hundreds of millivolts, even after being discharged/shorted for a long time. Even more strangely, all of them self-charged with a negative voltage (the positive terminal of the capacitor became more negative than the negative terminal), so this is not the usual dielectric absorption that would cause the capacitor to self-charge with a positive polarity if it had been recently charged and discharged.

The following chart plots S-curves that show the distribution of the open-circuit voltage for the three bloatedness groups. Two other capacitor models were used as control groups for comparison. All capacitors have never been used.

The chart plots the open-circuit voltage measured for every capacitor in each group, sorted by voltage within the group. Each data point is one capacitor. An S-curve chart makes it easy to see the median value, how many outliers there are, and how much they vary from the median. The number of data points for each series varies depending on how many capacitors are in each group (I have more Nichicon HM capacitors than the other types).

Open-circuit voltage for the three groups of Nichicon HM, and two other capacitor models for comparison

This plot includes these capacitors:

169 Nichicon HM capacitors, sorted into bloated (104), slightly bloated (12), and not bloated (53) groups
Two models as control: Nichicon HZ (18), and Kemet 759 (10)
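For anyone unfamiliar with S-curve charts, the construction is simple enough to sketch in Python: sort each group's measurements and plot value against rank. The sample voltages below are invented purely for illustration and are not my measurements.

```python
import statistics


def s_curve(voltages_mv):
    """Return (rank, voltage) points: one point per capacitor, sorted by voltage."""
    ordered = sorted(voltages_mv)
    return list(enumerate(ordered, start=1))


# Made-up open-circuit voltages (mV) for one hypothetical group.
sample_mv = [-300.0, -262.0, -40.0, -510.0, -180.0]

points = s_curve(sample_mv)
median_mv = statistics.median(sample_mv)

print(points)
print(median_mv)  # -262.0, the middle point of the curve
```

The median sits at the middle rank of the curve, and outliers show up as the steep tails at either end.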

The median Nichicon HM capacitor had -262 mV, -186 mV, and -65 mV of open-circuit voltage in the bloated, slightly bloated, and not bloated groups, respectively. The big differences in the open-circuit voltage that correlate with the bloatedness of the capacitor suggest that this open-circuit voltage is related to the bloating.

Compared to the failing HM capacitors, both of the other models have very low voltage (about 1000x lower, at +0.1 to +0.3 mV), and have positive polarity. The positive polarity and small magnitude might be explained by a small residual amount of dielectric absorption.

An electrolytic capacitor consists of an anode (with thin oxide dielectric), a cathode, and electrolyte. A battery has a similar construction, except without the dielectric and with different materials for the two electrodes. The presence of a voltage on the capacitor makes me suspect that the electrolyte reacts with the anode/cathode and has dissolved enough of the oxide to turn the capacitor into a weak battery, which leads to corrosion of the anode/cathode, the production of gas, and the generation of a voltage.

Perhaps a capacitor’s open-circuit voltage could be an early predictor for whether the capacitor will fail by corrosion, where a capacitor that charges itself is a sign of an electrochemical reaction that leads to corrosion. Even the non-bloated HM series capacitors had significant negative voltage, while the other two models all had much smaller positive voltages. Further measurements of other capacitor models and measurements on new capacitors before corrosion begins (which is difficult to do because nobody knowingly makes corroding capacitors and advertises that fact) will be necessary to see whether self-charging is a useful indicator of corrosion on other capacitor models, and whether it shows up early enough in a bad capacitor’s life to detect corrosion in new capacitors (perhaps with high-temperature baking) before the more obvious symptoms show up.

Charging up the capacitors: Reforming

Aluminum electrolytic capacitors in storage are known to have increased leakage current due to thinning of the dielectric layer, which can be reversed by reforming. Nichicon’s application guidelines recommend charging the capacitor to its rated voltage through a resistance of 1 kΩ for 30 minutes if the capacitor has been stored for more than two years. Perhaps reforming could repair even a serious case of corroded dielectric?

The next chart plots the charging current over time when I tried charging the capacitor using 5V through a 1 kΩ resistor (note the logarithmic time scale). Because the capacitor is charging using 5V through a 1 kΩ resistor, the charge current is limited to at most 5 mA when the capacitor voltage is zero. Charging through a resistor also means the capacitor voltage is not constant, but slowly rises as the charge current decreases.
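For reference, the behaviour of an ideal (leak-free) capacitor in this setup follows the usual RC charging equations. A minimal Python sketch, assuming an ideal 1500µF capacitor that starts fully discharged:

```python
import math

V_SUPPLY = 5.0  # volts
R = 1000.0      # ohms (1 kΩ series resistor)
C = 1500e-6     # farads (1500µF)
TAU = R * C     # time constant: 1.5 seconds


def charge_current_ma(t_s):
    """Charging current in mA at time t_s for an ideal capacitor."""
    return V_SUPPLY / R * math.exp(-t_s / TAU) * 1000.0


def capacitor_voltage(t_s):
    """Voltage across the ideal capacitor at time t_s."""
    return V_SUPPLY * (1.0 - math.exp(-t_s / TAU))


for t in (0.0, 1.5, 5.0, 15.0):
    print(f"t={t:5.1f} s  i={charge_current_ma(t):.4f} mA  v={capacitor_voltage(t):.3f} V")
```

An ideal capacitor's current starts at 5 mA and is essentially zero after ten time constants (15 seconds), so any current still flowing long after that is leakage.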

Charging with 5V through a 1kΩ resistor. All of the Nichicon HM capacitors leak current abnormally, but the bloated ones leak more. Plotted for comparison is an ideal 1500µF capacitor and a measurement of a new Kemet 1500µF capacitor.

I tested 10 samples, five each from the bloated (red) and non-bloated (blue) groups. The chart shows a clear difference between the bloated and non-bloated samples, with non-bloated samples having lower leakage current that decreases more quickly. It seems that more corrosion causes more swelling and more leakage current, and the non-bloated capacitors are in better condition.

But this is far from saying that the non-bloated capacitors are behaving normally. The datasheet says leakage current should be less than 0.03CV within two minutes, which is 0.225 mA for 1500µF at 5V. The test sample that reached this value quickest took almost 8 hours, and the longest one took more than 14 hours.
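The 0.225 mA figure, and how quickly an ideal capacitor would get there, can be checked with a short calculation. This is a sketch using only the numbers in the text; the helper names are hypothetical, and the "ideal" time assumes a perfect 1500 µF capacitor with no leakage at all.

```python
import math

def leakage_limit_ma(c_uf, volts, k=0.03):
    """Datasheet-style limit I = k*C*V. With C in µF and V in volts the product is in µA."""
    return k * c_uf * volts / 1000.0           # convert µA to mA

def ideal_time_to_reach_s(i_target_ma, v_supply=5.0, r_ohm=1000.0, c_uf=1500.0):
    """Seconds for an ideal capacitor's charging current to decay to i_target_ma."""
    i0_ma = 1000.0 * v_supply / r_ohm          # initial current, 5 mA here
    tau = r_ohm * c_uf * 1e-6                  # RC time constant in seconds
    return tau * math.log(i0_ma / i_target_ma)

if __name__ == "__main__":
    limit = leakage_limit_ma(1500.0, 5.0)
    print(f"limit = {limit:.3f} mA, ideal time = {ideal_time_to_reach_s(limit):.2f} s")
```

Under these assumptions an ideal capacitor's charging current would fall below the limit in under five seconds, compared with the 8 to 14 hours measured here.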

For comparison, the theoretical charging curve of an ideal 1500µF capacitor and a measurement of a Kemet 759 1500µF capacitor are plotted (black). The measurement of the good capacitor agrees fairly well with theory. It is obvious that the non-bloated bad capacitors are far from normal.

Even worse than abnormal leakage currents, all of the non-bloated capacitors swelled while being charged, and at least several ruptured. Here is a time lapse of non-bloated sample 5, which began to swell visibly at half an hour, and ruptured at 2.5 hours. Even the capacitors that look normal are still unusable.

Time lapse video of 8.5 hours while charging non-bloated sample 5 with 5V through a 1kΩ resistor. Swelling starts at half an hour, rupturing at 2.5 hours.

Post-reforming measurements

	Bloated		Not bloated
	µF	mV	µF	mV
Sample 1	OL	-239	3171	-163
Sample 2	OL	-306	3392	-5
Sample 3	OL	-529	3531	-22
Sample 4	OL	-519	3517	-8
Sample 5	OL	-450	3224	-23

Capacitance

Charging up the capacitors for a day reduced their leakage to reasonable levels, which would allow capacitance to be measured. The capacitors were measured after sitting for a few days after charging. All of the bloated capacitors had already developed leakage currents high enough that their capacitance couldn’t be measured by my multimeter. The non-bloated ones seemed to retain the low leakage state longer, but measured over twice their 1500µF rated capacitance. The high capacitance measurement is probably due to the dielectric layer being too thin, or to corrosion having roughened the anode aluminum foil and increased its surface area.

Open-circuit voltage

The magnitude of the open-circuit voltage of the bloated capacitors increased after being charged for a day (none exceeded -500 mV before reforming, but two exceed it after), while it decreased for most of the non-bloated capacitors. I can imagine that reforming the insulating oxide layer could slow down corrosion by protecting the aluminum foil, but it seems to have made corrosion even worse for the capacitors that already started out worse.

The fact that charging the capacitors reduced the negative open-circuit voltage (a proxy for corrosion rate?) suggests that these capacitors may have survived longer had they been in use (kept charged) rather than stored.

Conclusions

When electrolytic capacitors swell, it is not necessarily due to heat, ripple current, or power-on time. The low-ESR Nichicon HM capacitors appear to be internally corroding even in room-temperature storage. Within the same batch, there is variation in how much corrosion and swelling has occurred, and the amount of swelling seems related to how much the capacitor has deteriorated (as measured by leakage current). However, even capacitors with no visible swelling are clearly unsafe to use. If they are charged to their normal operating voltage, they will have high leakage currents (possibly enough to damage other parts of the circuit), swell, and leak, even if charged slowly.

There have been various claims about the Nichicon HM failures that distance it from the rest of the capacitor plague, including the claim that the cause of failure was “overfilling” the electrolyte. My measurements here suggest that the failure mechanism is the same as the rest of the capacitor plague: Aluminum anode and cathode foils corroding in electrolyte, high capacitance, high leakage current, self-charging, gas generation, and eventual rupture. Further measurements will be necessary to see whether self-charging voltage could be used to detect corrosion before swelling begins.

It’s time to replace the capacitors on my 23-year-old motherboard a second time, this time with solid polymer capacitors.

Portland Black Lives Matter Mural

Henry — Sun, 16 Aug 2020 06:53:08 +0000

During the protests across the United States following George Floyd’s killing in May 2020, “Black Lives Matter” murals were painted on streets in various cities around the country. There was also one in Portland, Oregon, on N Edison Street (It has a website: https://www.edisonstreetmural.com/).

One interesting aspect of the mural on Edison Street is that in addition to the usual “Black Lives Matter” in large yellow letters, it also includes brief snippets of history related to the neighbourhood written into each letter. I rather like the idea. It is no longer just a slogan, but also educates and tells a story. The Edison Street mural’s website has more information on each of these stories.

The small writing in a big mural poses a challenge for photographing it. An aerial photo from a drone would capture the entire mural, but would be too far to see the writing. Photographing up close can capture the writing, but is too close to see the big picture. I thought it would be possible to take many pictures up close then stitch them together, to get both the overall picture and details in a single image. Below is the result — A 2.7-gigapixel image.

This composite image was taken over two separate days (2020 June 28 and July 25), because there were some changes to the mural after my first set of photos. Due to laziness, I only photographed the modified portions on July 25. I also added some more of the grass above and below the main text during my second visit. In the final composite, there are 183 photos from June 28, and 434 photos from July 25.

The panorama is a composite of photos taken June 28 (green, 183 photos) and July 25 (purple, 434 photos)

Photographing and Stitching the Panorama

Photos were taken with a Canon S95 (10 megapixel), hand-held, pointing approximately down. The photos are taken with enough overlap between photos to avoid any gaps, even after masking out the region blocked by my feet. Below is an example of 30 photos from a small portion of the mural.

The photos were then stitched using Hugin. This took a lot of effort (weeks). Control points (pairs of points on different images that mark the same real-world point) are added between pairs of overlapping images, then an optimizer (gradient-descent based?) is run to find a transformation of each photo to make the control points line up as closely as possible.
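To make the idea of "lining up control points" concrete, here is a minimal sketch for a single pair of images. It fits a rotation, scale, and shift that maps control points in one photo onto the matching points in the other by least squares. This is not Hugin's actual optimizer, which adjusts many images and lens parameters together; the function names and the sample points are made up for illustration.

```python
def fit_similarity(src, dst):
    """Least-squares similarity transform mapping src control points onto dst.

    Points are (x, y) tuples. Returns complex numbers (a, b) such that
    dst ~= a * src + b, where a encodes rotation and scale and b the shift.
    """
    n = len(src)
    p = [complex(x, y) for x, y in src]
    q = [complex(x, y) for x, y in dst]
    p_mean = sum(p) / n
    q_mean = sum(q) / n
    # Complex linear regression: minimise sum |q - (a*p + b)|^2.
    num = sum((qi - q_mean) * (pi - p_mean).conjugate() for pi, qi in zip(p, q))
    den = sum(abs(pi - p_mean) ** 2 for pi in p)
    a = num / den
    b = q_mean - a * p_mean
    return a, b

def rms_error(src, dst, a, b):
    """Root-mean-square distance between transformed src points and dst points."""
    errs = [abs(a * complex(*s) + b - complex(*d)) ** 2 for s, d in zip(src, dst)]
    return (sum(errs) / len(errs)) ** 0.5

if __name__ == "__main__":
    photo_a = [(0, 0), (1, 0), (0, 1)]   # hypothetical control points in one photo
    photo_b = [(3, 4), (3, 6), (1, 4)]   # the same real-world points in another photo
    a, b = fit_similarity(photo_a, photo_b)
    print(a, b, rms_error(photo_a, photo_b, a, b))
```

With real photos the control points never agree perfectly, so the remaining RMS error is what shows up as a visible seam.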

While stitching panoramas of a few tens of pictures is easy, long panoramas more than a hundred photos wide are more difficult. Errors in aligning adjacent images accumulate over the length of the panorama, so the panorama does not stay straight; I added artificial constraints to straighten it. There are also small alignment mismatches between images. When the mismatches occur within the small text, they are very noticeable and require manual adjustment. On top of that, just the sheer size of the panorama means there are many images to stitch (over 600), with many control points (about 37,000), and every operation is slow.

After seeing the mural in the news, I went to take pictures without yet knowing how to stitch them. This resulted in taking a set of photos that were harder to stitch than necessary.

Adjacent photos should have a large (50%?) overlap. Stitching photos requires finding control points, and the automatic algorithms to do this don’t work when the overlap is too small. For the first set of 285 photos, I ended up adding most of the control points by hand because there was not enough overlap. Having realized my mistake, the additions on July 25 had more overlap, with 305 new photos to replace an area about one-third of the original, which allowed most of the control points to be found automatically. Sadly, I made the same mistake again when photographing the top and bottom edges (139 images) thinking that I wouldn’t care much if the grass and gravel areas didn’t align as well as in other parts of the mural. That may have been true, but I had to add all the control points manually again.
If there are small regions that would be ugly if misaligned, try to include the entire section in one photo to avoid or reduce the number of seams in highly-noticeable areas. If a paragraph of text can fit entirely within one image, the seam can be outside the text and be less noticeable. Failing that, a seam between two lines of text is less noticeable than a seam that cuts in between words of the same line, which is better than cutting between letters of the same word.

Elections 2019: Who is really getting a tax cut?

Henry — Sat, 19 Oct 2019 09:22:28 +0000

It’s Canadian federal election season again, and that means party platforms and arguments about tax cuts, deficits, and how much each campaign proposal will cost. As usual, the Office of the Parliamentary Budget Officer (PBO) provides independent estimates of the cost of campaign proposals.

But the cost to government is only one half of the equation. To whom does this money go?

Why might this matter? Maybe you think the increasing income inequality in our society is a problem. Or maybe you think our society needs more income inequality to incentivize people and capital to be more productive. In any case, I think it’s interesting to see who these election tax policy proposals are really benefitting or costing. After all, almost everyone can be “middle class” if the definition is sufficiently vague.

In this article, I look at personal income tax proposals from four parties (Conservative, Liberal, NDP, and People’s Party), and how the benefit or cost is distributed among people at varying income levels. I try to emulate the methodology the PBO uses in its cost estimates, and then break down the financial impact by income group (a “distributional analysis”). Both the PBO and I use Statistics Canada’s Social Policy Simulation Database and Model (SPSD/M) to model tax system policy. And according to the SPSD/M license agreement, I must make the following statement:

This analysis is based on Statistics Canada’s Social Policy Simulation Database and Model. The assumptions and calculations underlying the simulation results were prepared by Henry Wong and the responsibility for the use and interpretation of these data is entirely that of the author.

Summary

	Fed. Govt. Cost (2023, $M)	Fed. Govt. Cost (PBO FY 2023-24, $M)	50% of impact shared by top x% individuals
Conservative Party	5748	5890 (link)	27%
Liberal Party	5390	5634 (link)	36%
New Democratic Party	-1279	-964 (link)	0.1%
People’s Party	37899	n/a	6%
Green Party	Too complicated	(link)

Of the four proposals, three are tax cuts (positive cost to government), while the NDP proposes a tax increase (negative cost). The table above includes the PBO estimates as a sanity check. My estimate of the cost to the federal government should be close to PBO’s estimate. I chose to look only at year 2023, which is the year when the Conservative income tax rate reduction is fully phased-in. The People’s Party does not appear to have submitted any estimate requests to the PBO.

The Conservative and Liberal proposals are surprisingly similar. Both offer about $5.5B/yr of tax cuts, targeted at roughly the highest-income 40%–50% of individuals. The Conservative proposal gives more money to fewer people than the Liberal proposal ($340 to the top 30% of individuals vs. $250 to the top 50%), but neither proposal provides much benefit to the lower third of the income distribution.

The NDP proposes increasing the top tax bracket rate from 33% to 35%. Very few people have enough individual income to be affected by the top tax bracket, so half the burden of this tax increase falls on the highest-income 0.1% of individuals.

The People’s Party proposes an increase to the Basic Personal Amount, lower tax rates, and elimination of personal capital gains tax. The net result is a high cost ($38 billion/year), with half of the money going to the top 6% of income earners.

Methodology

I used Statistics Canada’s SPSD/M to estimate the static impact of personal income tax policy changes. I break up the population into income groups, then plot a graph showing the financial impact to each income group.

For each proposal, I plot four graphs. The first two are based on individual income, while the last two are based on per-person family income. In all graphs, the horizontal axis is a ranking of persons, sorted by (pre-tax) income: Low-income on the left, high income on the right. The first set of graphs uses individual income, with non-tax filers excluded (31.6M estimated tax filers in 2023). The second set sorts persons (including non-tax filers) by the average per-person income of their family, because families typically share income among all family members. In this model, whether a child with no income is a “low-income” or “high income” individual depends on family income, not the child’s individual income. A single person with $1000 income is ranked the same as both people in a family of two that has $2000 total family income.
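The family-income ranking can be sketched as follows. This is an illustration only, not SPSD/M code: the data layout (one list of individual incomes per family) and the function name are assumptions, and a real run would also have to apply the database's sample weights.

```python
def rank_by_family_income(families):
    """Rank every person by the average per-person income of their family.

    families: list of lists, each inner list holding the individual incomes
    of one family's members. Returns (per_person_income, own_income) tuples
    sorted from lowest to highest per-person family income.
    """
    people = []
    for members in families:
        per_person = sum(members) / len(members)
        # Every member gets the family average, whatever their own income is.
        people.extend((per_person, own) for own in members)
    return sorted(people, key=lambda person: person[0])

if __name__ == "__main__":
    # Hypothetical families: a single person, a couple, and a family of three.
    families = [[1000], [2000, 0], [90000, 0, 0]]
    for per_person, own in rank_by_family_income(families):
        print(f"per-person family income {per_person:8.0f}, own income {own:6.0f}")
```

In this example the single person with $1000 and both members of the $2000 couple land at the same rank, while the two zero-income members of the third family are ranked as high-income.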

Within each pair, the first graph shows the average change in “consumable” (post-tax) income per person in each income group. This graph shows how much money each person gets (or pays) as a result of the policy change. The second graph shows the cumulative cost to the federal government, with the same horizontal axis as the first graph. The vertical axis is scaled to the total cost of the proposal. This graph can be used to see how the pool of money is distributed between different income groups. The second graph is close to, but not exactly the same as, summing over the first graph, because changes in federal income tax policy affect the amount of other taxes (provincial and federal sales taxes) collected.
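The cumulative graph, and the "50% of impact shared by top x%" column in the summary table, boil down to a running sum over persons sorted by income. A minimal sketch, assuming one unweighted number per person and entirely made-up sample values (the real analysis works on weighted SPSD/M records):

```python
def cumulative_share(impacts):
    """Running total of per-person impacts, as fractions of the overall total.

    impacts must already be sorted from lowest-income to highest-income person.
    """
    total = sum(impacts)
    running, shares = 0.0, []
    for value in impacts:
        running += value
        shares.append(running / total)
    return shares

def top_fraction_holding_half(impacts):
    """Smallest fraction of persons, counted from the highest-income end,
    who together account for at least half of the total impact."""
    total = sum(impacts)
    running = 0.0
    for count, value in enumerate(reversed(impacts), start=1):
        running += value
        # abs() lets the same code handle a tax increase (negative impacts).
        if abs(running) >= abs(total) / 2:
            return count / len(impacts)
    return 1.0

if __name__ == "__main__":
    # Hypothetical per-person tax savings for ten people, lowest income first.
    savings = [0, 0, 0, 0, 100, 200, 300, 340, 340, 340]
    print(cumulative_share(savings))
    print(top_fraction_holding_half(savings))
```

For the made-up list above, the three highest-income people out of ten receive more than half of the total, so the function reports 0.3.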

Detailed Results

Conservative Party of Canada: Personal Income Tax Rate Reduction – 32644536

This policy lowers the tax rate of the first income tax bracket from 15% to 13.75%, along with lowering the value of non-refundable tax credits from 15% to 13.75%. (In SPSD/M, these are the FTX and FNTCR parameters.)

Individual Income

Although “lowest tax bracket” sounds like it would benefit most people, in fact, nearly all of the benefit goes to the highest-income half of the tax-filing population. There are many tax filers with low income, and only taxpayers with taxable income beyond the upper end of the first tax bracket ($51667 in 2023) will see the full benefit of the tax cut. High-income individuals still see a benefit on the portion of their income that falls within the lowest tax bracket, so the per-person benefit flattens out at around $340/year at high incomes.
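The shape of this curve (zero at low incomes, rising, then flat) can be reproduced with a deliberately simplified model. The sketch below assumes a filer whose only non-refundable credit is the basic personal amount ($13092 in 2023, the figure quoted in the Liberal section of this post); the function names are made up. Because it ignores every other credit and deduction, it should not be expected to reproduce the roughly $340/year plateau from the simulation.

```python
FIRST_BRACKET_TOP = 51667.0        # upper end of the first bracket in 2023 (from the text)
BASIC_PERSONAL_AMOUNT = 13092.0    # 2023 value quoted elsewhere in this post

def first_bracket_tax(taxable_income, rate):
    """Simplified federal tax on first-bracket income.

    Assumes the only credit is the basic personal amount, valued at the same
    rate as the first bracket, and that the credit cannot push tax below zero.
    """
    bracket_income = min(taxable_income, FIRST_BRACKET_TOP)
    return max(0.0, rate * (bracket_income - BASIC_PERSONAL_AMOUNT))

def rate_cut_benefit(taxable_income, old_rate=0.15, new_rate=0.1375):
    """Tax saved when both the bracket rate and the credit rate are lowered."""
    return first_bracket_tax(taxable_income, old_rate) - first_bracket_tax(taxable_income, new_rate)

if __name__ == "__main__":
    for income in (10000, 30000, 51667, 100000, 500000):
        print(f"{income:>7}: {rate_cut_benefit(income):7.2f}")
```

In this toy model the saving is zero below the basic personal amount, grows with income through the first bracket, and is identical for everyone above $51667, which is the same flattening seen in the graph.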

The cumulative chart shows that the lowest 50% of the tax-filing population only share about 12% of the new government “spending”.

Family Income

Families often consist of both high-income and low-income members (e.g., working parents and zero-income children), and the wealth of each member depends more on family income than on the individual’s own income. For example, a tax policy that targets “low income people” doesn’t necessarily need to target low-income individuals (such as a child in a wealthy family), but should target individuals in low-income families. When sorting by per-person family income, the graph is more spread out horizontally, reflecting the diversity of families: There are single-person families, and multi-person families where there can be zero, one, or more income earners.

When plotted by family income, the trend is the same, but less extreme. The tax cut still mainly helps families with high income. The lowest-income half of people share about 20% of the new government spending.

Liberal Party of Canada: Basic Personal Amount Increase – 33121046

This policy increases the Basic Personal Amount (non-refundable tax credit) from $13092 to $15000 for those with taxable income below $160137 ($150605 in 2020 scaled by CPI inflation until 2023), and to $14046 for taxable income below $228136 (scaled from $214557 in 2020). In SPSD/M, the BXM parameter was changed. Although the PBO used SPSD/M in “glass box” (custom code) mode, I achieved the same result by running the simulation in “black box” mode three times (with different BXM values for taxable incomes falling into each of the three ranges) and summing the results.
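The three-range treatment described above can be sketched as a small function. This is an illustration of the stepped approximation described in this post, not SPSD/M code. It assumes the credit is worth 15% of the amount and that the filer owes enough tax to use the whole credit, so the result is an upper bound on the saving. All names are hypothetical.

```python
OLD_BPA = 13092.0      # 2023 basic personal amount before the change (from the text)
CREDIT_RATE = 0.15     # value of non-refundable credits

# (lower bound, upper bound, basic personal amount) for the three simulation runs
RANGES = [
    (0.0, 160137.0, 15000.0),
    (160137.0, 228136.0, 14046.0),
    (228136.0, float("inf"), OLD_BPA),
]

def proposed_bpa(taxable_income):
    """Basic personal amount that applies at a given taxable income."""
    for low, high, bpa in RANGES:
        if low <= taxable_income < high:
            return bpa
    raise ValueError("taxable income must be non-negative")

def max_benefit(taxable_income):
    """Upper bound on the federal tax saved by one filer."""
    return CREDIT_RATE * (proposed_bpa(taxable_income) - OLD_BPA)

def total_from_three_runs(incomes):
    """Sum the result range by range, mirroring three separate runs."""
    total = 0.0
    for low, high, bpa in RANGES:
        in_range = [z for z in incomes if low <= z < high]
        total += len(in_range) * CREDIT_RATE * (bpa - OLD_BPA)
    return total

if __name__ == "__main__":
    sample = [40000.0, 170000.0, 300000.0]   # made-up taxable incomes
    print([max_benefit(z) for z in sample], total_from_three_runs(sample))
```

Under these assumptions the saving is at most about $286 in the lowest range, about $143 in the middle range, and zero above it.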

Individual Income

Increasing the Basic Personal Amount benefits more people than decreasing the tax rate of the lowest tax bracket because a taxpayer requires less income before getting the full benefit. One difference in the Liberal proposal is that the increased tax credit does not apply to those with high incomes: the top 5% of income earners see reduced or no benefit from this proposal.

Family Income

About 2/3 of the benefits go to the highest-income half of families, though the top 4% of families see a reduced benefit.

This illustrates the difficulty of targeting the lowest-income third of the population using tax policy. This income group pays no or little tax, so they are unaffected by tax rate or tax credit changes.

New Democratic Party: Change in federal tax rate for high income earners from 33% to 35% – 32630416

The NDP proposed increasing the tax rate of the top tax bracket from 33% to 35%. (In SPSD/M, this is the FTX parameter.) Since the top tax bracket affects so few people, I plotted graphs with the horizontal axis starting at 0.9 (top 10% income) to keep the graphs readable.

Individual Income

There is essentially zero impact to about 98.5% of tax filers because they do not have enough income to reach the top tax bracket.

The PBO analysis of the cost to government includes an effect for an Elasticity of Taxable Income for high-income earners of 0.38. The elasticity is an estimate of how much the reported taxable income changes when the marginal tax rate changes, which tends to decrease the revenue generated when the tax rate is increased. For example, increasing the marginal tax rate might encourage people to work less (leisure is worth relatively more), or increase tax avoidance efforts or tax evasion, all of which reduce the reported taxable income.

I attempted to use the same methodology to account for ETI, but could not get a number similar to the PBO’s (I got $1279M in calendar year 2023 including ETI, while the PBO analysis expects $964M in fiscal year 2023-24). If anyone knows how the PBO arrived at their estimate, I’d like to know.

I plotted the increase in federal government revenue including ETI in red (the rest of the plots assume the tax base doesn’t change in response to tax rate, or ETI=0). Because the ETI is assumed to be fairly high for high-income earners (the PBO assumes 0.38 for high-income, but 0.10 for low income), and because of the large amounts of per-person income involved, the ETI effect was not negligible for this policy. Paradoxically, there are some taxpayers (those near the lower bound of the highest tax bracket) where increasing the tax rate actually causes a decrease in tax revenue collected, because the behavioural effect (reducing taxable income) exceeds the amount generated by the higher tax rate (+2% on income above $228,200).
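The paradox near the bottom of the top bracket can be illustrated with a toy calculation. This is a sketch under strong assumptions, not the PBO's method or the one used for the graphs: it treats the ETI as the percentage change in taxable income per percentage change in the net-of-tax rate (1 minus the marginal rate), uses the federal rate alone (ignoring provincial tax), applies the response to the person's whole income, and counts only tax collected in the top bracket. The function name is made up.

```python
THRESHOLD = 228_200.0   # lower bound of the top bracket, as quoted above

def top_bracket_revenue_change(income, eti=0.38, old_rate=0.33, new_rate=0.35):
    """Change in top-bracket federal tax from one taxpayer after the rate increase."""
    if income <= THRESHOLD:
        return 0.0                       # marginal rate unchanged, so no response
    # Percentage change in the net-of-tax rate (negative for a tax increase).
    pct_change_net = ((1.0 - new_rate) - (1.0 - old_rate)) / (1.0 - old_rate)
    # Behavioural response: reported taxable income shrinks in proportion to the ETI.
    new_income = income * (1.0 + eti * pct_change_net)
    old_tax = old_rate * (income - THRESHOLD)
    new_tax = new_rate * max(new_income - THRESHOLD, 0.0)
    return new_tax - old_tax

if __name__ == "__main__":
    for income in (250_000, 300_000, 1_000_000):
        print(f"{income:>9}: with ETI {top_bracket_revenue_change(income):10.2f}, "
              f"without {top_bracket_revenue_change(income, eti=0.0):10.2f}")
```

In this toy model, someone reporting $250,000 yields a loss of roughly $560 after the rate increase, while someone reporting $1,000,000 yields a gain of roughly $11,500. The higher rate only applies to income above the threshold, but the behavioural response scales with total income, so it dominates near the threshold.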

Family Income

The trends are the same: The burden of the tax increase is almost entirely carried by the top few percent of families, with almost no effect on everyone else.

People’s Party of Canada: Cutting Income Taxes and Abolish Personal Capital Gains Tax

The People’s Party proposes several major changes to the income tax system:

Increase the basic personal amount to $15000 (SPSD/M parameter BXM)
Decrease tax rates to 15% for the first $100,000 of income, and 25% for the rest (SPSD/M parameter FTX)
Phase out the personal capital gains tax (SPSD/M parameter CAPGIR=0)

This tax policy can be seen as approximately the opposite of the NDP proposal. It proposes a large government expense that mostly benefits the highest-income few percent of the population.

Individual Income

The total cost of the proposal (neglecting ETI) is about $38 billion in 2023, or about 7× bigger than the Liberal or Conservative tax cuts. Unlike all of the other proposals, the benefits overwhelmingly go to the highest-income individuals: Half the benefit goes to the top 6% of individuals.

Most of the cost ($31B) is due to the income tax cut. Eliminating the capital gains tax costs another $7B on top of that. Eliminating the capital gains tax on its own would have cost $13.1B without the income tax cut, because the income tax cut reduces the value of also removing the capital gains tax.

As one might expect, capital gains tax is mainly paid by those with a large amount of capital to invest. About 40% of the benefit of eliminating the capital gains tax goes to the top 2% of high-income individuals.

Family Income

The story is the same when looking at family income per person instead of individual income. About half the benefit of the tax cuts goes to the top 7% highest-income families.

Green Party: Raising the Inclusion Rate of Capital Gains for taxation – 32631069

The Green Party’s proposals that affect personal income tax are too complicated for me to model. See the PBO’s estimate for a description of the policy proposal. The PBO’s model isn’t simple, contains some questionable assumptions, and the PBO thinks their estimate has a “high uncertainty” as a result.

Income Distribution

All of the above graphs used persons sorted by income on the horizontal axis, rather than income. How does this map to income? The following two graphs show the pre-tax income distribution (in 2023), with the first one sorted by individual income (excluding non-tax filers) and the second sorted by family income per person. (The SPSD/M variable is imictot.)

The median tax-filing individual has about $46,000 of pre-tax income, while the median family has about $36,000 of pre-tax income per person.

Thoughts

I think a moderate decrease in income inequality would be a good thing. Thus, I prefer tax policy changes that don’t predominantly benefit high-income individuals.

Both the Conservative and Liberal proposals provide a moderate tax cut for the high-income 40%–50% of the population. While these may not reduce income inequality, I think this is partly forgivable. It’s hard to design a tax cut (which attracts votes) that also targets low-income individuals (who already pay little tax and aren’t affected by income tax changes).

In contrast, the People’s Party’s large tax cuts are clearly aimed at increasing the income of a small fraction of high-income individuals.

The NDP’s proposal to increase the tax rate of the highest tax bracket does move in the direction of reducing income inequality. But the increased tax burden so narrowly targets the top 0.1%–0.5% of income earners that it risks being seen as punitive, rather than useful.