Depstech DW49 Webcam — Teardown and Measurements

The Depstech DW49 is a low-cost webcam that claims to record 4K video at 30 fps. I recently bought a “used” one on eBay for $30 to find out what’s inside by disassembling it and taking some measurements.

“4K” cameras at this price point often use a lower-resolution image sensor and then scale the image up to 4K, resulting in video with the right number of pixels but a blurry picture. The DW49 also upscales its 4K video (8.3 MP) from a lower-resolution sensor region (5.2 MP), but the gap between the claimed and actual resolution is smaller than in some other “4K” cameras I’ve seen, and the result is still a step up from Full HD (2.1 MP). Overall, not quite as advertised, but not bad for the price.
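
The megapixel figures above can be checked with a little arithmetic. The sketch below assumes that “4K” means 3840×2160 and Full HD means 1920×1080 (both match the 8.3 MP and 2.1 MP quoted above), and that the 5.2 MP sensor region has the same aspect ratio as the output frame. The function and variable names are illustrative only.

```python
import math

def megapixels(width: int, height: int) -> float:
    """Pixel count of a frame, in millions of pixels."""
    return width * height / 1e6

uhd_mp = megapixels(3840, 2160)   # assumed "4K" frame size
fhd_mp = megapixels(1920, 1080)   # Full HD frame size
sensor_region_mp = 5.2            # figure quoted in the text above

# Per-axis upscale factor, assuming the sensor region and the output
# frame share the same aspect ratio (an assumption, not a measurement).
linear_scale = math.sqrt(uhd_mp / sensor_region_mp)

print(f"4K: {uhd_mp:.1f} MP, Full HD: {fhd_mp:.1f} MP")
print(f"Linear upscale from a 5.2 MP region: about {linear_scale:.2f}x")
```

Under those assumptions, each axis is stretched by roughly a quarter, compared with a factor of two when Full HD is upscaled to 4K.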

. . . → Read More: Depstech DW49 Webcam — Teardown and Measurements

Capacitors in storage can get the plague too

Capacitor plague refers to defective capacitors that generate gas, swell, and rupture with age. I bought some Nichicon HM series electrolytic capacitors (made in 2005) to replace plague-afflicted capacitors on an old PC motherboard. These replacement capacitors (apparently also defective) have now swollen as well, and some have ruptured. Out of 169 “new” capacitors still in the bag, almost 70% were swollen, despite never having been used…

So the next logical thing to do is to experiment on them: Take some measurements, charge them up, and see how the capacitors respond, why they failed, and whether any of them are still usable. (Spoiler: None are still usable.)

. . . → Read More: Capacitors in storage can get the plague too

Discovering Hard Disk Physical Geometry through Microbenchmarking

Modern hard drives store an incredible amount of data in a small space, with billions of sectors (with thousands of defects) packed into hundreds of thousands of tracks spaced tens of nanometers apart, arranged onto a stack of platters spinning at high speed. Which drive characteristics can be discovered using microbenchmarks?

This article describes several microbenchmarks that try to extract the physical geometry of hard disk drives, and a few other related measurements, including rotation period, the physical location of each sector, track boundaries, skew, seek time, and locations of defective sectors. I use these microbenchmarks to characterize a variety of hard drives from 45 MB (1989) to 5 TB (2015).
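
As a rough illustration of how a rotation period might be extracted from timing data, the sketch below assumes that back-to-back uncached reads of the same sector each complete one revolution apart. That is an assumption made for illustration, not necessarily the method used in the article. The function names are hypothetical and the timestamps are synthetic, not measurements.

```python
from statistics import median

def estimate_rotation_period_ms(completion_times_ms):
    """Median gap between successive completions of reads to one sector."""
    gaps = [b - a for a, b in zip(completion_times_ms, completion_times_ms[1:])]
    if not gaps:
        raise ValueError("need at least two timestamps")
    return median(gaps)

def rpm_from_period_ms(period_ms):
    """Convert a rotation period in milliseconds to revolutions per minute."""
    return 60_000.0 / period_ms

# Synthetic timestamps for a nominal 7200 RPM drive (not measured data),
# with a small amount of jitter added to one sample.
synthetic = [i * (60_000.0 / 7200) for i in range(6)]
synthetic[3] += 0.05

period = estimate_rotation_period_ms(synthetic)
rpm = rpm_from_period_ms(period)
print(f"period = {period:.3f} ms, about {rpm:.0f} RPM")
```

Using the median rather than the mean keeps a single delayed read from skewing the estimate.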

. . . → Read More: Discovering Hard Disk Physical Geometry through Microbenchmarking

The Microarchitecture Behind Meltdown

Since the recent (Jan. 2018) disclosure of the Meltdown vulnerability, there has been a lot of interest, speculation, and hysteria, but not a particularly good understanding of the processor microarchitecture feature responsible for it. Understanding the root cause of the vulnerability makes it possible to see why only some microarchitectures are affected, and to reliably test for the existence (or, even harder, the non-existence) of the vulnerability on various processors, instead of relying solely on vendor self-reporting (or worse, speculation…).

This article first defines the microarchitectural mechanism that allows Meltdown to work, then develops a microbenchmark to specifically test for this behaviour on multiple microarchitectures.

. . . → Read More: The Microarchitecture Behind Meltdown

PET is hygroscopic: Water diffuses out of a Sprite bottle

Polyethylene terephthalate (PET) is a plastic commonly used to make beverage bottles. PET is hygroscopic. Therefore, if you have a bottle of Sprite, you would expect the water to be absorbed by the plastic, diffuse through the bottle, then evaporate outside the bottle.

And that’s exactly what happens.

. . . → Read More: PET is hygroscopic: Water diffuses out of a Sprite bottle

Microbenchmarking Return Address Branch Prediction

Modern processors use branch predictors to predict a program’s control flow in order to execute further ahead in the instruction stream. Function return instructions use a specialized branch predictor called a Return Address Stack (RAS), Return Stack Buffer (RSB), return stack, or other various names. This article presents a series of increasingly complex microbenchmarks to measure the behaviour of the RAS found in several Intel and AMD processor microarchitectures. . . . → Read More: Microbenchmarking Return Address Branch Prediction
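
To illustrate the idea of a return address stack, here is a toy model, not a description of any real processor. It assumes a fixed-capacity stack that silently drops its oldest entry on overflow and offers no prediction when empty; real hardware may behave differently in both cases. The class and function names are hypothetical.

```python
from collections import deque

class ReturnAddressStack:
    """Toy RAS: fixed capacity, oldest entry is dropped on overflow."""

    def __init__(self, capacity):
        self.entries = deque(maxlen=capacity)

    def on_call(self, return_address):
        # A call pushes the address of the instruction after the call.
        self.entries.append(return_address)

    def predict_return(self):
        # A return pops the most recent entry; None means "no prediction".
        return self.entries.pop() if self.entries else None

def count_mispredictions(capacity, call_depth):
    """Nest call_depth calls, then return from all of them."""
    ras = ReturnAddressStack(capacity)
    actual = []  # the true return addresses, innermost last
    for depth in range(call_depth):
        address = 0x1000 + 4 * depth
        ras.on_call(address)
        actual.append(address)
    mispredicted = 0
    while actual:
        if ras.predict_return() != actual.pop():
            mispredicted += 1
    return mispredicted

print(count_mispredictions(capacity=16, call_depth=10))
print(count_mispredictions(capacity=16, call_depth=20))
```

In this model, nesting deeper than the stack capacity causes one misprediction per lost entry, which is the kind of step change a microbenchmark can look for when sweeping call depth.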

Cyclone/Stratix V Carry-Lookahead…bug?

New in the Cyclone V and Stratix V families, there seems to be carry lookahead at the LAB level (10 ALMs or 20 bits of addition), which can be seen by looking at the timing path through a big adder. But if there are two unrelated adders in a circuit, the carry lookahead is disabled if the length difference is near a multiple of 10, causing 70% higher delay for long adders. Bug? . . . → Read More: Cyclone/Stratix V Carry-Lookahead…bug?

TLB and Pagewalk Coherence in x86 Processors

Processors that support paging use TLBs to cache translations. On x86, translation caches are not coherent and require software to explicitly invalidate a TLB entry after updating a page table entry. Similarly, pagewalks are not guaranteed to be coherent, so modifying a page table entry must be followed by an invalidation even if the page table entry is not cached in the TLB.

Real processor implementations do not provide TLB coherence, but it turns out many (but not all) processors actually do provide pagewalk coherence. Most provide pagewalk coherence by detecting when a page table entry update conflicts with a pagewalk’s memory accesses, but some provide coherence by disallowing speculative pagewalks, at some performance cost. I show a microbenchmark that can test for TLB and pagewalk coherence and whether speculative pagewalks are used.

. . . → Read More: TLB and Pagewalk Coherence in x86 Processors

Store-to-Load Forwarding and Memory Disambiguation in x86 Processors

In pipelined processors, instructions are executed speculatively and are not permitted to modify system state until instruction commit. For stores to memory, speculative stores write into a store queue at execution time and only write into cache after the store instructions have committed. Out-of-order memory execution requires hardware that learns dependencies between stores and loads, and also the ability to forward stored values from the store queue to loads that depend on them. I describe two variations of a microbenchmark that can measure some aspects of store-to-load forwarding and the memory execution hardware. These showed that AMD’s Bulldozer and Piledriver processors likely do not use a dynamic memory dependence predictor. They were also used to generate interesting 2D charts that can reveal some details about how the memory execution hardware might be designed. . . . → Read More: Store-to-Load Forwarding and Memory Disambiguation in x86 Processors

AMD Bulldozer/Piledriver Modules and Hyper-Threading

Ever since Intel’s Hyper-Threading and AMD’s Bulldozer modules, there has been much debate on what qualifies as a real CPU “core”. Unfortunately, I don’t think “core” is easy to define, so marketing tends to name things for its own benefit. In the end, it’s the performance that matters, not the name. Two-way Hyper-Threading gives around 23% improvement over one thread, while two-way multithreading in a “module” gives 54%. This is still quite far from the >90% that replicating the entire CPU core would achieve . . . → Read More: AMD Bulldozer/Piledriver Modules and Hyper-Threading
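
To put the percentages above in per-thread terms, the short calculation below converts a two-thread throughput gain into the throughput each thread gets relative to running alone. It assumes the two threads share the combined throughput equally, which is a simplification. The 90% entry is the lower bound quoted above for fully replicated cores. The names are illustrative.

```python
def per_thread_throughput(gain_over_one_thread):
    """Throughput of each of two threads, relative to one thread running alone.

    Assumes the two threads split the combined throughput equally.
    """
    combined = 1.0 + gain_over_one_thread
    return combined / 2

configurations = {
    "Hyper-Threading (2 threads, 1 core)": 0.23,
    "Bulldozer/Piledriver (2 threads, 1 module)": 0.54,
    "Two full cores (>90% lower bound)": 0.90,
}

for name, gain in configurations.items():
    share = per_thread_throughput(gain)
    print(f"{name}: each thread runs at {share:.1%} of single-thread speed")
```

Seen this way, each thread in a module keeps noticeably more of its single-thread performance than a Hyper-Threading sibling does, but still less than it would on a fully separate core.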