TideLog Archive for the “Work” Category
Today was another tinker day for me, and this time another YouView box from Humax. The T2100 replaced the ageing T1000 which was beset with issues, notably power supply issues with bad capacitors, as well as HDMI handshake problems, and just general reliability and use issues, such as recording failures (attributed to PSU issues, capacitors in the HDD 12v feed rail going dry or high ESR), freezing, and refusal to power on.
Most of the problems were down to the built in PSU which was fully onboard. The latest boxes, the T2100, T2110 and the newer 4K T4000 boxes are much smaller, and now use an external PSU brick, as well as smaller 2.5″ SATA HDD from a laptop, allowing Humax to shrink it massively. I have actually repaired a few in the past, mainly HDD failures due to 24/7 use and the Bathtub curve of HDD reliability being so unpredictable, but it’s the first time on TideLog for me to show you the wonderful neat innards, and much improved electronic design!
The only thing that’s needed is a Phillips screwdriver. There are 4 screws on the bottom (self tappers into plastic, eurgh!), one of them under a warranty sticker (those things are just BEGGING to be peeled off!). Remove those and the machine screw above the SCART socket, and off pops the cover!
I love the inside of these. I just love a neat circuit board, they’re a beautiful work of art in their own right. So, bottom left is the hard drive, a standard 2.5″ 500GB AV grade HDD. Mine has a Western Digital AV-25 WD5000LUCT, which is designed for CCTV and PVR use, so is fine for 24/7 use. 4 screws underneath hold it in place, and the motherboard has to come out to remove it as it is actually screwed into the mainboard, not the chassis. One nice thing is that Humax have used rubber bumpers with the screws to shield it from knocks (not dropping it down stairs, as some of my repaired ones have been!) and also so that the screws are not in direct contact with the PCB.
To the bottom left of the HDD sits one of the box’s two USB ports, with the other being on the rear under the Ethernet port. To the right of the hard drive is what I assume to be the 12v regulator choke for the HDD, digitally controlled. To the right of that are the 4x1GB Samsung RAM chips, making a total of 4GB, which gives the box its stability during marathon record-2-programmes-watch-another stints. It saves the box constantly caching everything to the HDD when rewinding, pausing or fast forwarding live TV, which, while recording two programmes and watching a third, would cause buffer issues, glitched recordings and a massive load on the HDD.
Above the RAM is the CPU (possibly an ARM 2 or 4 core, I’ve never looked under the heatsink in one), under the big finned heatsink. Unlike the T1000, it does not have a noisy fan cooling it down. It is fully passive, with the heat rising naturally out of the vents in the top cover. To the right of the RAM is the digitally controlled, beautifully intricate (shucks, I’m a true nerd!) 3 phase power regulation system. This splits up and regulates the 12v from the external brick into all the voltages required by the components, including the CPU, which requires an ultra stable, clean and spike free supply. The HDD 12v regulator is fed from this system too.
The Broadcom Ethernet & dual-tuner control chips are to the top of these. This area also contains the white harness connector for the top cover switch block wiring which you can just see in the top of the image above. The switch block in the top cover is nothing special, just a PCB with switches soldered on 😉
There are still a few of the dodgy SamYoung (which brand does that sound like?) Chinese electrolytic capacitors in these units as there were with the old T1000 series. Luckily in the new T21xx series there are only 3 (the T1000 had 10), the rest are solid state, the ones in my unit are fine but we’ll have to see how long it is before they get binned for Japanese RubyCon YXF ones! There are two in the voltage regulation section, and one in the top left of the box below the DC IN switch/plug socket block that will likely be the first electrical repair I do 🙂
There’s not much info on the chips in these, I couldn’t make out the model numbers and a schematic is not in the wild. A future update to this article will be a teardown of the external brick, me using my magnifier (when I find it) to identify the chips and CPU, and exactly where that HDD regulator goes, if it is even for the HDD…
I had the misfortune of my tablet getting damaged this week. I was working in Rikku’s bus garage, with it resting on the bus’s bumper crossmember taking readings from the ECU, when Rik called me away. While I was away the vibration of the engine caused my tablet to fall off, straight onto tarmac! Luckily the screen hasn’t cracked.
I’ll show you all how to replace the digitizer, as it’s a lot more straightforward than also having to replace the screen, as less disassembly is required.
A. Removing the SD card cover/Wi-Fi antenna
First of all, your digitizer is GLASS, so you can use sellotape across the glass to hold it in place, preventing any injury, or shards of glass falling on the floor. Removing the damaged digitizer will stress it and maybe cause more damage as you do it. The glass provides 90% of the front frame’s strength, so once broken it loses most of its rigidity.
Once you’ve secured the glass, turn your tablet over, and remove the SD cover, which also doubles as a WiFi antenna, by locating the notch on the left, and lifting up. It unsnaps quite loudly, but be gentle. Once removed, place in a safe place:
B. Removing the rear casing
Next up we’ll be removing the rear case, which is easy to do as there are no screws, it’s all clipped together. A lot of the reviewers of the Bush MyTablet reckoned the aluminium back was just cosmetic, but it is actually structural, and gives the tablet weight and strength to prevent the whole body flexing, protecting the internals.
Using a flat blade jeweller’s screwdriver, unsnap one set of clips between the front digitizer frame and the rear case, and then use a plastic spudger to do the rest. DON’T use the screwdriver for the whole job, only to get a start. Note in my picture below that the lip of plastic on my rear case near the headphone port was damaged in the impact, so I used this as an easy access point for my spudger:
Continue all the way round the case, and don’t worry about snapping noises. The screen is clipped to the backside of the digitizer, but as long as you are gentle, it won’t resist too much, and you shouldn’t break anything. The glass may crack and crunch on the broken digitizer at this point, due to the lost strength I mentioned earlier. I didn’t use tape on my glass as I was on a disposable cloth I could just throw away:
Once you’ve unsnapped all the clips, the rear case will just lift off. There’s nothing attached to it, so just lift it clear. Upon doing so you’ll see the wondrous internals of the tablet, including the relatively large battery, and small mainboard. You can also see how the aluminium back actually constitutes most of the rear cover, with the plastic just being a small frame, proving my point about the strength the metal back provides:
When you remove the back cover, WATCH out for the power and volume button pack dropping out. It isn’t plastic welded or screwed onto anything, so it’ll just fall free:
C. Screw and connector locations
Now comes the preparation stage of locating the connectors and screws you’ll need to remove. If you’re just removing the digitizer, there’s 2 screws and 2 ribbons to remove, but if your screen is broken most of the internals have to come apart as the mainboard and battery are mounted to the back of the screen panel’s chassis with tape and glue, both of which are surprisingly strong!
The ribbon cable connectors for the digitizer and display are under the tape on the left and right sides, respectively, which I’ve labelled in red, the two red circled screws attach the PCB to the digitizer frame. The battery and speaker cables are under the orange tape on the bottom left. These two are soldered in, but don’t need to be de-soldered at all unless you are explicitly replacing them. Even for a screen replacement, desoldering these isn’t necessary, they can just be lifted out the way. To remove the battery for screen replacement, simply break the glue holding it in, and lift it out of the way after the rest of the disassembly is done, leaving the wires soldered in. Don’t do it yet, you’ll end up with a tangle!
The connector flaps for the ribbons need to be flicked upwards NO MORE than 90 degrees, VERY gently. If you snap the flap, the whole PCB socket is ruined, as the flap provides the clamping force that holds the ribbon in place, pressing the metal contacts together. Taping it back together is not good enough. DO NOT rush, and take the same care with the connectors on both sides. This is where unskilled amateurs make the jobs more expensive, take it from a professional who has fixed mistakes many times! Modern electronics are VERY delicate, and need eagle eyesight and jeweller’s finesse, shaky hands just won’t do!
From the left, lift the silver tape a little (DON’T damage or discard it as it can be re-used), and remove the ribbon for the power/volume buttons. Lift the flap gently, then ease the cable out.
From the right, lift the black tape. If you’re going to be replacing ONLY the digitizer, remove just the top ribbon that I’ve circled red, which is the digitizer cable, using the same care as for the power/volume ribbon above. If your screen is cracked, you’ll need to remove the bottom one as well, which is your display cable that carries display signals, and the backlight power.
Again, I can’t stress enough, DO NOT rush, and DO NOT force the socket connector flaps over 90 degrees, if they break you’ve just made the job 80% more expensive as you’ll need the sockets replacing, or a new PCB, which will involve data recovery off your old board, especially if you damage the touchscreen connector!
D. Removing bottom frame support
Where the speaker is along the bottom you’ll notice a plastic frame screwed into place. This is like a strengthener and support in one unit, it holds the speaker in place while giving the bottom of the digitizer some strength. It also carries clips that the rear cover was mounted to, so I consider it a main structural member of the whole tablet chassis. Simply remove the two screws, and lift it off the digitizer. Watch out as the speaker is now loose on its cable and will slide around!
E. Removing screen & mainboard assembly from digitizer
If you look all around the inside of the frame you’ll see lots of clips holding the screen in place. We’re now going to remove the screen VERY GENTLY. This is another step where you should take your time. There’s no medal for rushing it, and you WILL likely break your screen if you do it wrong, as the glass on the screen is thinner than the digitizer. That’s the reason tablets have their digitizer separate to the screen, mounted half an inch away.
If your screen and digitizer are already broken and you’re replacing them both, I personally would still be careful, because I’m a professional, and normally it’s someone else’s equipment, which I respect 🙂
So, while unclipping the clips (they may be stiff) you can use a spudger to keep the screen from re-clipping itself in, but DON’T overdo it, don’t lever the screen too high with too many clips still securing it, it will flex and break. Obviously if your screen is broken and you’re replacing it this isn’t relevant, but still take care, because I would 🙂
The image below shows me using my spudger as the clips are unclipped, my screen wasn’t damaged before, and it wasn’t damaged after, apart from a scratch on the glass caused by the digitizer imploding on impact!
Finally, once that’s all done, you can separate the digitizer from the rest of the chassis, and pat yourself on the back for getting this far without any major damage, unless you DID damage something I told you not to, in that case it’s your fault for not listening to a pro, take yourself off to the naughty corner and think about what you’ve done!
Otherwise, if all went well, you’ll end up with the tablet looking like this:
Re-assembly with a new digitizer is the reverse of removal, if you remember my advice you should have a fully functioning tablet that acts as if nothing happened once it is rebuilt!
F. Extra steps for screen replacement
I only had to replace my digitizer, but if your screen is damaged as well, once you finish with the separated digitizer as in step E, you’ll need to:
- Remove the display cable connector as I mentioned earlier
- Separate the battery from the screen back by removing the glue. When you reassemble the battery onto the new screen, use *new* adhesive strips instead of glue to secure it, as you don’t want it rattling around, its metallic case can short stuff out, which you DEFINITELY don’t want happening.
- Remove all the tape strips holding the PCB.
- If you’re also replacing the battery, desolder the battery cables, making sure you note the polarity. Resoldering the cables the wrong way may short the board out, and cause an expensive mess. I don’t know if the Chinese electronics in these have decent short-circuit protection, and I’m not willing to find out!
- Re-assembly, again, is the reverse of removal. With new parts, TAKE EXTREME CARE, you don’t want your new screen or digitizer damaged again! And make sure all the tape is replaced and secured in the original places. Mark out where the strips sit with a marker pen.
This is another wear related symptom, and often occurs on power down of an old system after a power cut. It is again to do with the input regulation circuit (the main resistor, diodes, and rectifier transistors bolted to the keypad chassis). If your Optima starts OK on battery, but not on just the mains, the cause of this is the CPU isn’t getting enough power to start up from the AC to DC rectification stage. The start sequence goes visually like this:
1. Power is applied, the regulators get up to working voltage, and start supplying power to the CPU.
2. The LEDs all come on briefly as the CPU boots up and runs a self test of itself and the NVRAM, which contains your code and exit/entry timers.
3. Within a few milliseconds of step 2, once the CPU has started, the LEDs go out and the alarm goes into a full alarm condition, leaving just the Power LED on, plus any open Zone LEDs. If no zones are open, just the Power LED is on.
If the LEDs all stay on with no more activity or sound, the CPU isn’t starting correctly, because the voltage reaching it from the AC to DC rectifier stage is insufficient. If the alarm were allowed to carry on into full alarm at this point, the strobe/bell and 13v PIRs would draw more current than the supply can deliver, and the resulting inrush and voltage drop would stop the panel keeping itself running.
The transformer puts out 16.2v AC. If the voltage at your battery charge terminals with no battery connected is less than 14v, (the last one I did was 10v) the whole system is being starved of power. The two transistors that are bolted through the keypad chassis need to be replaced, the big three-legged things top right of this picture with the holes through the tags:
I always replace both to make sure, as they can be quite stressed at such an old age and may be breaking down under load, as can the 47 ohm battery resistor. I also check the capacitors accompanying them. The leftmost “transistor” is actually a Toshiba TA7805S Positive Voltage Regulator, which seems to be the battery regulator, I’ve uploaded the datasheet to TideLog HERE. The second (rightmost) one is an ST Microelectronics LT8I5CV, for which I cannot find a datasheet, and which I assume is the AC rectifier stage’s main DC regulator.
I am attempting to find suitable modern equivalents for these regulators, so if any electronics guys out there can help, I’d be most grateful, as finding info on these 15+ year old components is tricky! I’m running out of working ones to cannibalize off old unrepairable Optima boards! A good alternative to the Toshiba TA7805S is the Panasonic AN7805F, the datasheet is on TideLog, HERE, for you to take a peek at, if you understand electronics 🙂
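For anyone who wants to sanity-check the voltage readings mentioned above, here is a minimal Python sketch. The 16.2v AC transformer output, the 14v threshold and the 10v bad reading come from this post; the function names are made up for illustration, and the peak figure is an ideal one that ignores diode drops and load.

```python
import math

TRANSFORMER_VAC_RMS = 16.2    # transformer output quoted in the post
MIN_CHARGE_TERMINAL_V = 14.0  # below this (no battery connected) the panel is starved

def rectified_peak(v_rms: float) -> float:
    """Ideal peak of a rectified sine wave (no diode drop, no load)."""
    return v_rms * math.sqrt(2)

def charger_is_starved(measured_v: float) -> bool:
    """True if the no-battery charge terminal reading is too low."""
    return measured_v < MIN_CHARGE_TERMINAL_V

print(f"Ideal rectified peak: {rectified_peak(TRANSFORMER_VAC_RMS):.1f} V")
print("10.0 V reading starved?", charger_is_starved(10.0))
print("14.2 V reading starved?", charger_is_starved(14.2))
```

The ideal peak comes out at roughly 22.9 V, so a healthy regulation stage has plenty of headroom to deliver 14v. A 10v reading points at the regulation stage rather than the transformer.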
Hoover washer dryers used to be synonymous with quality and could go 10 years plus without issues, but now they just seem to be dropping dead left, right and centre when really young. In the space of one day today I’ve both had a Hoover engineer come out to my parents’ machine, for a motor replacement under Hoover warranty, and later that day I myself was called out to fix another Hoover washer dryer, both the same model, different faults.
The patient was a 2 year old WDYN856 DG washer/dryer, with no signs of life, except clicking noises, following a loud bang during a dry cycle. Clicking relays almost always mean main control unit failure, so Martin, my repair assistant, and I got to work. There was no other life from the programme selector dial, LED segment display unit, buttons, or their LEDs, apart from the clicking. We pulled the control unit out, it looked fine from within its casing, but once unclipped from it, we saw the catastrophic damage:
Can you guess where the actual brain of that massive washing machine is? Nope, none of the big components! That tiny chip that I’ve circled in red is the computer of the machine, smaller than a two-pence piece! The rest of the board is just power regulation, the control relays, and the outputs for the motor and element, plus all the connectors for sensors. The two small plugs on the very right-middle are the programming headers for programming the EEPROM. You can see the giant ferrite inductor coil, and those big heatsinks? That’s the transistor & Triac that control the motor speed, they act as an inverter and tacho control. The higher the switching frequency of those transistors, the faster the motor spins. They get mad hot, and very stressed, especially the massive transistor to the right of the coil.
Unfortunately, as you can see from the picture, around where the microcontroller is, that is where the failure has occurred. The area is all burnt, and has catastrophically shorted. The yellow highlight on the left is also where some damage to a diode, resistor and capacitor has occurred. The damage is actually worse than it looks in the picture.
We had to replace the motor, and the front-end option selection button unit as they were unresponsive even with a new control unit. We can’t be sure of the exact cause, but we suspect the motor has shorted, and as it’s directly wired to the transistors, has caused a massive short circuit, taking out the control unit and the option selection button unit (which itself had microcontrollers on it, but these were visually undamaged).
Unfortunately you can’t just buy a new control unit and connect it up, the EEPROM needs to be programmed with machine specific code, the machine will just flash an EEPROM communication error otherwise. We had the Hoover engineer programmer, so were OK 😉
You’ve all seen them, the bulbs that are supposed to help the environment, with swirly tubes, short and fat, long and thin. Their efficiency comes from the fact that, unlike filament bulbs, the tubes don’t draw their current direct from the mains, instead they use a kind of inverter, known as a “ballast”, very similar to the ones used in laptops for backlights. Except these run on high voltage inputs, unlike a laptop inverter which will run off 9 to 15v DC and provide 1000v AC ignition voltage, with 300 to 800v run voltage depending on brightness setting.
Unlike a laptop, though, fluorescent lamps aren’t variable brightness. The ballast puts very little load on the AC input, instead it itself provides the current to drive the tubes, like a middleman.
Compact fluorescent lamps have some benefits in comparison with standard filament light bulbs:
1. Lower power consumption (as much as 80% less)
2. Much longer life expectancy (5 to 15 times) when used in the correct environment with airflow
They also have some drawbacks:
1. Longer warm up times (mainly only experienced with cheaper bulbs)
2. Cannot be run off a dimmer switch.
3. Cheaper bulbs tend to be failure prone under heavier use (more than 3 hours per day), or if not provided with adequate cooling.
4. More expensive per bulb than a filament one, but cost savings are made over its life to offset initial cost.
5. Bulbs with lower colour temperatures are not suitable for use as backlighting when using a camera.
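To put a rough number on the power saving, here is a small Python sketch. The 60W filament bulb and the 11W CFL are just example wattages borrowed from elsewhere in this post, and pairing them up is my assumption. The 3 hours per day usage is also only a hypothetical figure.

```python
FILAMENT_W = 60.0    # classic filament bulb (assumed comparison)
CFL_W = 11.0         # an 11W CFL (assumed comparison)
HOURS_PER_DAY = 3.0  # hypothetical usage

def saving_percent(old_w: float, new_w: float) -> float:
    """Percentage reduction in power going from old_w to new_w."""
    return 100.0 * (old_w - new_w) / old_w

def yearly_kwh(power_w: float, hours_per_day: float) -> float:
    """Energy used in a 365 day year, in kilowatt-hours."""
    return power_w * hours_per_day * 365 / 1000.0

print(f"{saving_percent(FILAMENT_W, CFL_W):.1f}% less power")
print(f"{yearly_kwh(FILAMENT_W, HOURS_PER_DAY):.1f} kWh vs "
      f"{yearly_kwh(CFL_W, HOURS_PER_DAY):.1f} kWh per year")
```

With these example figures the saving works out at a little over 80%, which lines up with the "as much as 80%" claim.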
Available colour temperatures
Fluorescent lamps are usually available in these colour temperatures:
- Warm white (2700K)
- Cool white (4000K)
- Daylight (6000K)
The most common colour temperature is “warm white”, which is close in colour to a classic 60W filament bulb and is also the most pleasant to people, but cannot be used as ambient light for use with a camera.
Principle of construction and operation
Compact fluorescent lamps use a sealed, low-pressure glass tube similar to a classic strip lamp, and the principle of transforming energy into visible light is the same. At each end of the tube is an electrode coated with Barium, and the tube is filled with Argon and Mercury vapour. The cathode runs at high temperature (about 900 degrees Celsius) and emits many electrons, which are accelerated by the voltage between the electrodes and collide with the atoms of Argon and Mercury. This gives rise to a low temperature plasma. The excited mercury atoms radiate their energy as UV light. The inside of the tube is coated with a luminophore (phosphor), which transforms the UV light into the visible light that you see.
The tube is powered by alternating current, provided by the ballast, so the two electrodes swap roles (cathode and anode) rapidly. Because the ballast uses a switched converter running at tens of kilohertz, the CFL doesn’t visibly flicker, unlike a classic strip tube lamp. The converter, which sits in the screw or bayonet cap, replaces the starter found in traditional strip lamps (which are wired direct to the AC line), making CFLs more efficient.
Here’s a look inside a Philips Genie 11W, for the curious electronic nerds out there 🙂
To help understand that little circuit board a little more, here’s its schematic diagram:
Theory of Ballast operation
The lamp requires a current to preheat the filaments, a high-voltage for ignition, and a high-frequency AC current during running. To fulfill these requirements, the electronic ballast circuit first performs a low-frequency AC-to-DC conversion at the input, followed by a high-frequency DC-to-AC conversion at the output.
The AC mains voltage is full-wave rectified and then peak-charges a capacitor to produce a smooth DC bus voltage. The DC bus voltage is then converted into a high-frequency, 50% duty-cycle, AC square-wave voltage using a standard half-bridge switching circuit. The high-frequency AC square-wave voltage then drives the resonant tank circuit and becomes filtered to produce a sinusoidal current and voltage at the lamp.
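As a rough illustration of the rectify-then-chop stage described above, here is a minimal Python sketch. The 230 V nominal mains figure is my assumption, and the calculation is an ideal one with no diode drops, ripple or load. The function names are made up for this example.

```python
import math

def dc_bus_voltage(mains_rms_v: float) -> float:
    """Ideal peak-charged bus voltage: the peak of the rectified sine."""
    return mains_rms_v * math.sqrt(2)

def square_wave_fundamental(bus_v: float) -> float:
    """Amplitude of the fundamental of a 0..bus_v, 50% duty square wave."""
    return 2.0 * bus_v / math.pi

bus = dc_bus_voltage(230.0)  # 230 V nominal mains is an assumption
print(f"DC bus: {bus:.1f} V")
print(f"Fundamental driving the tank: {square_wave_fundamental(bus):.1f} V")
```

So the half-bridge is chopping a bus of roughly 325 V, and it is mostly the fundamental of that square wave that the resonant tank passes on to the lamp.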
During pre-ignition, the resonant tank is a series-LC circuit with a high Q-factor. After ignition and during running, the tank is a series-L, parallel-RC circuit, with a Q-factor somewhere between a high and low value, depending on the lamp dimming level.
When the CFL is first turned on, the control IC sweeps the half-bridge frequency from the maximum frequency down towards the resonance frequency of the high-Q ballast output stage. The lamp filaments are preheated as the frequency decreases and the lamp voltage and load current increase. The frequency keeps decreasing until the lamp voltage exceeds the lamp ignition voltage threshold (up to 400v) and the lamp ignites. Once the lamp ignites, the voltage drops and the lamp current is controlled such that the lamp runs at the desired power and brightness level.
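The frequency sweep described above can be sketched in Python as a simple model. Everything numeric here is hypothetical: the inductor and capacitor values, the Q-factor, the drive amplitude and the sweep step are picked purely for illustration, and only the 400v ignition threshold comes from the text. The model treats the pre-ignition tank as an ideal series LC circuit with the lamp across the capacitor.

```python
import math

def resonant_frequency(l_henry: float, c_farad: float) -> float:
    """Resonant frequency of an LC tank in Hz."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad))

def tank_gain(f: float, f0: float, q: float) -> float:
    """Voltage gain across the capacitor of a series RLC at frequency f."""
    x = f / f0
    return 1.0 / math.sqrt((1.0 - x * x) ** 2 + (x / q) ** 2)

def sweep_to_ignition(f_start, f0, q, drive_v, ignition_v, step_hz=500.0):
    """Sweep downwards from f_start; return (frequency, lamp voltage) at ignition."""
    f = f_start
    while f > f0:
        lamp_v = drive_v * tank_gain(f, f0, q)
        if lamp_v >= ignition_v:
            return f, lamp_v
        f -= step_hz
    return None  # resonance reached without igniting

# Hypothetical component values, for illustration only.
f0 = resonant_frequency(2.2e-3, 4.7e-9)
result = sweep_to_ignition(f_start=100e3, f0=f0, q=10.0,
                           drive_v=200.0, ignition_v=400.0)
print(f"Resonance at {f0 / 1000:.1f} kHz")
if result:
    print(f"Ignites at {result[0] / 1000:.1f} kHz with {result[1]:.0f} V on the lamp")
```

The point to notice is that the lamp voltage climbs steeply as the frequency approaches resonance, so ignition happens some way above the resonant frequency itself.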
Common failures are faulty output capacitors, a major fault in cheaper bulbs, where cheaper components are used. When the tube doesn’t light up on time, or fully, there is a risk of destroying the transistors and their resistors. Lamp startup is very stressful on the ballast circuit, and transistors usually don’t survive overloading at high temperatures, often taking out the components connected to them. When the tube fails, the electronics are usually destroyed too. When the tube is old, the filaments become worn, causing high resistance in the circuit, and the tube doesn’t light up any more. In this case the electronics usually survive, because the ballast will shut down if there is a loss of load caused by death of the tube. Sometimes the tube can be wrecked due to internal tension and temperature difference. Most frequently a stressed tube fails when powered on, making it look like the whole lamp has failed.
Failure of the whole lamp at its worst is normally limited to a little bit of smoke, and/or a bad smell, and a small pinging noise. They are not allowed to “POP!” or cause direct shorts on the AC line, the input fuse on the ballast will prevent that.
Repair of electronics
Repair of the electronics usually means changing capacitors. A popped fuse signifies possibly damaged transistors and resistors. Failures can cascade: for example, shorted capacitors can thermally overload the transistors and destroy them. The best replacements for the original transistors are MJE13003, but they have not been easy to find recently. I replaced them with BD129, but those are not available now. Other variants exist, like 2SC2611, 2SC2482, BD128 and BD127, but I am not sure how long they will last.
The housing of a fluorescent lamp usually comprises two parts. One is the plastic cover with holes for the tube and vents, and the plastic clips to attach to the bottom section. The tube is glued in using high temperature epoxy or cement glue. The bottom section has slots on its inner side for the clips. Inside is the printed circuit board with components and wires from the tube. On the upper side of the PCB are the wires to the top of the lamp, which are soldered or stamped to the contacts on the PCB, normally metal posts. Both plastic parts are clicked together and sometimes glued. Usually you can carefully lever the casing with a small screwdriver, working your way round to release the glue, then lever a little more to open the lamp. To close the lamp housing after repair you only need to click both plastic pieces together.
Sometimes opening these lamps up is harder than the repair itself, as the housing often gets damaged. Regular heating and cooling makes the plastic brittle and hard to separate!
Electric showers are great, but they do go wrong occasionally. At Kitamura we repair all types of showers. A lot of people seem to confuse “power showers” with “electric showers”. They aren’t the same. An electric shower simply heats the water, the water goes through the shower under simple water pressure itself. That is where power showers differ. They still heat the water, but they also have a motor assisted water pump, which acts like the turbocharger in an engine, where a little amount of pressure is converted into massive pressure by an impeller.
We recently got called out to a faulty Mira Essentials electric shower. These were made in 2000, and this one was suffering from random pressure drops, and weak output. Here’s a shot of under its cover, I’ve labelled its parts which I’ll explain below:
A. Water input w/filter
The cold water input, with filter. This is a gauze filter that filters any silt in the water. If not filtered out it could collect in the water heater, and cause failure, or blockage in other parts of the shower system.
B. Water impeller.
This is not electrically assisted as in a power shower, but it helps to keep the shower running if there is momentary pressure drop due to something else being used in the water system like a tap.
C. Power/Mode knob with flow solenoid
This is the ON/LOW/MED/HIGH selector, which works in tandem with two microswitches, and two heating elements. When the shower is switched on, the electric flow solenoid opens, allowing water flow. In the LOW position the water heater is fully switched off, and the water is cold as all microswitches are open. In the MED position, one microswitch is closed, so one of the elements is active, and in HIGH both switches are closed, making the heater operate at full wattage, in this case 4.2kW.
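The selector logic above boils down to a small lookup table, sketched here in Python. The 4.2kW full power figure is from the shower itself, but the equal split between the two elements is purely my assumption for illustration, as are the names used.

```python
FULL_POWER_KW = 4.2  # full wattage quoted for this shower

# Mode -> (first microswitch closed, second microswitch closed)
MODES = {
    "LOW": (False, False),
    "MED": (True, False),
    "HIGH": (True, True),
}

# ASSUMPTION: the two elements share the load equally.
ELEMENT_KW = (FULL_POWER_KW / 2, FULL_POWER_KW / 2)

def heater_power_kw(mode: str) -> float:
    """Total power of the elements whose microswitch is closed."""
    switches = MODES[mode]
    return sum(kw for kw, closed in zip(ELEMENT_KW, switches) if closed)

for mode in MODES:
    print(mode, heater_power_kw(mode), "kW")
```

If one microswitch fails open, MED or HIGH will run at a lower power than this table suggests, which is a handy thing to check when a shower only ever runs lukewarm.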
D. HIGH microswitch
This is the microswitch that operates the second element when the Mode knob is turned to HIGH as above.
E. Temperature knob.
This works by varying the amount of water that gets through to the output. Reducing the speed of water flowing through the heater makes the water hotter, and increasing it makes it colder. If the Mode selector is on HIGH and the Temp knob is turned all the way to HOT, the heater will be shut off by the TCO (Thermal CutOut) once the water temperature gets too high, as water that hot could scald the person using it and damage the heater.
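Here is a rough Python sketch of why less flow means hotter water. It assumes all of the heater’s power ends up in the water with no losses, and that a litre of water weighs a kilogram. The flow rates are hypothetical examples, and only the 4.2kW figure comes from this shower.

```python
SPECIFIC_HEAT_WATER = 4186.0  # joules per kg per degree C

def temperature_rise_c(heater_kw: float, flow_l_per_min: float) -> float:
    """Temperature rise of water passing through the heater, ignoring losses."""
    kg_per_s = flow_l_per_min / 60.0  # 1 litre of water is about 1 kg
    return heater_kw * 1000.0 / (kg_per_s * SPECIFIC_HEAT_WATER)

# Hypothetical flow rates at the full 4.2kW
for flow in (4.0, 3.0, 2.0):
    print(f"{flow} L/min -> +{temperature_rise_c(4.2, flow):.1f} C")
```

Halving the flow doubles the temperature rise, which is exactly what the knob is exploiting.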
F. Neon indicator PCB
This board contains the neon indicators for Power, Overheat, and Low Pressure. It also contains resistors to prevent premature wear of the neon bulbs, they are run from 240v and don’t last long, especially the POWER indicator, as that is on as long as the mains is on.
G. Mains input terminal block
Self explanatory, this is where the mains is wired in to the shower. In this case the shower had its own switch and fuse in the consumer unit, so we didn’t have to turn the electricity off to the customer’s entire house while we worked!
H. Water heater with TCO (Thermal Cut Out)
Here’s where the water is heated before going to the shower head. The two elements are individually controlled by the microswitches previously mentioned in C, controlled by the MODE knob. The heater contains a thermal cutout so that the elements are turned off if the water gets too hot. Once the water cools to a certain lower temperature, the thermal cutout switch turns the elements back on.
The thermal cutout is normally only activated if the MODE knob is on HIGH and the TEMP knob is set to its hottest, which is minimal water flow, as mentioned in E.
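The cutout behaviour described above (off when too hot, back on once cooler) is a simple hysteresis switch, sketched below in Python. The trip and reset temperatures are hypothetical, as I don’t have Mira’s actual figures, and the class name is made up for this example.

```python
class ThermalCutout:
    """Hysteresis switch: trips when hot, resets only once the water has cooled."""

    def __init__(self, trip_c: float = 55.0, reset_c: float = 45.0):
        # ASSUMPTION: both thresholds are illustrative, not Mira's real values.
        self.trip_c = trip_c
        self.reset_c = reset_c
        self.elements_enabled = True

    def update(self, water_temp_c: float) -> bool:
        """Feed in a temperature reading; returns True if the elements may run."""
        if self.elements_enabled and water_temp_c >= self.trip_c:
            self.elements_enabled = False
        elif not self.elements_enabled and water_temp_c <= self.reset_c:
            self.elements_enabled = True
        return self.elements_enabled

tco = ThermalCutout()
readings = [40, 56, 50, 44, 50]
print([tco.update(t) for t in readings])
```

The gap between the two thresholds stops the elements chattering on and off when the water is hovering around the trip point.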
A customer recently booked her Samsung TV in for repair with us, saying it was clicking, and not actually coming on. I’ve had this problem with an old Toshiba plasma of Kana’s over at White Tiger Martial Arts Academy, it was clicking badly, but actually worked, with visible distortion on dark scenes. In that case though, it turned out to be the plasma panel itself stressing the supply, as we couldn’t source a panel at less cost than the TV it had to be written off.
LCD’s though are much easier to source, and will never actually stress a high voltage supply as they themselves only run at 5v DC on their LVDS bus, so once we’d picked the customer’s TV up from her house, we took off its cover, and took a look. It turned out to be one of our most common problems: Bulging capacitors! Except these were high quality Korean Sanhwa ones. Once you get the cover off (just 12 screws, no plastic clips unlike Vestels!) you’ll see the PSU on the chassis. It uses a massive flyback transformer and opto-isolator for SMPS feedback, to power the backlights. If you thought flyback transformers died with CRT’s, you were wrong!
Follow this procedure to remove the supply:
CAUTION: WAIT at least 30 minutes if the TV has been plugged in. If you are experienced in electronics you can discharge the main filter capacitor using a resistor, if not, leave it unplugged for a while before continuing, and have a brew, thinking about how you’ll proceed, and make notes. I find cuppa-plan time to be very productive, and it keeps me safe, even as a professional. There are high voltages present that can KILL!
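For those who do discharge the main filter capacitor with a resistor, here is a rough Python sketch of the sums involved. All the values are hypothetical (340 V on a 150uF capacitor, through a 10k resistor, down to 30 V), so substitute the figures from your own board, and always confirm with a meter before touching anything.

```python
import math

def discharge_time_s(v_start: float, v_safe: float,
                     r_ohm: float, c_farad: float) -> float:
    """Time for an RC discharge to fall from v_start to v_safe."""
    return r_ohm * c_farad * math.log(v_start / v_safe)

def peak_resistor_power_w(v_start: float, r_ohm: float) -> float:
    """Power dissipated in the resistor at the instant it is connected."""
    return v_start ** 2 / r_ohm

# Hypothetical example values
t = discharge_time_s(v_start=340.0, v_safe=30.0, r_ohm=10e3, c_farad=150e-6)
p = peak_resistor_power_w(340.0, 10e3)
print(f"About {t:.1f} s to reach 30 V, peak {p:.1f} W in the resistor")
```

The peak power figure is the one people forget: a tiny quarter-watt resistor will not enjoy this job, so use one with a suitable power rating.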
1. Remove all connectors I’ve coloured GREEN. They have tabs on them which you must push as you pull the connector. DO NOT pull them out by the wires, you’ll rip the socket off the board and damage the socket pins, making this cheap repair much more expensive.
2. Unscrew and remove all screws I’ve coloured RED, and put them somewhere safe. Remove the board by lifting it by its EDGES, not by a transformer or capacitor, or any other component. Place the board on a suitable workspace, with plenty of room, and an antistatic mat. SMPS boards contain surface mount components and controllers which are easily damaged. Simply walking on a carpet can generate tens of thousands of volts, which we only feel as a slight shock as there’s hardly any current behind it, but that is more than enough to wipe semiconductors out!
3. You’ll notice near CN801 there’s a bunch of capacitors, and some of them will likely be bulged, or have actually vented. If any vents have burst, you must clean the electrolyte off as soon as possible, as it’s corrosive to the board and other components. I generally replace all output caps if any have become damaged, as they will have been stressed. On my board there were two 2200uF capacitors that were bulging. Remove the old capacitors, ensuring you don’t overheat or damage the copper pads/tracks on the circuit board.
CAUTION: Take care to ensure you install the new capacitors correctly. They are polarity sensitive. The board and capacitors will be clearly marked to show which way they should be inserted. Shorted or polarity-reversed capacitors can EXPLODE and/or damage other parts of the circuit.
You should check and if necessary replace any adjacent capacitors that are rated 10V, as these seem to be the ones more likely to fail. In my case I also replaced the 1000uF capacitor. Capacitors, contrary to popular misconception, do not have to look visibly damaged to be faulty; they can be internally dried out.
NOTE: If replacing the capacitors does not resolve your symptoms you may need to replace the EEPROM chip on the main board as it can be corrupted or damaged by the power cycling. This will need to be done by a professional as the software contained in it can be TV specific.
No Comments »
I’ve had this problem a few times on my laptop. It occurs mostly when the power suddenly goes off and it switches to battery. You lose all capacity monitoring, and can’t tell how much is left. The system tray icon changes to this:
Microsoft’s forums are hilarious. Their “Most Valuable Professionals” give the funniest canned cut ‘n’ paste responses, from “Your power driver is corrupt” to “Your Windows needs reinstalling!”. I know exactly what causes it, and it ain’t anything to do with “power drivers” or corrupt Windows. It’s the little monitoring chip in the battery. Like a lot of integrated electronics, it sometimes gets confused. Sudden switchovers from mains to battery tend to cause it, especially if there are any surges from the battery as it kicks in.
The age old advice of “Reboot!” is the wise advice. If that doesn’t cure it, turn your machine off, remove the mains and battery, and hold your power button down to discharge the circuitry in your device (apart from the RTC circuit, but this doesn’t matter), that should cure it. Removing the battery opens the circuit to the sensing system in the battery, and resets it.
Simples. I hate MVPs, they go on a 5 day course and think that gives them a Professional title? I’ve done MVP courses, but have the skills and years of software and electrical experience to further and back them up.
No Comments »
I recently had a TideLog reader, Steve, contact me about his Menvier TS800 control panel, saying the panel was fine, but the charge voltage was intermittent, even with a new battery. A few days afterwards he dropped it off to me, lo and behold, just like the Optima, a worn resistor, under the keypad. Here’s a picture of what it should look like, and where it is located:
The one I’ve highlighted in green, labelled R52, supplies the +ve 13.6v feed to the battery, via D14 to the bottom left of it, which also seems to supply the telephone module terminal block with +ve voltage too. R83, which is the green resistor highlighted in blue, supplies the AUX 12v for PIR’s and such, and 12.6v to the bell.
Check both resistors and all diodes for continuity and correct resistance; use my band code chart, in the Optima article, by clicking HERE. R52 on Steve’s board wasn’t badly burnt, but the resistor’s ceramic coating, along with the colour bands, had come off, and there was slight burn evidence at the solder joints. The voltage was stable until the board was under load: once the resistor warms up, it breaks down when loaded with a flat battery on the charge rail.
No Comments »
When a hard disk is manufactured, there are areas on the platter that have bad sectors. Considering that on a 2 TB hard disk there are around 4 billion 512-byte sectors, a few bad sectors is only a tiny proportion of the total number of sectors on the drive. During the test phases of a hard disk, the platters are scanned at the factory and the bad sectors are mapped out; these are generally called ‘Primary Defects’. The primary defects are stored in tables in the firmware zone, or in some cases the ROM of a hard disk. When you buy a brand new hard disk, you will most likely be completely unaware of these bad sectors and how many there are, because they are ‘mapped out’ using ‘translator’ algorithms.
Modern hard disks use Logical Block Addressing, or LBA. This describes the sector numbering system on the hard disk, which goes in sequence:
0,1,2,3,4,5,…..n-1,n (where n is the last sector on the drive).
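To put a number on n for the 2 TB example, here is a quick Python calculation. It assumes classic 512-byte sectors and a drive sold as 2 TB in decimal units; a 4K-sector drive would have an eighth as many:

```python
SECTOR_BYTES = 512         # assumption: 512-byte sectors
DRIVE_BYTES = 2 * 10**12   # assumption: "2 TB" means 2 x 10^12 bytes

total_sectors = DRIVE_BYTES // SECTOR_BYTES
last_lba = total_sectors - 1  # numbering starts at 0, so the last sector is one less

print(total_sectors)  # 3906250000
print(last_lba)       # 3906249999
```

So “4 billion” is a fair round figure, and a few hundred factory defects really are a drop in the ocean.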
Spare sector pools
All modern hard disk drives have a spare sector pool. This is used when bad sectors develop during the normal life of the hard disk and any newly found bad sectors are ‘replaced’ with good ones from the spare sector pool. This process is invisible to the user and they will probably never know that anything has changed.
How Bad Sector Mapping Works:
There are at least two methods of bad sector re-mapping (or translation) these are P-List and G-List.
- P-List are defects found during manufacture and are also known as Primary Defects
- G-List are defects that develop in normal use of the drive and are known as Grown Defects
There are other defect lists found in modern drives but the principles are similar. For example, you may find a T-List or a Track defect list, or an S-List or System area defect list.
Let’s get into how these defect lists actually work. Let’s say we have a small hard disk with 100 sectors and a 10 sector spares pool.
When bad sectors are found at the factory, shift-points are entered into the P-List. Take the LBA sequence 0,1,2,3,4,5,6,7,8,9,10 … 98,99 and let’s say that sectors 3, 6 and 9 are found to be bad. When the first bad sector is found, the first part of the re-mapping process will look like this
What happens here is the bad sector at position 3 is recorded in the P-List. The new map now looks like this;
0,1,2,P,3,4,5,6,7,8,9,10 .. You can see now that 3 is where 4 was.
The next bad sector, at LBA 6, is now found:
0,1,2,P,3,4,5,B,7 and is again mapped out, giving 0,1,2,P,3,4,5,P,6,7
When the whole sequence is complete, our final map looks like this.
Because these sectors are mapped out, the user will never be aware that they exist. If you want to look at sector 6, the drive will translate that to physical sector 8. It takes the 6 and adds the shift points to it: +1 for the bad sector at LBA 3 and +1 for the bad sector at LBA 6. When the testing gets to the end of the drive, in order that it is of the correct size of 100 sectors, it allocates sectors from the spare sector pool, completely concealing the fact that there are bad sectors on the media. To all intents and purposes the drive looks just like the original: 0,1,2,3,4,5,6,7,8,9,10. However, our spare pool has reduced in size, and there are now 7 sectors remaining in the spares pool.
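The shift-point arithmetic above can be sketched in a few lines of Python. This is only a toy model of the 100-sector example, not how any real firmware stores its P-List. The names `P_LIST`, `SPARE_POOL_SIZE` and `lba_to_physical` are mine, and I’m assuming each shift point is recorded as the LBA at which the bad sector was found:

```python
from bisect import bisect_right

# Toy P-List: the LBAs at which bad sectors were found at the factory.
P_LIST = [3, 6, 9]
SPARE_POOL_SIZE = 10


def lba_to_physical(lba, p_list=P_LIST):
    """Add +1 for every shift point at or below the requested LBA."""
    return lba + bisect_right(p_list, lba)


physical_map = [lba_to_physical(lba) for lba in range(11)]
spares_left = SPARE_POOL_SIZE - len(P_LIST)

print(lba_to_physical(6))  # 8
print(physical_map)        # [0, 1, 2, 4, 5, 6, 8, 9, 10, 12, 13]
print(spares_left)         # 7
```

Physical sectors 3, 7 and 11 never appear in the output, which is exactly where the P markers sit in the maps above.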
After using the drive for a while, some bad sectors develop. The drive takes care of these using a grown defect list.
The grown defect list or G-List is a table containing the location of bad sector defects found during normal operation of the hard disk drive. When a bad sector occurs during normal use of the drive, a process similar to P-List generation occurs, resulting in the bad sectors being mapped out. The process for G-List mapping out is slightly different. Let’s say our hard disk develops a bad sector at the current LBA 6. What happens in this case is that the bad sector is first mapped out, giving 0,1,2,3,4,5,G,7,8,9,10 .. A sector from the spare pool is then allocated in the bad sector’s place. We used 3 of the spare sectors in factory testing, so the next available spare sector is 104; this now becomes mapped to LBA 6, so our sequence would look like this: 0,1,2,3,4,5,104,7,8,9,10
Again, this process is completely invisible to the user and will still look like the original sequence of 0,1,2,3,4,5,6,7,8,9,10
You might ask, ‘why don’t the new defects get added to the P-List?‘ the answer is that if you add a grown defect to the P-List it has the effect of shifting the data up the drive for each sector from the point where the new bad sector is found. If you look again at the methodology behind the P-List it will help you understand this.
While a G-List entry can help to revive a hard disk, any data that was stored in the original sector is usually lost. This may appear to the user as a file that no longer opens, a program that doesn’t run anymore, or some other errant behaviour. It will not become apparent until the next time someone tries to open the file. It may also be so long since the file was last opened that your backup rotation no longer holds a copy of the working version. So bear this in mind when developing your backup plan.
Defect Mapping in a live system
When a hard disk is powered up, the p-list and g-list are usually loaded into RAM on the controller card. As requests for data come through, the location where the data is required from is passed to the translator, which makes the calculations necessary so as to determine which sectors to actually read in order to get to the actual data requested. In our example above, if we wanted the data from LBA 6 the translator would first run through the p-list and add 2 sectors to the count for the two bad sectors found at the factory, it then checks this value in the G-list and finds it has been re-allocated to sector 104. It then reads sector 104 and presents you with the data.
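Putting the two lists together, the lookup described above can be sketched like this in Python. Again it’s a toy model of the 100-sector example: the names are mine, and I’m assuming the G-List is keyed by the physical sector that the P-List stage produces, since that’s the order the translator is described as working in:

```python
from bisect import bisect_right

P_LIST = [3, 6, 9]   # factory shift points (toy example values)
G_LIST = {8: 104}    # grown defect: physical sector 8 now lives in spare 104


def translate(lba):
    """Apply the P-List shift first, then check the result against the G-List."""
    physical = lba + bisect_right(P_LIST, lba)
    return G_LIST.get(physical, physical)


print(translate(6))  # 104 (shifted to 8 by the P-List, then re-allocated)
print(translate(5))  # 6   (one shift point below it, no grown defect)
print(translate(7))  # 9   (two shift points below it, no grown defect)
```

Only the one re-allocated sector takes the detour to the spare pool; its neighbours stay where the P-List put them, which is why a G-List entry costs a seek but a P-List entry doesn’t.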
All the magic that goes unnoticed by normal people 🙂
No Comments »
In this article I’m going to help you diagnose and identify control problems in your washing machine. Modern washing machines all contain a computerized timer and control system, which is responsible for controlling all the different circuits in the machine such as the sensor network, motor, drainage circuit and the heating circuit. The main thing to remember, whether diagnosing a modern sensor washer or a car electrical network, is that computers run off the same basic principle: inputs and outputs. If a computer can’t get a reading or signal from an input, the output can’t happen, either at all or efficiently, and the computer then has to fall back to what are known as “reference values” stored in a ROM. An example of this in a washing machine is that if the computer can’t determine the water temperature, it can’t heat it correctly.
An example in a car would be if it can’t detect how much fuel is being injected to the electronic injectors, the emissions are affected and it has to fall back to reference values stored in “injection maps” in the ECU as it’d be using too much fuel and it would cause combustion problems.
If you suspect that there’s a fault in one of the circuits in your washing machine it’s usually much easier to test all the components in the circuit before suspecting that the problem lies in the control board. Problems in the wiring network of a machine are much more likely the cause. For example in the drainage circuit you would check the drain pump and the wiring; the heating circuit you would check the element, the thermostat and the wiring. If you then suspect that the problem still lies in the circuit board unfortunately it’s usually quite difficult for inexperienced people to test the board and you’ll actually just need to replace it altogether or consult an electronics guy like me.
The first problem we’re going to look at is program issues. Most modern machines are designed to shut down if they detect a fault somewhere in the system and this is usually accompanied by a fault code. A fault code is displayed on the front of the machine as a combination of letters and lights or numbers. These fault codes vary from one manufacturer to another so it can actually be just as helpful to watch your machine to diagnose where the fault is, such as in the drainage circuit or the heating circuit. When you turn your machine on, the first thing it does is to lock the door via the electronic door lock solenoid and it’s then that it performs a self-check. If in the self-check it detects a fault somewhere in the system it’ll shut down and display a fault code, or if the door doesn’t lock properly it’ll also detect that as a fault and shut down.
Once the machine has passed the self-check stage, it will proceed to fill with water through the fill valves (solenoid valves) at the back. Most stages in a washing machine cycle are programmed to complete within a predetermined time so if your machine doesn’t recognise that it’s filled within a couple of minutes it will usually shut down and display a fault code to stop any flooding occurring. Assuming that’s OK and the machine has filled with water it will then move on to the next stage in the wash cycle. Once the water has filled to the correct level the machine will then start to agitate it and heat it if required by that particular cycle. Once the temperature has been reached the machine will then wash for a certain amount of time before draining the water away and again, this has to happen within a predetermined time so if it doesn’t, the machine will shut down and display a fault code.
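The “every stage has to finish within its time limit” idea is easy to picture as a bit of Python. This is purely an illustration of the principle: the stage names, the limits and the fault strings are all made up, and real control boards use their own manufacturer-specific timings and codes:

```python
# Hypothetical stage names and time limits in seconds; real machines differ.
STAGE_LIMITS_S = {
    "door_lock": 10,
    "fill": 120,   # "a couple of minutes" to reach the water level
    "heat": 1800,
    "wash": 2400,
    "drain": 180,
}


def run_cycle(stage_times_s):
    """Return "OK", or a made-up fault string naming the first stage that
    failed to finish inside its limit. A missing or None time means the
    stage never completed."""
    for stage, limit in STAGE_LIMITS_S.items():
        took = stage_times_s.get(stage)
        if took is None or took > limit:
            return f"FAULT:{stage}"
    return "OK"


healthy = {"door_lock": 2, "fill": 95, "heat": 900, "wash": 1800, "drain": 60}
blocked_pump = dict(healthy, drain=None)  # water never drains away

print(run_cycle(healthy))       # OK
print(run_cycle(blocked_pump))  # FAULT:drain
```

This is why watching which stage the machine stops at tells you nearly as much as the fault code itself: the stage that timed out points you at the circuit to test.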
Once the water’s drained it will then do a short spin and this is followed by the rinse cycle. The rinse cycle is very similar to the wash cycle, except in the rinse cycle the water isn’t heated; water is brought in to a predetermined level within a certain amount of time, it’s then agitated, before being drained away. Most machines have at least two rinses in the rinse cycle and on the final rinse both solenoid valves at the back of the machine open up and flush any conditioner from the detergent drawer down into the drum. Once the machine has completed the rinse cycle it will prepare for the final spin by balancing the load. It does this by attempting to evenly distribute the weight of the load around the drum by using sensors to detect drum wobble on either side of the drum. However, if the load contains a particularly heavy item – such as a pair of jeans or a towel – amongst an otherwise lighter load, it will attempt to balance that heavier item amongst the load. If it can’t balance the load it will simply refuse to spin or it may just shut down and display a fault code.
However, once the load has been balanced the machine will spin and complete the wash cycle.
If your machine is dead and it’s not displaying any lights or anything on the front then you’ll need to check it for continuity. Firstly, just unplug it from the wall and have a look at the fuse inside the plug to make sure it hasn’t blown; once you’ve established that it hasn’t, you’ll need to check for continuity between the plug and the control board. Don’t just replace the fuse and plug back in: the fuse has blown for a reason, and until the reason is found and fixed it will likely blow more fuses.
If you follow the path of the mains cable in through the machine, on some machines it comes through a filter board first, and it then passes along to the plug on the control board. Grab a multimeter on a resistance or continuity setting and just check for continuity between the two. If you can see that there’s continuity on both connections, that shows power is getting to the circuit board, but there’s probably a fault inside it, and the only practical way to check is by replacing the board with a new one.
Next, let’s have a look at if your machine is blowing a fuse when you plug it in; usually this is caused by a short circuit somewhere in the machine and the short can either exist in the control board or within components around the machine. You can check very easily for a short if you unplug the machine and, using a multimeter on a resistance reading, check for the short across the plug through live and earth, and live and neutral. If there is a short there it’s going to show up as a resistance reading of less than a couple of ohms. Often the first thing to short is the heating element, so try disconnecting that and testing again for a short circuit; if the short has gone then that would indicate that the short does lie in the element. Obviously you can double check by testing the element itself and for a working element the reading you’re looking for is somewhere between twenty and fifty ohms so anything outside of that reading means you’ll need to replace the element.
On the other hand, if removing the element doesn’t get rid of the short the next thing to disconnect is the circuit board and again, once you’ve done that, check for a short there. If the short still hasn’t gone, move further along the line and try checking on the filter board. If the short still hasn’t gone then it’s likely to be in the plug and the cable and you’ll need to replace those. To test the element, first disconnect the connector lugs and then turn your meter onto a high resistance setting and measure from earth to one of the terminals; from this you shouldn’t get a reading. Then put your meter onto a low resistance setting and measure across the element. On this one I’m getting a reading of about twenty-seven to twenty-eight ohms, so that indicates that this one is OK.
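The element checks above boil down to a few thresholds, sketched here in Python. The function names are mine. The 2 ohm and 20 to 50 ohm figures come from the readings described above, and the 230 V mains voltage used for the power estimate is an assumption (UK supply):

```python
import math


def check_element(across_ohms, to_earth_ohms=math.inf):
    """Classify a heating element from two meter readings.
    to_earth_ohms=math.inf stands for "no reading" on the high range."""
    if to_earth_ohms != math.inf:
        return "replace: leaking to earth"
    if across_ohms < 2:
        return "replace: short circuit"
    if 20 <= across_ohms <= 50:
        return "ok"
    return "replace: out of range"


def element_watts(across_ohms, mains_volts=230):
    """Rough power estimate from P = V^2 / R (assumed 230 V mains)."""
    return mains_volts ** 2 / across_ohms


print(check_element(27.5))                       # ok
print(check_element(0.5))                        # replace: short circuit
print(check_element(27.5, to_earth_ohms=40000))  # replace: leaking to earth
print(round(element_watts(27.5)))                # 1924
```

The power figure is a handy sanity check: a reading of 27 to 28 ohms works out at roughly 1.9 kW, which is in the right ballpark for a washing machine heater.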
If your machine is tripping the electricity, the process for diagnosing is largely the same, however it may be that a normal meter won’t show any fault being present. In such a scenario an engineer would use an insulation tester such as a Megger and this produces five-hundred volts for determining where the breakdown has occurred. Again, it’s likely to be due to the heater, or heaters if it’s a washer-dryer appliance, but if the tripping is occurring during the final spin the motor is likely to be at fault, where it’s being worked at its hardest during that part of the cycle. One final thing about control boards and tripping faults is that a lot of the time it can be difficult to conclusively diagnose the fault; sometimes a fault that’s being caused by another component actually appears to be caused by the control board. Similarly, if you’re replacing the control board, many of them now require professional programming on installation.
I’m here to help, don’t be afraid to ask!
2 Comments »
I repaired a Luxor LUX-19-822-COB 19″ TV/DVD Combi unit for Greg’s aunt and uncle recently. It is a Vestel, and uses both a 17IPS16-4 combined PSU/Inverter unit and a 17MB46 mainboard. According to Greg, and his aunt & uncle, there was no sound or picture. It turned out there was sound and picture, but no backlight. Inserting a DVD made the TV start the DVD software, and the DVD began playing. There was sound and a faint picture on the screen. All voltages coming out of the PSU section were present and stable, the TV was running quite happily, albeit with no backlight.
So the power supply was sitting in standby, all main voltages off. Pressing the Power button was bringing the PSU into full On mode, and the voltages were nice and stable, as they should be. This proved the TV’s MCU was interpreting the ON signal and pulling the PS_ON pin high, and the PSU was starting correctly. The DVD module would have shown if there was any failure on the voltage lines, as it wouldn’t have accepted the disc, and would have been sluggish or appeared dead. The TV wouldn’t have even booted to the DVD software if the drive wouldn’t start; I believe that software is stored on the Micron EEPROM on these DVD drives, and isn’t part of the TV mainboard.
I connected two laptop backlights to it, they flickered then went out, there was no “2 seconds to black” symptom, or red tinge, it was a “blink-and-you’ll-miss-it” scenario. Connecting the TV’s screen lamps up to a laptop inverter revealed they were fine, with no ignition lag or red/pink tinge, so my diagnosis had narrowed the fault down to the Inverter/PSU unit. Due to not having my oscilloscope or capacitor tester handy, I ordered a new PSU/inverter board, which will be fitted soon. I don’t know what the actual fault is with the original supply, but I will be repairing and re-using it. I’ll update this post when I do repair it. Here’s the TV with the cover removed, you can see how it all fits together:
Another PSU/inverter combi unit that made life easier for me, due to it not doing the protection shutdown feature of separate inverters, where the inverter sends a signal to the TV’s MCU that it has a fault, this then causes the MCU to either not start the TV fully to software boot stage, or to shut down into protection mode if started. Combi PSU/Inverter units do make diagnostics easier, as the TV still starts if the inverter, well, doesn’t 🙂 If the whole TV doesn’t start, the PSU section needs to be looked at, you might have flaky fluctuating voltages coming out of PL804, the L shaped section of pins that connect to the mainboard. Here’s an image I made showing the pinouts of the 17IPS16 connector, the mainboard connector, and the DVD module connector, click it to view it full size:
Please note, this is my image, I created it using a schematics drawing program, so please don’t distribute it. If you reference it on a forum, please link to it, don’t copy it or modify it.
No Comments »
Some Vestel TV’s have combined PSU and inverter in the same board, as did a Polaroid branded ProView chassis I did a year ago. Common faults include:
- Red tinge on screen, which gradually goes to normal white, but then the backlight(s) go out. Known-good backlights connected up work fine
- Backlight(s) flickers briefly and goes out again, but TV still operates with dim picture under bright light, known good backlights do the same
- Backlight(s) shows no life at all, neither does a known good one connected up, but TV stays running
The red tinge issue is mostly a failing backlight(s): because the anodes of the lamp are wearing out it doesn’t warm up fast enough, causing the inverter protection circuit to shut it off. The other two are faults with the inverter section of the supply itself. On a combined PSU & inverter unit the backlight supply is fed with 24v directly from the secondary side of the SMPS circuit, where it is stepped up to the 1000v lamp start voltage by the inverter transformer. Once the lamp has lit (usually within a few milliseconds for a good lamp) the voltage drops to between 300 and 800v depending on the brightness. One such PSU I recently reconditioned was a Vestel 17IPS01; we often replace faulty supplies with new ones, then recondition the old ones for re-use.
The backlight flickered momentarily and went out. We connected two known-working laptop screen backlights to it, and the same occurred. Note that on a two lamp inverter two lamps MUST be connected; if only one is used this isn’t enough of a load and the protection circuit will trigger. On this supply it turned out to be the output capacitors directly before the backlight output sockets. These act as soft-start filters to prevent damage to the backlights from sudden current inrush as they light. If the 1000v ignition voltage was suddenly applied without a few milliseconds delay the anodes would be damaged:
To fix the issue, replace the three caps C355, C356 (12pF 3kV) and C354 (4.7pF 3kV) that I’ve labelled above. The capacitors will check out as normal with a multimeter when power is off, however they break down when put under load, which is why the backlights come on then go off again as the inverter protection circuit shuts the circuit down. Faulty backlight lamps can also stress them out, causing them to fail. This repair is good for most Vestel and other make combined PSU/Inverter boards.
Unlike TV’s with separate inverter boards, a TV with combined PSU/Inverter supply will always boot up and work even if the inverter section shuts down, there is no fault feedback to the main TV processor in these variants. This makes diagnosing faults with them easier, whereas a TV with independent inverter will often cause the TV to not start correctly in the event of a fault, sometimes causing confusion with inexperienced people.
2 Comments »
Techwood are another “faceless” brand of TV that are Vestel made, they are sold at Morrisons, we have done a few of them. Recently a customer brought her 32″ Techwood TV to our workshop, saying there was power, but no picture or sound, and the LED blinks. Normally on a Vestel mainboard the LED only blinks during Over-The-Air firmware update, so she left it with us, and we took a look under the bonnet.
It uses a Vestel 17MB25-3 mainboard, similar to my 16″ Linsar 16LVD4. Another common cause of a blinking LED on a Vestel can also be the inverter. Often on these separate inverter TV’s, if the inverter chip, a MOSFET, or indeed the coil (some have 1 some have 2), are faulty, the error reporting circuit built into the main chip on the inverter relays this info to the processor and the software goes into protection mode, preventing the set from booting up, causing the flashing LED.
A blinking LED in any other case than firmware update is mostly a power supply issue, but here’s our checklist:
- Check microfuse FS105 on main board, near CAM module.
- Check SMD fuse FS106 (4A) for open-circuit. Check diodes D881 and D893 (UF5402) for short-circuit.
- 24V rail might be short-circuit, disconnect inverter board supply to prove. Check and replace dual N-channel MOSFETs IC803 and IC804.
- Replace IC830 (FAN7711) and SMD capacitor C925 (1nF).
- Check for short-circuit between pin 6 and pin 8 of IC830 (FAN7711). Replace IC830 and SMD capacitor C833 (100nF).
- Check and replace SMD transistor Q839 (BC859).
- Check D893 and D891 (UF5402) in centre of PSU for short-circuit.
- Voltage at regulator U122 varies between 4V and 9V instead of being stable at 8V. Replace U122 (LM1117ADJ) on main board.
- Check C961 (470uF/35V) at top of PSU.
- Check D893 and R1036 (2.2K) on PSU.
- Replace IC830 (FAN7711).
- Check D893 and its feed resistor. Replace MOSFETs Q813 and Q814 and IC830. Replace C828.
- Replace C892 (100uF), C801 (33uF), C840 (33uF).
In our case, the voltages coming out of the power supply were fluctuating badly, so we knew the mainboard wasn’t the original fault. The power supply is a Vestel 17PW26-3. So, out came our service manual and power supply schematics, and we set to work.
No obvious bulging capacitors, no burst vents, and no burn marks anywhere. The PSU microcontroller was operating, and the clock waveforms were fine. The bridge rectifier system and components were OK. The problem was the output capacitors just before the output sockets, their voltages were up and down.
Replacing C802, C934, Q828, D882 & D883 and some various resistors fixed the problem, and the TV was back up and running. The power supply even emitted less no-load whine than it did before! A lot of people might say Vestel are junk, but the thing I love about them is they’re off the shelf parts and components, fixing them is a joy. You’ll often find a lot of Vestel TV’s with all manner of screen sizes, big and small, being driven by the same boards and power supplies. The ProView panels can also be replaced with dual lamp screens from a Sony Vaio laptop if you need a 15.4″, 16″, or 17″ panel 😉
Here’s a fun fact: The 17MB25 board can drive panels from 12″ all the way to 36″ full HD, even if the TV is marketed as just HD Ready, that’s often just because the panel is a small one, so the board software goes into HD Ready mode to drive it. Connect a 32″ to it and the multiplexer goes into full steam 1920×1080 🙂
Vestel? Turkish? Delight? Yeah, I would say so, cheap, cheerful, and easily fixed. Take note, Sony 😉 Funnily enough Greg has a Luxor branded Vestel lined up for me with similar symptoms, I suspect it’s going to be the same routine as this one.
1 Comment »
Many electronics manufacturers, including HDD manufacturers like Seagate, have been using the industry standard “Mean Time Between Failures” (MTBF) to quantify disk drive average failure rates. MTBF has proven useful in the past, but it is flawed.
To address issues of reliability, Seagate is changing to another standard: “Annualized Failure Rate” (AFR).
MTBF is a statistical term relating to reliability as expressed in power on hours (p.o.h.) and is often a specification associated with hard drive mechanisms.
It was originally developed for the military and can be calculated several different ways, each yielding substantially different results. It is common to see MTBF ratings between 300,000 and 1,200,000 hours for hard disk drive mechanisms, which might lead one to conclude that the specification promises roughly 34 to 137 years of continuous operation. This is not the case! The specification is based on a large (statistically significant) number of drives running continuously at a test site, with data extrapolated according to various known statistical models to yield the results.
The MTBF is estimated from the error rate observed over a few weeks or months, and is not representative of how long your individual drive, or any individual product, is likely to last. Nor is the MTBF a warranty; it is representative of the relative reliability of a family of products. A higher MTBF merely suggests a generally more reliable and robust family of mechanisms (depending upon the consistency of the statistical models used). Historically, the field MTBF, which includes all returns regardless of cause, is typically 50-60% of projected MTBF.
Seagate’s new standard is AFR. AFR is similar to MTBF and differs only in units. While MTBF is the probable average number of service hours between failures, AFR is the probable percent of failures per year, based on the manufacturer’s total number of installed units of similar type. AFR is an estimate of the percentage of products that will fail in the field due to a supplier cause in one year. Seagate has transitioned from average measures to percentage measures.
MTBF quantifies the probability of failure for a product. However, when a product is first introduced this rate is often a predicted number, and only after a substantial amount of testing or extensive use in the field can a manufacturer provide demonstrated or actual MTBF measurements. AFR will better allow service plans and spare unit strategies to be set.
Hard drive reliability is closely related to temperature. By operational design, the ambient temperature is 86°F. Temperatures above 122°F or below 41°F decrease reliability. Directed airflow up to 150 linear feet/min. is recommended for high speed drives.
The failure rate does not include drive returns with “no trouble found”, excessive shock failure, or handling damage.
Here is an example excerpt from a Product Manual, in this case for the Barracuda ES.2 Near-Line Serial ATA drive, which we installed in a backup server at Kana’s datacentre:
The product shall achieve an Annualized Failure Rate – AFR – of 0.73% (Mean Time Between Failures – MTBF – of 1.2 Million hrs) when operated in an environment that ensures the HDA case temperatures do not exceed 40°C. Operation at case temperatures outside the specifications in Section 2.9 may increase the product Annualized Failure Rate (decrease MTBF). AFR and MTBF are population statistics that are not relevant to individual units.
AFR and MTBF specifications are based on the following assumptions for business critical storage system environments:
- 8,760 power-on-hours per year.
- 250 average motor start/stop cycles per year.
- Operations at nominal voltages.
- Systems will provide adequate cooling to ensure the case temperatures do not exceed 40°C. Temperatures outside the specifications in Section 2.9 will increase the product AFR and decrease MTBF.
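The two figures in the manual excerpt tie together with a bit of arithmetic, sketched here in Python. I’m assuming the usual constant-failure-rate (exponential) model, where AFR = 1 - exp(-hours per year / MTBF); the excerpt doesn’t say exactly which model Seagate used, so treat this as a plausibility check and not their method:

```python
import math

HOURS_PER_YEAR = 8760  # 24/7 operation, as in the manual's assumptions


def afr_from_mtbf(mtbf_hours, hours_per_year=HOURS_PER_YEAR):
    """Fraction of a large population expected to fail within one year,
    assuming a constant failure rate (exponential model)."""
    return 1.0 - math.exp(-hours_per_year / mtbf_hours)


afr = afr_from_mtbf(1_200_000)
naive_years = 1_200_000 / HOURS_PER_YEAR

print(f"{afr:.2%}")        # 0.73%
print(round(naive_years))  # 137
```

So a 1.2 million hour MTBF and a 0.73% AFR are the same claim in different units: out of every 1,000 drives running 24/7, about 7 are expected to fail in a year. The 137 “years” figure is just the naive division, not a lifetime promise for any single drive.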
1.2 million hours MTBF? I’d have expected that kind of lifetime from an older hard drive, from when they were made to LAST, from the days of manufacturers like Conner and ExcelStor, but you certainly won’t get THAT kind of running hours from a modern drive, certainly not 1.2 million hours CONSTANT running!
No Comments »