I once fixed a bug with a similar timeframe. Many attempts by many people over the years. One day I finally fixed it and thought I'd let the user who'd raised it know that it was finally resolved. Unfortunately he'd died of natural causes in the meantime. That really hit home how long it'd taken.
I fixed a similar long-scale bug in... my parents' kitchen extractor fan about a year ago. They had a fancy custom-designed stainless steel extractor fan installed in their new kitchen about 12 years ago. The fan worked but it had one nasty habit: sometimes the controls would freeze up when the halogen lights in the hood were switched off. To get the thing to work again the power needed to be disconnected by either pulling the plug - in the topmost kitchen cupboard behind a hardboard shield, i.e. not really feasible - or by turning off the power to the whole kitchen for a few seconds. This became a regular thing: start cooking, press the button on the hood to switch the lights off, notice that the display got scrambled again and that it no longer responds to button presses, walk to the fuse box, switch off the power to the kitchen, switch it on again, continue cooking while cursing the damned thing. The installer came by and replaced the control board to no effect; the thing still kept on freezing every now and then. I had my suspicions about the cause - halogen transformers can produce nasty spikes when turned off, surely they put some protection in place? - but my father did not want me to pull apart the thing because the 'experts' had already looked at it. OK, fine, it is your problem after all - I live about 1300 km to the north of where they live and came by some 4-5 times per year - so I won't.
In 2019 my father died. My mother was quickly diagnosed with Parkinson's disease and could not really cook for herself any more, so I increased the frequency of my visits and cooked enough food in a week to last her a month, freezing everything in portions. This meant I got to spend more time with the extractor fan and I decided to pull it down to see if my hunch was correct. Well, it was, more or less. The installer had installed a ferrite bead but he clearly did not understand why and where that bead was to be installed, so he put it on the incoming power line. That's fine if the interference is expected to come from the outside or if the locally produced interference is to be kept from travelling down the line to the network, but it does nothing to protect the control circuit from the interference coming from that big transformer which is connected directly to it. Swapping the ferrite bead from the incoming power line to the transformer power line took maybe a few minutes. The thing never locked up again. Unfortunately my father never got to see the thing working as intended.
This is also a typical one in electronics: very few people understand how ferrites work, the frequency bands where they are effective, and the difference between common mode and differential mode.
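For anyone unsure about the common mode vs differential mode distinction, here is a minimal sketch of the usual decomposition for a two-wire line. The function name and the sign convention (both currents measured in the same direction, each mode taken as a per-conductor half sum or half difference) are my own assumptions for illustration; other texts define the common-mode current as the full sum instead.

```python
def split_modes(i1, i2):
    """Split two conductor currents (amps, measured in the same direction)
    into per-conductor common-mode and differential-mode parts.

    Assumed convention (hypothetical helper, not from any library):
        common       = (i1 + i2) / 2   # flows the same way on both wires
        differential = (i1 - i2) / 2   # flows out on one wire, back on the other
    so that i1 = common + differential and i2 = common - differential.
    """
    common = (i1 + i2) / 2
    differential = (i1 - i2) / 2
    return common, differential


# Pure signal or load current: out on one wire, back on the other.
print(split_modes(1.0, -1.0))

# Pure noise riding on both wires in the same direction.
print(split_modes(0.5, 0.5))
```

A ferrite clamped around both wires together mostly acts on the common part, because the magnetic fields of the equal and opposite differential currents largely cancel inside the core. A bead on a single conductor sees all of the current in that wire.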
I’m curious why they didn’t deploy diagnostics in the field if they couldn’t replicate in the lab?
Every few months for 7 years is a lot of opportunities to iterate on collecting field measurements. And it could be done in a holistic way that doesn’t break the safety certification.
This equipment was deployed in remote places without any kind of connectivity, sometimes not even cell coverage.
But the real problem is that the frequency of this failure in a single device was much lower than that. There were hundreds of these devices deployed and we never had one particular unit that was triggering all the time; it happened sometimes here, sometimes there. Really a nightmare.
Yeah, for sure if I was perfect I wouldn't make mistakes, but I am not.
However, I have to disagree on one point: it is not basic electrical engineering practice to add filter capacitors on every node. Some nodes need it, some don't. This one didn't need it given the topology of the circuit, but did need it for immunity. This is not always obvious, but for sure I could have anticipated it.
Well, you're better than me then. Mistakes like those are @$%#ing expensive, so they have me questioning WTF I'm even employed sometimes. I wasn't the only one looking for the issue so maybe you're right, but it feels really bad.
I know exactly what you mean. I've been there myself so many times that I have learned to be humble just for this one reason. I have a post about good engineers being humble which is based on this idea.
If you think this way, you are already on a good path. Consider that there are a lot of folks who don't even care at all. You do care and want to improve, and that sets you apart.