Tech rant

Oct. 21st, 2003 06:27 pm
[personal profile] tugrik
This one is for the programmer and systems types, as it's a bit of a tech-nerdy rant.


So there I was, pulled into a project that's months along and 80% done, to help them solve a few problems that were keeping the last 20% from finishing. The project is an HP OpenView-based monitoring system that we're going to use to monitor our network -- and we're also going to give a subset of it to our customers. The first problem they present sticks up like a loose thread. A quick exploratory tug suddenly makes a huge section of the project's fabric unravel.

The next one is even worse.

Within 30 minutes of very heated chatter we've pretty much invalidated 3 months of work from very expensive contractors and a handful of in-house engineers. What happened is best described as utter inelegance caused by engineers seeing things from a coding standpoint only and utterly ignoring how the project looks from the user and customer standpoint. They created an almost unusable interface: it gives an incredibly complete view of every single parameter in our network (in a way the most anal of anal retentives would love) while managing to obfuscate what a user of the project would need to know. Compare it to a full description of how to build an engine, down to the torque of every single screw, while failing to tell the user (driver) of the car "insert key and turn" to start the car.

When I called them on it every single one of them got their hackles up as if I was attacking their children. Most of them attributed my concerns to "I just don't understand". As I wasn't their particular brand of engineer I couldn't possibly see the Truth(tm), and so I, just like the users to come after me, would have to be browbeaten into submission instead of daring to ask them to rewrite things to where it was actually usable. Unfortunately for them I do know the intricate ins and outs of what they were doing and I know how the users need to access the data, so each time an attempted brow-beating began I was able to pluck it smooth and throw it right back at them.

It got ugly, yeah.

Still, I'm backing off. My complaints are noted, and in fact I've started rebuilding their project from scratch from a user perspective, to show the bossmen later when the one they're doing bombs in their face. It's just sad to see so much time and money wasted to do little more than preserve the ego of folks who refuse to see things from a "lowly" customer's or end user's POV. Ahwell. It just leaves me frustrated. It also leaves me with the urge to just dive in and do it all myself, which given my workload with the rest of the network is utterly impossible. Sure, I can make a demo system to show how it SHOULD be done (as I just mentioned), but not a full implementation.

At the same time, I'm desperately wishing someone would prove me wrong. If I'm the one being short-sighted and can't see the forest for the trees, then I hope someone can coherently point this out to me so I can learn from it and help solve this turkey of a problem. My ego will gladly take the knocking-about and in fact would benefit from it. When you design complex systems that can get bigger than you can handle alone, being questioned and proven wrong is a wonderful thing.




As an example for the tech-head types, here's one of the bass-ackwards things in their design:

One item being monitored is a reference station. To operate, a reference station (depending on its location and duties) will have a subset of 6 services running. Each service performs a particular task and should always be running. When one service goes down you have a set amount of time (depending on the service) before Everything Is Bad. So it stands to reason that the monitoring system should check that each service is running and when they fail inform you which service went down.

Instead, they simply have a variable saying "this machine runs X services" (say, 5 of them). If one or more dies, their monitor goes "Hey! you're running less than (X) services!". It doesn't tell you which one died. It doesn't record any historical data of how long each one was down. It simply says that not enough were up.
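To make the difference concrete, here's a rough sketch of both styles in Python. It's an illustration only: the service names and the status dictionary are made up, and none of this is taken from the actual OpenView configuration.

```python
from typing import Dict, List

# Hypothetical service names; the post does not name the real ones.
EXPECTED = ["svc_a", "svc_b", "svc_c", "svc_d", "svc_e"]


def count_alert(status: Dict[str, bool], expected_count: int) -> str:
    """Count-only check: reports that something is wrong, but not what."""
    running = sum(1 for up in status.values() if up)
    if running < expected_count:
        return f"running less than {expected_count} services"
    return ""


def named_alerts(status: Dict[str, bool], expected: List[str]) -> List[str]:
    """Per-service check: one message for each service that is down or missing."""
    return [
        f"Service {name} is down"
        for name in expected
        if not status.get(name, False)
    ]


# Example snapshot where one of the five services has died.
status = {"svc_a": True, "svc_b": True, "svc_c": False, "svc_d": True, "svc_e": True}

print(count_alert(status, len(EXPECTED)))
print(named_alerts(status, EXPECTED))
```

Both checks read the same per-service status. The count-only version just throws away the names before reporting.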

It 'costs' more, network- and coding-wise, to do it their way. It presents a world less data. It's useless for the poor tech on the other end of the pager who gets "uh, not enough services running". They have to go inspect the server, check each and every service, and restart/repair any that failed. If they'd done it the right way the tech would get a "Service so-and-so is down". They'd know exactly where to look, exactly what to restart... and in fact, the monitoring system could even attempt to restart things itself, providing basic self-repair automation.
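A minimal sketch of that per-service approach, with downtime history and a restart attempt, might look like the following. Everything in it is an assumption for illustration: the `ServiceMonitor` class, the `is_running` and `restart` callables, and the service names are hypothetical stand-ins, not part of OpenView or the real system.

```python
from typing import Callable, Dict, Iterable, List, Tuple


class ServiceMonitor:
    """Hypothetical per-service monitor: names what failed, logs downtime, tries a restart."""

    def __init__(
        self,
        services: Iterable[str],
        is_running: Callable[[str], bool],  # assumed probe supplied by the caller
        restart: Callable[[str], bool],     # assumed restart hook; True if it was issued
    ) -> None:
        self.services = list(services)
        self.is_running = is_running
        self.restart = restart
        self.down_since: Dict[str, float] = {}
        # Completed outages as (service, went_down_at, came_back_at).
        self.history: List[Tuple[str, float, float]] = []

    def poll(self, now: float) -> List[str]:
        """Check every service once and return one alert per service that is down."""
        alerts = []
        for name in self.services:
            if self.is_running(name):
                if name in self.down_since:
                    # Service recovered: close out the outage record.
                    self.history.append((name, self.down_since.pop(name), now))
                continue
            # Remember when the outage was first seen.
            self.down_since.setdefault(name, now)
            outcome = "restart attempted" if self.restart(name) else "restart failed"
            alerts.append(f"Service {name} is down ({outcome})")
        return alerts


# Fake environment standing in for a real reference station.
state = {"svc_a": True, "svc_b": False}


def fake_is_running(name: str) -> bool:
    return state[name]


def fake_restart(name: str) -> bool:
    state[name] = True
    return True


monitor = ServiceMonitor(state.keys(), fake_is_running, fake_restart)
first = monitor.poll(now=100.0)   # svc_b is found down and restarted
second = monitor.poll(now=160.0)  # svc_b is back, so the outage is recorded
print(first, second, monitor.history)
```

The probe and restart hook are passed in as plain callables so the same loop could sit on top of whatever actually checks and restarts the services.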

I suggested this, and it caused a screaming match between four engineers that strayed from English to Russian to Korean and back, each time getting more personal in the attacks. They calmed down only after they'd successfully quashed any chance of productive conversation... and the meeting moved on to the next item as if nothing had happened.




I liken the whole thing to a road trip gone wrong. It's like asking those engineers how best to get from San Fran to LA in your car. One of them determined you might get better gas mileage by driving in reverse... so they suggest going backwards down the highway while sitting on the hood of the car, windshield smashed through, steering with your feet and accelerating/braking with a jury-rigged contraption of popsicle sticks and rubber bands bought at a truck stop.

When you suggest "why not just put the car in 'drive' and head down the highway?" you get attacked -- because you can't possibly understand how to efficiently use your car the way those engineers can.

Hrmf.