Sunday, December 31, 2017

What is Software?


Ir's a new year, so let's start out with something fundamental, cleaning up something that's bothered me for many years.

The other day I was lunching with a computer-naive friend who asked, "What is software?"

Seems like it would be an easy question for those of us who make and break software for a living, but I had to think carefully to come up with an explanation that she could understand:

Software is that part of a computer system that adapts the machinery to various different uses. For instance, with the same computer, but different software, you could play a game, compute your taxes, write a letter or a book, or obtain answers to your questions about dating.

I then explained to her that it’s unfortunate that early in the history of computers this function was given the name “software,” in contrast to “hardware.” What it should have been called was “flexibleware.”

Unfortunately the term “soft” has been interpreted by many to mean “easy,” which is exactly wrong. Don't be fooled. 
What we call “hardware” should have been called “easyware,” and what we call “software” could then have been appropriately called “difficultware.”

Monday, December 25, 2017

Unnecessary Code

We were asked, "How can I tell if my code does extra unnecessary work?"
To answer this question well, I’d need to know what you mean by “unnecessary.” Not knowing your meaning, I’ll just mention one kind of code I would consider unnecessary: code that makes your program run slower than necessary but can be replaced with faster code.

To rid your program of such unnecessary code, start by timing the program’s operations. If it’s fast enough, then you’re done. You have no unnecessary code of this type.

If it’s not fast enough, then you’ll want to run a profiler that shows where the time is being spent. Then you search those areas (there can be only one that consumes more than half the time) and work it over, looking first at the design.

There’s one situation I’ve encountered where this approach can bring you trouble. Code that’s fast enough with some sets of data may be unreasonably slow with other sets. The most frequent case of this kind is when the algorithm’s time grows non-linearly with the size of the data. To prevent this kind of unnecessary code, you must do your performance testing with (possibly artificially) large data sets.

Paradoxically, though, some algorithms are faster with large data sets than small ones.

Here’s a striking example: My wife, Dani, wanted to generate tests in her large Anthropology class. She wanted to give all students the same test, but she wanted the questions for each student to be given in a random order, to prevent cheating by peeking. She gave 20 questions to a programmer who said he already had a program that would do that job. The program, however, seemed to fall into an unending loop. Closer examination eventually showed that it wasn't an infinite loop, but would have finally ended about the same time the Sun ran out of hydrogen to burn.

Here’s what happened: The program was originally built to select random test questions from a large (500+ questions) data base. The algorithm would construct a test candidate by choosing, say, twenty questions at random, then checking the twenty to see if there were any duplicates among those chosen. If there were duplicates, the program would discard that test candidate and construct another.

With a 500 question data base, there was very little chance that twenty questions chosen at random would contain a duplicate. It could happen, but throwing out a few test candidates didn’t materially affect performance. But, when the data base had only twenty questions, and all Dani wanted was to randomize the order of the questions, the situation was quite different.

Choosing twenty from twenty at random (with replacement) was VERY likely to produce duplicates, so virtually every candidate was discarded, but the program just ground away, trying to find that rare set of twenty without duplicates.

As an exercise, you might want to figure out the probability of a non-duplicate set of twenty. Indeed, that’s an outstanding way to eliminate unnecessary code: by analyzing your algorithm before coding it.

Over the years, I’ve seen many other things you might consider unnecessary, but which do no harm except to the reputation of the programmer. For example:
* Setting a value that’s already set.
* Sorting a table that’s already sorted.
* Testing a variable that can have only one value.

These redundancies are found by reading the program, and may be important for another reason besides performance. Such idiotic pieces of code may be indications that the code was written carelessly, or perhaps modified by someone without full understanding. In such cases, there’s quite likely to be an error nearby, so don’t just ignore them.


Wednesday, December 20, 2017

Which code is more readable?

We were asked, "Which code is more readable, one that uses longer variable names or short ones?" 

Maybe some historical perspective will help answer this question.

In the very early days of computing (I was there), we used short variable names because:

* Programs were fairly short and simple, so scope wasn’t much of a problem.

  • Memories were small, so programmers didn’t want to waste memory with long names.

  • Compilers and assemblers were slow, and long names made them slower.

  • Many compilers and assemblers wouldn’t allow names longer than a few characters, because of speed and memory limitations.

  • We didn’t think much, if at all, about who would maintain a program once it left the hands of the original programmer.

As programs grew larger, one result of short naming was difficult maintenance, so the movement toward longer names grew stronger. It wasn’t helped by COBOL, which asserted that executives should be able to read code. Lots of COBOL code was littered with super-long names, but that didn't help executives read it.

The COBOL argument proved to be nonsense. Still, the maintenance argument for longer, more descriptive names made sense.

Unfortunately, like many movements, the long-name movement went too far, at least for my taste. It wasn’t because long names were harder to write. After all, a typical program is written oncem but read for modification and testing many, many times. So, if long names really made reading easier and more reliable, it was good.

But the length of a name is not really the issue. I’ve seen many programs with long, long names that were so similar that they were easily confused, one with another. For instance, we once wasted many days trying to find an error when the name radar_data_station_#46395_azimuth_reading was mistaken for radar_data_station_#46895_azimuth_reading. Psychologists and writers know well that items in the middle of long lists are frequently glossed over.

So, like lots of other things in software development, long versus short names becomes a tradeoff, a design decision for a programmer for which there is no “right” answer. Programmers must design their name-sets with the same kind of engineering thought they put into all their design decisions.

And, as maintainers modify a program, they must maintain the name-set, so as to avoid building up design debt as the program ages.

So, sorry, there’s no easy answer to this question, nothing a programmer can apply  mindlessly. Just as it’s always been, programmers who think will do a better job than those who blindly follow simplistic rules.



Saturday, December 16, 2017

My First Week in a Software Job

We were asked, "What was your first week like at your first software engineering job?"

In June, 1955, I went to work for IBM in San Francisco. Of course, at that time there was no such thing as "software engineering." In fact, there was no such thing as a "programmer." My title was "Applied Science Representative." I was supposed to apply science to the sale of IBM computers.

I was told that in two weeks I was to teach a course in programming the IBM 650.

That presented a few problems.

  • I had never programmed any computer before.

  • Nobody in the IBM office had ever programmed a computer before.

  • Nobody in the IBM office had ever seen a computer before.

  • There was no computer in the office—just a bunch of punch card machines.

  • In fact, as far as we knew, there was no computer in San Francisco.

I spent the next two weeks in a closet in the IBM office studying all the IBM manuals that were stored there, preparing myself to teach this course. I was pretty much a lone ranger, without the horse or any faithful Indian companion. Actually, no companion at all.

That was over 60 years ago, and now I have a multitude of companions. Even so, it was a special time and an unforgettable first two weeks, so thank you for asking this question.

If you want to know more about what it was like in those thrilling days of yesteryear, you should follow Danny Faught's blog. Back then, we used to listen to the Lone Ranger on radio (there wasn't much, if any, television).

"Hi-Yo, Silver! A fiery horse with the speed of light, a cloud of dust and a hearty ‘Hi-Yo Silver'... The Lone Ranger! With his faithful Indian companion, Tonto, the daring and resourceful masked rider of the plains led the fight for law and order in the early Western United States. Nowhere in the pages of history can one find a greater champion of justice. Return with us now to those thrilling days of yesteryear. From out of the past come the thundering hoof-beats of the great horse Silver. The Lone Ranger rides again!"


<http://www.geraldmweinberg.com (Formerly The Lone Programmer)

Sunday, December 10, 2017

Do programmers really know how to program?

I was asked, "Do programmers really know how to program?"

I believe this question is unproductive and  vague. What does it mean by “program”?

The person who asked this question seemed to think programmers were not really programming when all they did was copy some existing program, using it whole or perhaps pasting it in as part of a shell.

To me, programming a computer means instructing it to do something you want done, and to continue doing it as desired.

If that’s what we’re asking about, then yes, of course, some of us out here know how to program. (Some do not, of course.)

It is irrelevant how we do that. Whether we use genetic algorithms, cut-and-paste, or divine inspiration? Do we use Scrum or Agile or Waterfall? How about the programming language? C++, or Java, or Lisp, or Python, or APL? Well, none of those choices matters.

Then what does matter? How about, "Can we satisfy someone’s desires?" In other words, can we provide something that someone wants enough to pay what it costs, in time or money? That’s what counts, and we certainly know how do that—sometimes.

Sure, we fail at times, and probably too often. But no profession succeeds in satisfying its customers all the time. Did your teachers always succeed in teaching you something you wanted to know? Do surgeons know how to do surgery?

So what about using existing programs? To my mind, the first and foremost job of a programmer is knowing when not to write a program at all—either because the needed program already exists or because no program was needed in the first place.

In other words, not writing a program when no program is needed is the highest form of programming, and one of the marks of a true expert.




or Kindle for the book in paper or ebook format

Wednesday, December 06, 2017

What is the simplest, most amazing code you have ever written or witnessed?

We were asked to describe the simplest, most amazing code we had ever written or witnessed.

My answer should probably be some esoteric APL code that I personally wrote, like inverting a matrix with a single character program, but many of my readers wouldn’t understand it. In any case, modesty prevents me from choosing my own code.

So, instead, let me tell the story that took place long ago when we were installing an IBM 709 in Bermuda, as part of the NASA space-tracking network. The 709 was a “naked” installation, with little surrounding peripheral equipment, and nothing like it in Bermuda to help us.

In particular, we didn’t have an off-line printer or a punch-card duplicator, so we needed to use the 709 itself to do these jobs—but we had no utilities because we were probably the only naked 709 in the world.

My colleague, Marilyn, who was by far the best programmer I ever knew, went to our keypunch (the only unattached peripheral we had), inserted a blank card, and proceeded to punch (in row binary) a card-to-card duplicator program for the 709. She did it as I watched, in a single pass through the keypunch. 

You’d have to understand row-binary format to appreciate what a feat this was—multiple punched columns of alternate instructions in binary. To top it off, she actually punched in (in the same pass) the self-loading program AND the parity check row for her entire card.

She then loaded this card into the 709’s card reader, picked it up and reentered it as input to itself, and so punched a duplicate. She took the duplicate to the keypunch and added one punch to one of the rows. She now had a 709-to-printer program—two incredible error-free programs for the price of one.

I’ve never seen anything like it before or since. Until that time, I thought I was a pretty good programmer. After Marilyn’s feat, I realized that the best I could ever hope to be was Number Two.

How about you? Any amazing code stories to share?