Entries from April 2006 ↓

The Future of Compilers: Saving Millions of Developer Hours

We were troubleshooting an application a few days ago and, after realizing we’d discovered a design flaw, my co-worker Jeremy quipped “It’s not working like it’s supposed to.”

This got me thinking about the nature of errors in software and brought me all the way back to my early computer science courses at UC Davis.

Compiler errors are the best kind of errors. They happen early, they happen often, and they’re easily fixed. If the compiler chokes on my lack of a semicolon I can fix it in about 10 seconds (total time from beginning of compilation to applied remedy). With background compilation in tools like Visual Studio .NET and Eclipse, it’s less than that.

Runtime errors are a bit nastier; they creep up on you once you’ve started the app and *bam* knock you to the ground with a nasty crash. With a runtime error you must take the time to compile the code, start the application, begin interacting with it, and at some point watch it crash.

If you know what the problem is you can fix it in 30 seconds or less, but the additional time spent finding the bug, time that would not have been spent had the compiler found it, pushes runtime errors past the “over 1 minute” mark on a good day. On a bad day you either have to step through the code (if you have that capability), look in the error log, or start adding “print” statements here and there until you figure out where your bug is. Runtime errors are nasty, but at least you have a clue where your problem is located.

Logic errors are the worst of the worst. These include design errors caused by faulty architecture or flow diagrams, or a miscalculation by the programmer. Nothing crashes. Nothing dies. And that’s the worst part. Logic errors are beasts, and are not typically discovered until much later in the process. In an application with any kind of complexity you often need to trace through multiple files, and may even scour through the database looking for bad data. Yes, logic errors are the “I spent 6 hours only to realize I had the minus sign in the wrong place” kind of errors.
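The three classes of errors can be sketched in a few lines. This is a toy illustration in Python rather than the compiled languages discussed below (Python surfaces syntax errors before any code runs, which stands in for compile time here); all of the function names are hypothetical:

```python
# 1. Compile-time (syntax) error: caught before execution even starts.
try:
    compile("total = price +", "<example>", "exec")
except SyntaxError as e:
    print("caught at compile time:", e.msg)

# 2. Runtime error: the code compiles fine but crashes while running.
def average(values):
    return sum(values) / len(values)

try:
    average([])  # blows up with ZeroDivisionError at runtime
except ZeroDivisionError:
    print("caught at runtime: division by zero")

# 3. Logic error: nothing crashes, the answer is just wrong.
def balance_after_withdrawal(balance, amount):
    return balance + amount  # bug: should be balance - amount

print(balance_after_withdrawal(100, 30))  # prints 130, not the expected 70
```

Note how the third case produces no error message at all; only someone who knows the expected answer would ever notice.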

Point #1: The earlier you find an error the less time it takes to fix.

Strong Typing Is Your Friend
It follows logically that any error caught by the compiler saves you gobs of time. This is why I’m such an adamant fan of strong typing, where variables can only contain one type of value, such as a string or integer. Languages like Java and .NET follow this pattern. In languages like Perl and PHP you can stuff pretty much any value into anything and it won’t crash while being compiled (or interpreted), but will tend to generate a runtime or logic error somewhere down the line. [Thanks to Ben for the correction on static vs. strong typing (see comments for more information)]

Although the flexibility of weak typing is appealing for cranking out small projects very quickly (since you’re allowed to be pretty sloppy), the payback on the debugging side tips the scales much in favor of strong typing for anything larger than a small application.
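A quick sketch of the trade-off, again in Python (which, per Ben’s correction, is strongly but dynamically typed, so the mismatch below slips past any compile step and only surfaces when that line actually executes; `add_tax` is a made-up example function):

```python
def add_tax(price, rate):
    return price * (1 + rate)

print(f"{add_tax(100.0, 0.08):.2f}")  # prints 108.00

order_total = "100"  # a string sneaks in, e.g. from user input or a DB field
try:
    add_tax(order_total, 0.08)  # accepted happily until runtime
except TypeError as e:
    print("runtime error:", e)
```

In a statically typed language the compiler rejects the string argument before the program ever runs, which is exactly the “catch it early” win described above.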

Point #2: Use strongly-typed languages. The more the compiler catches, the less time you spend debugging.

If I Only Had A Brain
Given that compiler errors are so dreamy and logic errors so ghastly, it raises the question: why don’t we build compilers that catch more errors? Why don’t they find runtime and (heaven forbid) logic errors while they’re in there mucking around with our code? Wouldn’t it be amazing if a compiler told you “You’re missing a semicolon, and I think your balances aren’t going to add up because you’re missing a minus sign”?

The short answer is because it’s really, really hard. Until data is pulled from a database, a file is opened, or user input is given, the compiler has no idea which paths the execution might follow.

Although compilers have become smarter over the years (VS.NET warns me when I’m doing a number of stupid things), they haven’t reached the point of employing Artificial Intelligence (AI) in an attempt to see the future. Compilers operate with no a priori knowledge of the code beyond the language syntax definitions, and it’s nearly impossible to determine how a person, file or database is going to interact with an application until it actually happens. But then again, it’s nearly impossible to provide accurate search results over billions of documents in a fraction of a second to millions of simultaneous users, and I think there’s a company out there making a few bucks doing that these days.

This problem is hard, but not unsolvable. Computer languages are little more than semantic rules governing communication between developer and machine; semantic rules that are very similar to our written languages such as English, Spanish, German, etc… If Google can build a tool to translate websites through gobs of pattern matching (a 20 billion word corpus), why can’t we build a compiler that uses a hundred million lines of code as a corpus, learning the basic patterns and using them as templates by which to make judgments on code it’s compiling? Since software is based on recurring patterns (open a file, loop through records, parse some things), this is a realistic approach.
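To caricature the corpus-driven idea in a few lines: learn which token pairs are common in a body of known-good code, then flag pairs the model has rarely or never seen. This is a deliberately naive sketch (whitespace tokens, bigram counts); a real system would need a far richer model, and every name and snippet here is invented:

```python
from collections import Counter

def bigrams(tokens):
    return list(zip(tokens, tokens[1:]))

# A tiny "corpus" of known-good code, tokenized naively on whitespace.
corpus = [
    "open file read line close file",
    "open file write line close file",
    "for record in records process record",
]

model = Counter()
for snippet in corpus:
    model.update(bigrams(snippet.split()))

def suspicious(snippet, model, threshold=1):
    """Flag token pairs seen fewer than `threshold` times in the corpus."""
    return [pair for pair in bigrams(snippet.split()) if model[pair] < threshold]

# Closing a file before reading it never occurs in the corpus, so the
# unfamiliar ("file", "close") pair gets flagged.
print(suspicious("open file close file read line", model))  # [('file', 'close')]
```

Scale the corpus up to a hundred million lines and the counts become judgments about which code patterns look plausible, which is the heart of the proposal.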

From Joel on Software: “A very senior Microsoft developer who moved to Google told me that Google works and thinks at a higher level of abstraction than Microsoft. ‘Google uses Bayesian filtering the way Microsoft uses the if statement,’ he said. That’s true.”

Compilers these days use the if-statement approach; isn’t it time we moved to a higher level of abstraction?

Point #3: If we could build a compiler to catch runtime and logic errors, we would save millions of developer hours per year.

How much is that worth?


“We Became a Technology Company and No One Noticed.”

Thanks to Jeremy for the insightful quote.

Whether it’s network up-time, software that runs without ceasing, or the IVR System handling thousands of calls per hour without blinking, our enterprise hinges on the IT department’s ability to perform in real-time. I guess it’s always been this way, but it didn’t seem like it a few years ago.

When I started working for my current employer there were fewer than 100 employees, and only five developers. We had several apps running in a call center and a lot of back-end stuff stored in a SQL Server database, but overall we were a small, fast-paced IT shop inside a small, fast-paced startup. We were a small piece of the much bigger company around us, and although our work was valued, it was second or third in line behind increasing sales, growing market share, and all those other things startups worry about. Outages were rare and dealt with swiftly, but they were far from catastrophic.

Today we have 15 developers and are looking to double that in the next six months (if you live in L.A. and have .NET experience, contact me). We’re a software machine cranking out everything from desktop apps to TCP/IP listeners. Without our software churning night and day the company would come to a grinding, screeching, mind-numbing halt.

We’ve been ushered out of the league of a startup, where you can throw something into production and hope it works, to a legitimate corporate entity where downtime runs in the 4 or 5 figures per hour. I can’t think of another piece of our organization that has such a real-time impact on our bottom line.

Although technological innovation is not how we make our money, technological prowess and consistency sure are. We may be a financial services company to our customers, but we’re really a technology company at heart.

At some point we became a technology company and didn’t even notice.

Want to Become a Millionaire? Be an Entrepreneur…

I’m listening to The Millionaire Mind on CD during my mini-commute (I can’t, in good conscience, call it a commute when people I work with drive 2-3 times longer than I do). The Millionaire Mind is one of those books, much like Freakonomics, Good to Great, and The Millionaire Next Door (the prequel to the title at hand) that’s based almost solely on cold, hard data.

Thomas J. Stanley, a researcher who has studied the wealthy in the United States for almost as long as I’ve been alive, discovered some amazing trends in his data, which led to the conclusions he shared in The Millionaire Next Door. Things like:

  • 50% of millionaires have never spent more than $399 for a suit, $140 for a pair of shoes, or $235 for a wristwatch.
  • Most millionaires have never spent more than $31,900 on a motor vehicle.
  • Only a tiny fraction of millionaires inherited their wealth; most became wealthy in one generation.
  • Although self-employed people make up 20% of the workers in America, two-thirds of the millionaires are self-employed.

The gist of his first book is that people who appear wealthy, the lawyer down the street with the $3,000 Armani suit, or the physician driving the $100,000 sports car, are most often Under-Accumulators of Wealth, meaning they earn a lot but their net worth is small due to their purchasing habits. Page after page of data supports his claims.

In The Millionaire Mind, Stanley takes it one step further and delves into the educational backgrounds of millionaires, how they invest, their family life, and on and on…

One of the key points is that becoming a millionaire is much more likely if you own your own business. What’s even more interesting is almost all of the millionaire entrepreneurs he surveyed run what he calls “dull-normal” businesses; they are welding contractors, auctioneers, rice farmers, pest controllers, coin and stamp dealers, and paving contractors.

This blows the lid off theories that you need to be a basketball star, stock market maven, or famous actor to become wealthy. In fact, since these occupations demand that you spend a large portion of your income to keep pace with your peers, they often result in people with seven or eight figure incomes winding up with a very small net worth. Look at MC Hammer or Michael Jackson – both made millions upon millions of dollars from their music careers and have experienced serious money troubles once the millions stopped rolling in.

Another key point Stanley makes is that you don’t have to be at the top of your class to become wealthy. Of the millionaires surveyed, only 2% were in the top 1% of their college class. The average GPA is a modest 2.92 on a 4-point scale. Stanley’s theory is that wealth is generated much more by creative and practical (common sense) intelligence than by analytical intelligence (book smarts). Hard work and discipline also rate very high on the scale of needed attributes.

I can’t recommend these books enough for those interested in entertaining, yet sound theories on wealth and the wealthy.

Build a Game in a Week, Google Buys a Search Algorithm, and more…

This guy built a role-playing game in 40 hours without using any game engines. It has an old-school look and feel, but it’s impressive for 40 hours of work.

Google bought a search algorithm invented by an Israeli student. The algorithm, called Orion, creates a list of topics related to a search based on click-history.

And, if you need some attractive, free icons for your web application (I suppose they could work for a desktop app, as well), check out these Mini Pixel Icons.

Best Software Writing II Nominations

Last year, working with Apress, Joel Spolsky put together a book called The Best Software Writing I full of articles about software development from around the web. Volume 2 will be out later this year, and Joel is looking for nominations of good writing about software.

I would love to get a Software by Rob article in the book. You can nominate your favorite articles by posting links to Joel’s discussion group. All nominations must include the nominator’s full name and correct email address to be considered.

If you’re a fan of Software by Rob, please take 30 seconds and post a link to one of my articles on his discussion board.

Here are the most popular Software by Rob articles from the past year:

  1. Software Training Sucks: Why We Need to Roll it Back 1,000 Years
  2. Timeline and Risk: How to Piss Off Your Software Developers
  3. Why Expectations Can Kill You and What You Can Do About It

An April Fool’s Day Joke from Google

An April Fool’s Day joke from Google.

The barbecue is hilarious.