Wednesday, May 27, 2020

Silver Bullets in Software Development

No Silver Bullet
In his 1986 essay "No Silver Bullet," Fred Brooks argued that nothing would provide a tenfold improvement in software development within a decade:
But, as we look to the horizon of a decade hence, we see no silver bullet. There is no single development, in either technology or in management technique, that by itself promises even one order-of-magnitude improvement in productivity, in reliability, in simplicity. 
He divided software development into essential and accidental difficulties:
  • essential ones arise from the complexity of the problem itself - the software needs to satisfy a "conceptual construct" in a precise manner
  • accidental ones (like determining the correct syntax) are incidental to the problem and can become simpler with better hardware and software techniques.
Accidental difficulties had been reduced to such an extent by 1986 that Brooks argued most software development dealt with essential complexities that could not be removed. While incremental progress would be possible, revolutionary "silver bullets" were not to be expected. Brooks reiterated his claims in 1995, but it's worth revisiting them again. Have there been any silver bullets since then? How much of software development today deals with essential vs. accidental problems?

Silver Bullets. Photo Credit: Money Metals, Flickr

Software Development Today
On the one hand, there has been tremendous progress in speeding up development and in letting engineers focus on the essential problems:
  • Google and StackOverflow let one quickly find answers to questions
  • Open source libraries allow for broad code re-use
  • Cloud services like AWS make it easier to launch in production
  • Frameworks like Ruby on Rails provide default assumptions so the engineer can focus on defining the product
On the other hand, it seems like much of the engineering work today, particularly at large companies, deals with complex issues not directly connected to defining a product:
  • As products grow to encompass multiple teams, applications may be split into sub-applications for each team, but integrating them adds additional layers of complexity
  • Integration tests involve so many systems that they're a constant point of failure, and often adding or updating a feature can require more time dealing with tests than with the actual code
  • As products grow larger and scale to more users, engineers spend more time on smaller optimizations
The move from desktop applications to the web also added new layers of complexity:
  • Application logic needs to be replicated on both the server and client side (a sketch of this duplication follows this list)
  • Every language and framework needs to compile down to JavaScript, an unusual choice for an "assembly" language
  • Since application data isn't generally stored on the client, latency becomes a constant issue
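As a minimal illustration of this duplication (a hypothetical example, not from the original post): a validation rule written on the server in Python has to be re-implemented in JavaScript on the client, and the two copies must then be kept in sync by hand.

```python
# Server-side validation in Python (hypothetical rule, for illustration only).
# The same rule has to be re-implemented in JavaScript on the client so the
# user gets instant feedback without a round trip, and the two copies must
# then be kept in sync by hand.
import re

USERNAME_RE = re.compile(r"^[a-z0-9_]{3,20}$")

def validate_username(username):
    """Return a list of validation errors; an empty list means valid."""
    errors = []
    if not USERNAME_RE.match(username):
        errors.append("Username must be 3-20 characters: a-z, 0-9, or _.")
    return errors

# The client-side twin would look roughly like:
#   const USERNAME_RE = /^[a-z0-9_]{3,20}$/;
#   const validateUsername = (u) => USERNAME_RE.test(u) ? [] : ["..."];
```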
Depending on where one draws the dividing line, these problems can be considered either essential or accidental. While they do not deal with specifying the product itself, they arise from the size of the teams or from the technologies involved. Software development still deals with both the essential aspects of specifying a product and the many nuts and bolts of making it work correctly in the real world. 

A Silver Bullet 
There is a silver bullet that has completely revolutionized development - machine learning. Brooks had specifically dismissed AI as a silver bullet since back then AI meant "heuristic" or rule-based programming, where each product would still need all its details specified:
The techniques used for speech recognition seem to have little in common with those used for image recognition, and both are different from those used in expert systems. I have a hard time seeing how image recognition, for example, will make any appreciable difference in programming practice. The same problem is true of speech recognition. The hard thing about building software is deciding what one wants to say, not saying it. No facilitation of expression can give more than marginal gains.
Enter machine learning (ML), particularly deep learning with neural networks. Now the same overall techniques can be used for both speech recognition and image recognition. One no longer needs to decide precisely what "one wants to say"; one just specifies a goal and, given enough data, the neural network figures out the details. Systems that once required years of coding can be replaced with a model that learns on its own. For example, AlphaZero was able to learn chess by playing against itself for a few hours, after which it beat the best existing chess software. Programmers had spent decades improving chess software with hand-written heuristics, but machine learning outplayed them all.
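As a rough illustration of this shift (a sketch using scikit-learn purely for illustration, not anything from Brooks's essay): the same few lines of training code apply whether the rows hold image pixels or audio features; only the data and the labels change, not the program.

```python
# Minimal sketch: the *same* training code can serve different domains.
# Only the data changes (pixel features for images, spectrogram features
# for audio); the "program" is simply "fit a model to labeled examples".
from sklearn.neural_network import MLPClassifier

def train(features, labels):
    """Fit a small neural network; the caller decides what the data means."""
    model = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=200)
    model.fit(features, labels)
    return model

# image_model  = train(image_pixels, image_labels)      # image recognition
# speech_model = train(audio_features, phoneme_labels)  # speech recognition
```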

What's Next
Despite the amazing progress of ML, most areas of software development do not have enough data to truly benefit from it, so they still have the same overall structure and process as they did years ago. What, then, are the next areas of progress?
  • Assisted programming - generating a program from a product definition has always been a dream (even mentioned by Brooks), and there's been recent progress. For the near future, humans are still needed to specify the nitty-gritty details in code. But online resources like StackOverflow and GitHub (besides companies' internal codebases) contain enough data that search and ML algorithms will be able to assist in this process. A significant part of programming can be finding an example and modifying it for one's purpose, so even better search alone will speed up overall development.
  • Much of programming consists of plumbing - connecting databases to an application, determining how to summarize the data, deciding how to display it in a UI (a sketch of this kind of plumbing follows this list). Since some of this is very standardized, companies can choose to build these pieces with "low code" tools, using products built for that purpose (e.g. from Salesforce) or even just advanced spreadsheets (Airtable). While visual programming lacks the power and flexibility needed for building large applications, some products have much smaller scopes.
  • Some application plumbing will no longer be necessary for other reasons - ML will take over optimizing certain goals from humans, so a user interface will no longer be needed. For example, when an ad campaign runs on ML, far fewer knobs and dials need to be created for users: the system just takes in a budget and perhaps a goal to optimize for (a second sketch below illustrates this). In some cases, developers may still create tools for users to interface with the ML system, but in other cases the system will be a fully automated black box. Developing user interfaces might remain the same, but which interfaces are needed will change.
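To make the plumbing bullet concrete, here is a hypothetical sketch of the glue code that low-code tools aim to replace: pull rows from a database, summarize them, and shape the result for a UI. The table and column names are invented for illustration.

```python
# Hypothetical plumbing of the kind "low code" tools aim to replace:
# read from a database, summarize the data, and shape it for a UI layer.
import sqlite3

def sales_summary(db_path):
    """Return total sales per region, ready to render in a table widget."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT region, SUM(amount) FROM orders GROUP BY region"
        ).fetchall()
    finally:
        conn.close()
    # Shape the data the way the UI layer expects it.
    return [{"region": region, "total": total} for region, total in rows]
```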
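And for the last bullet, a hedged sketch of how the user-facing surface shrinks once ML takes over the optimization: a hand-tuned campaign exposes many knobs, while an ML-driven one might need only a budget and a goal. The field names are hypothetical.

```python
# Hypothetical contrast: a hand-tuned ad campaign exposes many knobs,
# while an ML-driven one only needs a budget and a goal to optimize for.
from dataclasses import dataclass
from typing import Tuple

@dataclass
class ManualCampaign:          # every detail is a user-facing control
    budget: float
    bid_per_click: float
    target_ages: Tuple[int, int]
    target_regions: Tuple[str, ...]
    schedule: str

@dataclass
class MLCampaign:              # the system learns the rest on its own
    budget: float
    goal: str                  # e.g. "maximize_conversions"
```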
In short, software development will continue to make incremental progress in some areas and add accidental complexity in other areas, while some areas will be completely revolutionized by ML.





2 comments:

  1. When I was a newbie programmer 40 years ago, the talk was about 4GL programming languages that would allow you to code at a very high level. It never happened. What happened was the company decided to buy packages instead. And that was a different can of worms.

  2. A few thoughts:

    1) I totally agree that the accidental aspect should have been solved. And yet, I find it fascinating that every new generation of programmers feels the need to either abandon or re-create existing tools. When you look at things like Atom, VS Code (not Visual Studio), Sublime, Brackets, etc... you sort of have to wonder - why? When you have tools like IntelliJ, Eclipse, and Visual Studio, it becomes very hard to understand why you'd choose those other options - namely because they don't have any of the defensive aspects that help you code. Their autocompletion is not good, their integration with standard languages and tools like compilers, debuggers, profilers, etc... is also very poor, and they are just very weak relative to what is available. And yet, here we are, seeing yet another generation choosing Atom over IntelliJ. So - it's an observation of peculiarity.
    2) I would say that Brooks was right - it is still about what you want to do with the data, not how to get it. Yes - nowadays, machine learning very much simplifies the process of determining whether what was presented in front of the machine was a picture of a cake, a paragraph about a cake, or a recorded message asking for a cake. 20 years ago it would have taken a team of 100 engineers and scientists many years to do what can now be done in an afternoon. But that isn't the whole picture. The fact that you know that the data in front of you is a cake doesn't mean you're done; the real question is - do you eat it, or do you put it away - i.e., what exactly are we doing with the cake. In the end, that is what humans are still about - making those decisions. Although the classification problem is much easier, we haven't been able to get rid of the problem of what to do with the data once it's been defined. In some ways, AI/ML is just another view of the problem. If you think about it, in 1995 when E-Trade went online as a Java Server, it took 100s of engineers multiple years to put together an application that could take a stock order and execute it, and show you the current stock price, along with some news about it. That server was also famously only able to handle 2 queries / second. Today, you can probably ask a college freshman to build something like this in an intro to programming course in Python, and they'd be able to do it over a weekend. What changed? Well - the entire ecosystem of tools and machinery changed and evolved in a manner similar to the ML/AI situation. It's just a different view of the problem. At the end though, we haven't gotten rid of the problem of "what do I do with the stock data now that I have it?". The mechanics became simpler, but the decisions of what to do with it are just as hard.
