Diagnosing Java code: The future of software development
Loyal readers: I regret to inform you that this will be the last column in the Diagnosing Java code series. I've (finally) finished my Ph.D., and I'm off to the industrial labs to help start a new research program in programming languages.
In this farewell article, let's have some fun and look into our crystal ball. We'll discuss some of the prevailing trends in the software industry and the impact we can expect these trends to have on the future of software development. We'll focus our discussion through the lens of effective software development we've used in this series over the past two and a half years. As always, let's pay particular attention to the crucial role that effective error prevention and diagnosis play in allowing us to manage our increasingly complex digital world.
There are three prevalent trends in computing to consider when examining where the industry is heading as a whole. These are:
- The ever-increasing expense of developing and maintaining software
- The growing ubiquity of software in our daily lives
- Continued improvements in raw computing performance
Together, these trends are reshaping the fundamental nature of software and the engineering trade-offs involved in constructing it. At a high level, the expense of software and the ubiquity of it are pushing us in the direction of greater and more pervasive abstractions, such as powerful virtual machines with well-defined semantics and security properties, allowing us to develop and maintain software more easily and across more platforms.
At the same time, the continued improvements in computing performance enable us to build these abstractions without suffering unacceptable performance degradation. I'd like to take a stab at some of the ways I believe we could construct new abstractions that would help in building the next generation of software products.
Component-based programming promises two complementary benefits, both of which will become increasingly important as the above-mentioned trends become more and more prominent. First, component-based systems allow for much greater reuse. For example, consider the myriad programs today that provide text-editing support (mail clients, word processors, IDEs, and so on). Also, consider the numerous clients that provide support for handling e-mail. Despite the number of programs offering these services, few programs handle e-mail as well as a dedicated e-mail client. Likewise, no mail program will allow text manipulation at the same level as a dedicated text editor. But why should every mail client (IDE, word processor, and so on) have to develop its own text editor? It would be so much better if there were a standard "text-editor" API that could be implemented by various third-party components. Tools such as mail clients could choose their favorite implementation of this API and plug it in. In fact, one can even imagine users assembling their environment from off-the-shelf components (such as their favorite editor and their favorite mail client) that could be linked dynamically when an application is run.
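To make this concrete, here is a minimal sketch of what such a pluggable text-editor API might look like. The interface name, methods, and classes below are all hypothetical illustrations, not any actual published standard:

```java
// Hypothetical sketch of a standard "text-editor" API; the interface
// name and methods are illustrative, not an actual standard.
interface TextEditor {
    void insert(String text);   // append text at the cursor
    String getContents();       // return the full buffer
}

// A third-party component implementing the API.
class SimpleTextEditor implements TextEditor {
    private final StringBuilder buffer = new StringBuilder();
    public void insert(String text) { buffer.append(text); }
    public String getContents() { return buffer.toString(); }
}

// A mail client that programs against the API only; any conforming
// implementation can be plugged in when the application is assembled.
class MailClient {
    private final TextEditor editor;
    MailClient(TextEditor editor) { this.editor = editor; }
    void composeBody(String text) { editor.insert(text); }
    String body() { return editor.getContents(); }
}
```

The key point is that MailClient depends only on the TextEditor interface, so swapping in a different editor component requires no recompilation of the client.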
Another benefit of a component-based model is the potential for greater testing. In the Java language as it exists today, external references to classes, such as the I/O library classes and the like, are hard-wired references that cannot be altered without recompilation. As a result, it is difficult to test parts of programs that rely on external references in isolation. For example, it is difficult to test whether a program is making proper use of the filesystem without actually allowing it to read and write from the filesystem. But reading and writing files in unit tests slows tests and requires adding more complexity (like creating a temp directory and cleaning up files after use). Ideally, we would like to separate programs from external references to the I/O libraries for the purposes of testing.
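One way to approximate this separation in today's Java language is to route all filesystem access through an interface, so that unit tests can substitute an in-memory fake. The names below are hypothetical; this is a sketch of the technique, not a standard API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical abstraction over the filesystem.
interface FileStore {
    void write(String path, String data);
    String read(String path);
}

// In-memory fake for unit tests: no temp directories, no cleanup,
// and no slow disk access.
class InMemoryFileStore implements FileStore {
    private final Map<String, String> files = new HashMap<>();
    public void write(String path, String data) { files.put(path, data); }
    public String read(String path) { return files.get(path); }
}

// Code under test depends on the interface, not on java.io directly,
// so its use of the "filesystem" can be verified in isolation.
class ReportSaver {
    private final FileStore store;
    ReportSaver(FileStore store) { this.store = store; }
    void save(String name, String report) { store.write(name, report); }
}
```

A production configuration would supply a FileStore backed by real files; the tests supply the fake and then inspect it directly.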
There are numerous ways in which we can formulate a component model. J2EE provides such a model at the level of objects for Web services. Eclipse provides a model for IDE components. Jiazzi provides a model in which independently compiled "units" of software can be linked to form a complete application. Each of these formulations has its use in particular contexts; we should expect to see yet more formulations in the coming years.
Ensuring that the requisite invariants are satisfied is a necessary aspect of component encapsulation. In general, a client programmer will have no way to reason about how a component will behave other than what is said in the published API. Any behavior of a component that isn't included in the API is not behavior that the client programmer can rely on. If non-published behavior results in a run-time error, it will be exceedingly difficult for the programmer to diagnose the problem and repair it.
There are several research projects underway to significantly improve the sorts of invariants we can specify for a component. Some of them, such as Time Rover (see Resources for the July 2002 column), use modal logic and other logical formalisms to express deep attributes of run-time behavior.
Another approach to expressing invariants is to bolster the type system with generic types, types parameterized by other types (the topic of the most recent series of articles in this column).
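As a brief reminder of what generics buy us, here is a small sketch: the invariant "this list holds only Strings" becomes part of the type itself and is enforced by the compiler, rather than surfacing later as a ClassCastException at run time:

```java
import java.util.ArrayList;
import java.util.List;

class GenericsDemo {
    // The element-type invariant is checked at compile time;
    // no casts are needed when reading from the list.
    static String firstName(List<String> names) {
        return names.get(0); // no (String) cast, unlike a raw List
    }

    static List<String> sampleNames() {
        List<String> names = new ArrayList<>();
        names.add("Ada");
        // names.add(42);  // rejected by the compiler: not a String
        return names;
    }
}
```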
Yet another approach to adding much more powerful invariants is that of dependent types. Dependent types are types parameterized by run-time values (compare this with generic types, which are parameterized by other types).
The canonical example of dependent types is that of an array type parameterized by the size of the array. By including the size in the type, a compile-time checker can symbolically analyze the accesses to the array to ensure that no accesses are done outside the bounds of the array. Another compelling use of dependent types is that of ownership types, developed by Boyapati, Liskov, and Shrira (see Resources for a link to their original paper).
Ownership types are types of objects that are parameterized by an owner object. For example, consider an iterator over a container. It is natural to say that the iterator is owned by the container, and therefore, that the container has special access privileges to the iterator. Inner classes provide some of the same controls over access privileges, but ownership types provide a much more powerful and flexible mechanism.
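The inner-class approximation mentioned above can be sketched as follows. This is not ownership types themselves (which the Java language does not support), only the limited access control that inner classes already give us: the iterator's class is private, so only the container can construct one, yet the iterator retains privileged access to the container's private state. The class names are hypothetical:

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// A container whose iterator is "owned" in the weak sense that only
// the container can create it; clients see it only as Iterator<Integer>.
class IntBox implements Iterable<Integer> {
    private final int[] items;
    IntBox(int... items) { this.items = items; }

    public Iterator<Integer> iterator() { return new OwnedIterator(); }

    // Private inner class: privileged access to the container's
    // private array, invisible and unconstructible outside IntBox.
    private class OwnedIterator implements Iterator<Integer> {
        private int next = 0;
        public boolean hasNext() { return next < items.length; }
        public Integer next() {
            if (!hasNext()) throw new NoSuchElementException();
            return items[next++];
        }
    }
}
```

Ownership types would go further, letting the type system itself track and enforce which object owns which, rather than relying on lexical nesting.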
Continued improvements in development tools
Unit testing tools allow us to check that key invariants of our programs continue to hold under refactoring. Refactoring browsers provide many direct and powerful ways to modify code while preserving behavior. We are starting to see "second generation" unit testing tools that leverage static types and unit tests to mutual advantage, allowing for automatic testing of code coverage and automatic generation of tests. Refactoring browsers are adding more and more refactorings to the standard repertoire. Long range, we should look for even more sophisticated tools, such as "pattern savvy" refactoring browsers that recognize uses (or potential uses) of design patterns in a program and apply them.
We can even expect development tools to eventually leverage the unit tests to perform more aggressive refactorings. In Martin Fowler's classic text, Refactoring: Improving the Design of Existing Code, a refactoring is defined to preserve the observable behavior of a program. However, in general we are not concerned about all aspects of the observable behavior of a program; instead, we generally care about maintaining certain key aspects of the behavior. But these key aspects are exactly what the unit test suite is supposed to check!
Therefore, a refactoring browser could potentially leverage the unit test suite to determine what aspects of behavior are important. Other aspects could be aggressively modified by the refactoring browser at will, in order to simplify the code. On the flip side, the functionality of such a refactoring browser could be leveraged to check test coverage by determining the kinds of refactorings that are allowed by the unit tests and reporting them to the programmer.
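The idea that the test suite defines which behavior matters can be sketched with a deliberately tiny example. The class and method names below are hypothetical; the point is that any rewrite of the internals that keeps the assertions passing is, by our own definition, an acceptable refactoring:

```java
class PriceCalculator {
    // Original version: a straightforward accumulation loop.
    static int total(int[] prices) {
        int sum = 0;
        for (int p : prices) sum += p;
        return sum;
    }

    // Refactored version: different internals, same observable
    // behavior -- exactly the equivalence the test suite pins down.
    static int totalRefactored(int[] prices) {
        return java.util.Arrays.stream(prices).sum();
    }
}
```

A sufficiently aggressive refactoring browser could, in principle, perform the second rewrite automatically, using the passing test suite as its license to do so.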
Java Platform Debugger Architecture (JPDA) provides for exactly such a facility by allowing a debugger to run on a separate JVM; then we can use RMI for the remote debugging session. But, in addition to remote debugging, diagnosis can be made much more efficient by giving the programmer more control over the access points available when starting a debugger and the available views of the state of the computation during debugging.
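For reference, attaching a remote debugger via JPDA amounts to launching the target JVM with the JDWP agent enabled; a debugger on another machine then connects to the listening socket. The main class name here is a placeholder:

```shell
# Launch the target JVM with the JDWP agent listening on TCP port 8000,
# without suspending the application at startup ("MyApp" is a placeholder).
java -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=8000 MyApp
```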
Even with modern debuggers, programmers still have to resort to ad hoc techniques, such as inserting print statements, to trace the aspects of a program's behavior that the debugger's fixed views do not expose.
Projects like Eclipse take this philosophy one step further and provide ways to interoperate tools to leverage the functionality of one another and provide services beyond what is possible with any of the tools in isolation. With time, we should expect this model, or others like it, to truly "eclipse" traditional all-in-one IDEs.
One long-range solution to this problem could be to embed meta-level knowledge into applications that encodes the context in which the application is run and what it is supposed to do. For example, meta-level knowledge for a word processor would include logic explaining that the program was used by humans on personal computers to produce English documents that are then read by other humans. With this knowledge encoded, an application could potentially make inferences about what a user was trying to do when something goes wrong (of course, the application would also have to determine that something was wrong in the first place).
Such meta-level knowledge is potentially a powerful mechanism for adding robustness to an application. It is also extremely dangerous, and what is most worrisome about the danger is that it is often overlooked by the strongest advocates of moving in this direction. After all, the behavior of an application that dynamically reconfigures itself can be extremely unpredictable. A user may find it quite difficult to reason about how the program will behave under certain circumstances. And the developers of the program can find it extremely difficult to assure themselves of its reliability. As we've seen time and again, the inability to predict and understand a program's behavior has easily predictable consequences -- buggy software.
To be clear, I really do think that adaptive software with meta-level knowledge about the context in which it's used has the potential to vastly improve the capabilities of software applications. But if we add such capabilities, we must find ways to do so that still allow us to reason effectively about our programs.
A great example of a software system that incorporates a form of meta-level knowledge and adaptability (albeit extremely limited) without sacrificing predictable behavior is the TiVo personal video recorder (or other similar products). TiVo uses your television viewing habits to adaptively determine what shows you might like to watch, but this adaptability is stringently restricted. TiVo will always follow user directives for the shows to record, regardless of its adaptive behavior. TiVo uses a very simple form of meta-level knowledge, but even as such knowledge becomes more and more complex, we should continue to keep control over adaptive behavior. If you'll forgive a somewhat fanciful comparison from the realm of science fiction, we could follow the precedent set by Isaac Asimov. Asimovian robots were extraordinarily powerful machines, but they were controlled by absolute and inviolable fundamental laws, allowing for some degree of predictability in their behavior.
I bid you a fond farewell
To my readers: I hope you have found some lasting value in these articles. Writing them has been an invaluable learning experience for me. Thank you for your patronage, and I wish you the best of luck in your efforts to prevent and diagnose bugs in your programs.