Tuesday, September 29, 2009

Java Fork/Join Framework

In this paper, the authors demonstrate the feasibility of a pure Java implementation of a scalable parallel processing framework, which is expected to be included in JDK 7. Fork/join algorithms resemble divide-and-conquer algorithms: a problem is broken down into smaller subproblems, the subproblems are computed (possibly in parallel), and finally the results are merged.
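
To get my head around the programming model, here is a minimal sketch of the kind of fork/join task the framework supports (my own example, written against the RecursiveTask/ForkJoinPool classes described for JDK 7; the exact package names may differ in the final release):

    import java.util.concurrent.ForkJoinPool;
    import java.util.concurrent.RecursiveTask;

    // Sums a slice of an array: split until the slice is small enough to
    // compute directly, fork the halves, then merge the partial results.
    class SumTask extends RecursiveTask<Long> {
        private static final int THRESHOLD = 1000;
        private final int[] data;
        private final int lo, hi;

        SumTask(int[] data, int lo, int hi) {
            this.data = data; this.lo = lo; this.hi = hi;
        }

        protected Long compute() {
            if (hi - lo <= THRESHOLD) {            // base case: compute directly
                long sum = 0;
                for (int i = lo; i < hi; i++) sum += data[i];
                return sum;
            }
            int mid = (lo + hi) >>> 1;             // divide
            SumTask left = new SumTask(data, lo, mid);
            SumTask right = new SumTask(data, mid, hi);
            left.fork();                           // run the left half asynchronously
            long rightSum = right.compute();       // compute the right half here
            return left.join() + rightSum;         // merge the results
        }
    }

    // Usage: long total = new ForkJoinPool().invoke(new SumTask(array, 0, array.length));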

The experimental results in the paper comparing the performance of Java fork/join against similar frameworks yield some observations that are a bit unclear to me. For example, the speedups stay relatively flat in spite of an increase in the number of threads. Even though the authors mention application-specific reasons, it is likely that there are other, more fundamental reasons for this.
When this framework was announced for JDK 7, there was a great deal of excitement in the developer community. I am not sure whether it still commands as much enthusiasm as it did earlier, especially now that the JDK 7 early access downloads are out. Is the framework incorporated in this release? Can someone check this?

GNU/Emacs

Jim Blandy explains Emacs in a way that makes it feel like a multipurpose knife, given the range of components and functionality it offers. Although I have never been much of an Emacs user (I use vi instead), I understand the significance of the growth and development of Emacs over the years. As the author mentions, Emacs has followed a development process similar to that of an operating system, and this has been very beneficial. Even with all the latest development tools around, and the tremendous number of features they provide, Emacs remains a popular choice amongst programmers.

Maintainability of a text-editor-style package is sometimes termed "editability", and this article shows the apparent advantages of Emacs in this area. I have personally preferred Eclipse as a programming IDE (simply because I started learning it very early). However, with all the benefits provided by Emacs (and the latest release, which looks very cool), I plan to switch allegiance very soon.

Our Pattern Language

One of the most prominent aspects of this paper is the layered arrangement of the pattern language itself. The paper also gives a reasonably descriptive overview of each layer and of the language as a whole, and I think it serves as a primer on the development of similar languages. By breaking the system down into five layers, it gives readers a way to understand the language better. It will be interesting to see how well existing parallel programming software systems map onto the patterns specified by this language.

The authors have clearly defined the roles played by different kinds of programmers, and from my own experience I find this to be a very accurate representation. Even though I am only now beginning to work on parallel software systems, I can relate to the roles played by parallel software developers, especially with respect to handling concurrency issues and to the load-balancing (between processors) aspect of parallel programming. I feel that application development on existing software systems is one of the main drivers of the growth of parallel programming, and the authors have given this aspect enough attention in the paper. Overall, OPL caters to the needs of different types of programmers (not restricted to parallel programming) and also accounts for system changes, making it a fairly flexible pattern language.

Metacircular virtual machines

This chapter is, I believe, the most detailed treatment of a virtual machine implementation that we have seen so far in the course. More than before, I am able to appreciate the practical benefits of using Java and its strengths, such as modularity, the availability of many user-contributed libraries, and the supporting tools. Reading about Jikes, a virtual machine written in Java and hosted in Java's own runtime environment, was an eye-opener in itself; it is quite different from the other VMs we have encountered earlier in class. One of the interesting features of Jikes that I liked is its extensibility. Among the several other benefits of Jikes mentioned in passing, the one that caught my eye was the creation of an entire operating system based on the Jikes virtual machine. I am not sure how feasible this is and would love to know more about it. Also, beyond being an academic experiment, there do not seem to be any other advancements built on Jikes (if I am not wrong).

I also came across this paper, which describes the use of debuggers for metacircular VMs. Since Jikes is written in Java and is platform-independent, the authors argue that hybrid debuggers, which combine high-level platform-independent information with access to platform-specific code, would be ideal for such VMs. Overall the chapter was a nice read, though we could have done with fewer implementation specifics and more focus on the architecture as a whole.

Wednesday, September 23, 2009

JPC

Up until the development of JPC, there were no pure Java emulators for the PC available on the market. The feat achieved by the developers of JPC is incredible, considering the number of changes the x86 PC has undergone over the years. I find the speed achieved by JPC all the more remarkable given how many optimizations they had to incorporate to get there. That JPC runs at such speed in spite of having to boot an operating system up through 16-bit real mode is simply outstanding. I am not sure whether the capability to boot a fully graphical Linux will be supported by JPC in the coming months, but I'd sure be amongst the first to test it when it does get released.

JPC makes an ideal platform for doing security research, considering the number of security features that are built in. Expect anti-virus experts to come up with solutions for more and more viruses, worms and other malicious attacks on a PC without the fear of losing their own machines. With the cross-platform capability built in (by virtue of running in the Java sandbox), I think JPC should become very popular very soon. I haven't actually checked out the built-in DOS games that ship with the emulator. Maybe I should do that pretty soon. :)

The Adaptive Object-Model Architectural Style

AOM provides a novel way to develop dynamically configurable applications and software systems that emphasize flexibility and adaptability to changing requirements. In spite of the several advantages foreseen for such an architecture, there is also the question of being able to understand so many layers of abstraction, which I believe is a disadvantage. I looked into some techniques related to AOM, such as agile development and meta-modeling, and I find AOM to be a pretty solid architectural style.

In an interview, Joseph Yoder was asked about all properties being stored in the same way in an AOM, as compared to more traditional styles that keep properties indexed in files and the like, and I could not follow how the performance of AOM could be similar to that of the styles the interviewer mentioned. Maybe someone could clarify this? In the same interview there is also an interesting discussion on managing different versions of the object model.
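
To make the performance question concrete for myself, here is a rough sketch (my own, not from the paper or the interview) of what storing every property "in the same way" looks like: attributes become entries in a generic property bag instead of declared fields, so every access is a lookup rather than a field read.

    import java.util.HashMap;
    import java.util.Map;

    // An AOM-style entity keeps its attributes as data, so new properties can
    // be added at runtime without changing or recompiling the class.
    class Entity {
        private final Map<String, Object> properties = new HashMap<String, Object>();

        void set(String name, Object value) { properties.put(name, value); }

        Object get(String name) { return properties.get(name); }
    }

    // Usage: product.set("price", 42.0); Double price = (Double) product.get("price");
    // The flexibility comes from this indirection, and the performance question is
    // essentially about how much the indirection costs compared to plain fields.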

On a more tangential note, there is a blog by Jim Alateras which talks about several practical applications of AOM that I did not know of earlier. He mentions openEHR, which develops open source software in the domain of clinical implementation, healthcare education and so on. Do check it out.

Tuesday, September 22, 2009

Guardian: a fault-tolerant operating system

This chapter is a nostalgic trip down memory lane, and has much more hardware-themed content than the earlier papers we have come across so far. It is interesting to note how Tandem handled the limitations of address space, though I am not clear on why later versions of Tandem considered duplicating memory as an option. Also, now that we are on the subject of building reliable systems, it seems to me that all kinds of inefficient solutions to providing component dependability, such as freezing a CPU in cases of failure, had been under consideration. In a later paper by Lee et al., the causes of failures in Guardian are analysed in detail; the authors found that 77% of the failures were caused by software problems. It looks to me as if the single-failure tolerance of the Guardian system is not actually that beneficial. Another study found that memory management is the main source of software problems in Guardian. Guardian did, however, seem to perform better in terms of the number of faults than the other machines of its time, mainly IBM and VAX machines. Given the detail to which this chapter analyses the strengths and weaknesses in the design of Guardian, I think it should still serve as a case study (in spite of the era in which it was developed) in the design of similar systems today.

Big Ball of Mud

One of the major themes in this paper is that the big ball of mud pattern may not always be detrimental, at least during the initial stages of a piece of software's development. For a variety of reasons, such as lack of time or resources, it may indeed be the best way to go, and it is prevalent in many software designs. I like the "teachable moment" kind of advice given in the paper: that these systems should be studied in detail to understand what they accomplish, so we can learn lessons and incorporate them in the design of new systems. Technology shifts, for this reason, are good drivers of change for such systems.

As a student programmer without much experience in developing really large software systems, I always find it comforting to know that the system I am writing my code on is well documented and clean (especially if I am starting work on it for the first time), rather than being in the presence of a big ball of mud. In this context, I read some of the classic design mistakes posted by Steve McConnell, and it helped me put what the authors are saying in this paper into perspective. Among the forces the authors claim drive the development of a big ball of mud, I find the cost factor a bit confusing. The question of a quick-and-dirty project vs. a well-planned, expensively designed one is something I will probably never really resolve until I am involved in a larger, more-at-stake kind of work environment. I feel that for small companies the idea of a quick market entry, quick exit strategy seems more appealing as a way to cut costs, stay afloat and maybe even rake in some initial profit. In that scenario, the idea is not too bad.

Thursday, September 17, 2009

Xen and the Beauty of Virtualization

Xen is an open source software project that provides high-performance virtualization, allowing multiple operating systems to run concurrently on a single physical computer. Part of the reason for Xen's popularity is the number of different operating systems it supports and the wide variety of applications that can be run on it. The benefits of Xen which I feel are most important are its support for application mobility and its security and privacy features. Among the most useful additions in version 3.0 are the support for hardware virtual machines and the reuse of other open source projects, such as an emulated BIOS.

I am not much of a virtualization groupie, but I have used VMware before, and it was very useful when I wanted to switch between Windows and Linux applications without running Cygwin on my lab machine. I haven't tried Xen, but I will now that I am impressed by its inherent advantages, such as an architecture built on the concept of distrust. Another point is that Xen benefited greatly from its use of the Linux operating system (especially in its early days), and with support from major players in the market it has only grown in stature.

Layered Architectures

I like the description of the layered architecture provided in this chapter, especially the parallels drawn with networking protocol stacks. Amongst the benefits of this type of architecture, I feel the most important is the ability to shield higher layers from changes in lower layers, provided that only the lowest levels, typically the hardware, are subject to change. This type of architecture is well suited to a stable networking system. As Steven says, this design pattern provides a way to decompose large systems into collaborating objects and to consolidate segmented interfaces.
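
As a toy illustration of that shielding idea (my own sketch, not taken from the chapter): each layer depends only on the interface of the layer directly below it, so replacing the lowest layer leaves the upper ones untouched.

    // Lowest layer: the only part expected to change.
    interface Storage {
        String read(String key);
    }

    class FileStorage implements Storage {
        public String read(String key) { return "value-for-" + key; } // stand-in for real I/O
    }

    // Middle layer: sees only the Storage interface, never FileStorage.
    class RecordService {
        private final Storage storage;
        RecordService(Storage storage) { this.storage = storage; }
        String fetchRecord(String id) { return storage.read(id); }
    }

    // Top layer: sees only RecordService; a new Storage implementation
    // (a database, a network service) never reaches this far up.
    class ReportUI {
        private final RecordService service;
        ReportUI(RecordService service) { this.service = service; }
        void show(String id) { System.out.println(service.fetchRecord(id)); }
    }
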
Reuse of functionality is another benefit I feel is important and representative of this architecture. However, the choice of implementing this architectural pattern is entirely up to the design team; the pattern is easy to explain to them, and it is also easy to demonstrate how each layer fits into the overall picture. In theory all of this sounds very good, but it would require some effort on the part of the organization to get it right the first time. To quote Eric Evans, "If an unsophisticated team with a simple project decides to try a model-driven design with layered architecture, it will face a difficult learning curve." In a related paper, a study of the usage of layered architectures in industry concluded that any number of layering diagrams are possible for a given software architecture, and the authors point out that none of the architectures under study used multiple layering criteria in their design. Overall, I think the status and meaning of the term "layered architecture" has changed a lot since the days of the OSI seven-layer model and is still evolving today.

Tuesday, September 15, 2009

Data Grows Up: Facebook

In the context of this chapter, I was reading an article by Facebook's VP of Technical Operations, Jonathan Heiliger, on how to manage the growth of users and data while rolling out features on a regular basis. He mentions that for any organization to be successful it needs to embrace change rapidly. He also mentions that, unlike other organizations, Facebook does not have a QA team; they simply have a "cradle-to-grave" lifecycle for their code. Coming back to the chapter, I would like to comment on the application platform and the churn cycle of applications developed for Facebook. In a recent presentation it was mentioned that nine of the top fifteen Facebook applications are new, but on the upside there is a decline in churn for the top few spots, thanks to Facebook's efforts to control application spam.

Statistically, the Facebook API has been extremely popular, with nearly 12,000 applications produced since its launch. I think the main reasons for its success are its openness (in deciding what customers want), its targeted audience, the lure of potential riches to developers who can turn their application into a good business model without much interference, and the features which drive "viral" growth. In spite of the privacy concerns (which are present in most social networks anyway), the API has proven itself to be a strong driver of Facebook's growth over the last year. Among the several Facebook features mentioned in this chapter, I would like to single out FQL for the advantages it brings: reducing response size, providing a common syntax for all methods, and condensing Facebook queries.
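
For anyone who has not seen it, FQL reads like SQL over Facebook's data tables. A typical query (reproduced from memory of the documentation, so the table and column names may not be exact) fetches a user's friends' names in one request instead of several separate API calls; here it is wrapped in a Java string the way an application might send it:

    // Illustration of the condensed style FQL allows: one SQL-like statement
    // in place of several separate REST calls. Table and column names are
    // from memory of the old FQL docs and may not be exact.
    class FqlExample {
        static final String FRIEND_NAMES =
            "SELECT uid, name FROM user " +
            "WHERE uid IN (SELECT uid2 FROM friend WHERE uid1 = 1234)";
    }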

Thursday, September 10, 2009

Resource-Oriented Architectures

This chapter discusses resource-oriented architectures as an effective way to wrap reused code, services and so on with named interfaces that help prevent the leaking of implementation details. Among the innovations of the REST style is the way requests are processed. Previously, a request would be put into a structured package that could be forwarded and processed by participants in a workflow, which is like a contractual obligation and contrasts sharply with the REST style. One important aspect of resource-oriented architectures is their focus on the states of a data model. Among the several features mentioned, the idea of logically separating the concerns of nouns, verbs and representations appealed to me a lot. In addition to the characteristics discussed, aspects like keeping multiple copies of the resource data and using publish/subscribe messaging systems deserve more attention as well. Among the other benefits of ROA, the major ones I liked are the ability to extend a system by accounting for new representations and URIs, and the reliance on widespread web standards. While browsing for other resources, I came across an interesting presentation which described a board game (chess) remade using ROA. I do not think it is novel, but the presentation is excellent. Do check it out.
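
Coming back to the nouns/verbs/representations split, here is how I picture it in code (my own sketch, not from the chapter): the noun is the URI naming a resource, the verb is one of the uniform HTTP methods, and the representation is negotiated separately, with none of them exposing how the resource is implemented on the server.

    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    class ResourceClient {
        // Fetch a representation of the resource named by the URI (the noun)
        // using the uniform GET verb and asking for an XML representation.
        static InputStream fetch(String uri) throws Exception {
            HttpURLConnection conn = (HttpURLConnection) new URL(uri).openConnection();
            conn.setRequestMethod("GET");                           // verb
            conn.setRequestProperty("Accept", "application/xml");   // representation
            return conn.getInputStream();
        }
    }

    // Usage (with a made-up URI): ResourceClient.fetch("http://example.com/orders/42");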

Tuesday, September 8, 2009

ArchJava

This paper describes ArchJava, a backwards-compatible extension to Java for integrating a software architecture specification into the Java implementation code itself. Communication integrity and the seamless integration of architecture and implementation are presented as the main contributions of this work. I feel that one of its most important aspects is the ability to ensure that architecture and code are kept consistent as they evolve. The paper also makes a reasonably good attempt at program understanding, even with all the limitations specified.
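
To give a flavour of what this integration looks like (reconstructed loosely from the paper's compiler example, so I may have the syntax details wrong), components declare ports with provided and required methods, and a parent component connects them, which is how communication integrity gets checked:

    // ArchJava-style sketch: communication happens only through declared ports.
    public component class Scanner {
        public port out {
            provides String nextToken();
        }
        String nextToken() { return "tok"; }        // implementation of the provided method
    }

    public component class Parser {
        public port in {
            requires String nextToken();            // what the parser needs from its neighbour
        }
        void parse() { String t = in.nextToken(); /* ... */ }
    }

    public component class Compiler {
        private final Scanner scanner = new Scanner();
        private final Parser parser = new Parser();
        // The connection is part of the program text, so the declared
        // architecture and the implementation cannot silently drift apart.
        connect scanner.out, parser.in;
    }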

The paper considers code length to be a good indicator of the complexity of the underlying program, and even though there could be other factors as well, the methodology adopted to evaluate the effectiveness of ArchJava is sound. As I am unfamiliar with this topic, I cannot comment on whether ArchJava has been successfully applied to larger codebases. Among the advantages, I feel the most important is the ability to explicitly list the method-call communication between components, which makes the architectural description complete. On the other end of the scale, even though rewriting a program to use ArchJava is not shown to be complicated for the system they considered, that result cannot be generalized everywhere. I feel that rather than working across a wide variety of programs, it is too restrictive and suits projects that adhere to a standard set of specifications.

Making memories

Before anything else, I would like to say that I got lost in the way the discussion segued from the workflow of the LPS to the author's frustration with the directory structures of different deployment scenarios. In the same vein, it is not clear why the development had to be tied to agile software development; the reasons for it are not given. Even though I am not an expert with Spring, there is a general consensus that Spring is useful when patterns are present: if a particular problem fits a pattern we can use it, but it is not advisable to go looking for ways to apply all possible patterns. To comment on the overall theme of this chapter, I like the way the entire description unfolds from a discussion of the fundamental forces governing the Creation Center architecture, such as the business and its context.

I also like the philosophy of "fail fast, fail loudly" that is applied in the render engine. In the UI design, I think the policy of separating the visual appearance of a screen from the logical manipulation of its properties was a good implementation decision. In addition, the ability of forms to capture not only type-value relationships but also metadata is an important factor contributing to the clean model of the user interface. I also agree with the author's assessment that the property-binding architecture was one of the most significant aspects of the design developed in this work. Finally, I believe the selling point of the whole experience was the ability to "sell" the software to studio executives with much of the underlying complexity, or "plumbing" as the author puts it, removed.
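
My rough mental model of the property-binding idea (my own sketch, not the book's actual code) is that a form is built from named properties that carry their own metadata, and widgets are bound to those properties rather than to raw fields:

    import java.util.ArrayList;
    import java.util.List;

    // A form property holds a value plus metadata (label, required flag) and
    // notifies listeners when the value changes, so the visual widget and the
    // logical model stay in sync without knowing about each other directly.
    class FormProperty<T> {
        interface Listener<T> { void changed(T newValue); }

        private final String label;
        private final boolean required;
        private T value;
        private final List<Listener<T>> listeners = new ArrayList<Listener<T>>();

        FormProperty(String label, boolean required) {
            this.label = label;
            this.required = required;
        }

        void set(T newValue) {
            value = newValue;
            for (Listener<T> l : listeners) l.changed(newValue);   // push change to bound widgets
        }

        T get() { return value; }
        String label() { return label; }
        boolean isValid() { return !required || value != null; }
        void bind(Listener<T> listener) { listeners.add(listener); }
    }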

4+1 Views

This is a unique paper which presents a way to describe the architecture of software-intensive systems. One of the important aspects of this work is its use on several large projects, and the ability it gives various stakeholders, such as system architects, to look at an architecture from different perspectives. The focus on large projects, where a single blueprint may not suffice, is a selling point of the paper. The seminal nature of this work and its application over a long period of time leave me able to dig out practically no demerits.

I did feel that the example for the logical view in Figure 3a, showing the logical blueprint of the Telic PABX, was a bit unclear. On the other hand, the process view seems very well suited to taking into account non-functional requirements, which are an important aspect of software-intensive systems. A systematic understanding of the system is obtained by intentionally suppressing or displaying specific information in each view. Practical aspects of developing software-intensive systems are also taken into account using the collaboration diagrams, and overall I really have nothing but a deep sense of admiration and praise for this work.

Architecting for Scale

This chapter of the book describes the experience of developing Project DarkStar as a response to the changing needs of massively multiplayer games in terms of scaling and performance. I like the way the paradigm shift from programming for standalone PCs to programming for a large number of servers and clients is explained in a retrospective way. Among the various services provided in the DarkStar stack, I liked the way the communication services have been designed and developed. By not revealing the actual endpoints of the communication between clients and the server (a practice common in many peer-to-peer systems), the issues of scalability and of changing system components are handled well.
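
The endpoint-hiding point, as I understand it (my own sketch, not the actual DarkStar API): clients send to a named channel and the server decides which sessions receive the message, so clients never hold each other's addresses and the server is free to repartition work across machines.

    import java.util.ArrayList;
    import java.util.List;

    // A session is the server's handle for one connected client; the client
    // behind it is never exposed to other clients directly.
    interface Session {
        void deliver(String channelName, byte[] message);
    }

    // Messages go to the channel by name; the channel, not the sender, decides
    // which sessions hear them, so membership can change without clients noticing.
    class Channel {
        private final String name;
        private final List<Session> members = new ArrayList<Session>();

        Channel(String name) { this.name = name; }

        void join(Session session) { members.add(session); }

        void send(Session from, byte[] message) {
            for (Session s : members) {
                if (s != from) s.deliver(name, message);
            }
        }
    }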

The Data Service has also been defined well, and the requirements of the Data Store, though different from those of a regular database, are clearly laid out. On the question of testing the performance of such a system, I think the paper "Marios Assiotis, Velin Tzanov: A distributed architecture for MMORPG. NetGames '06, page 4" describes one method. Overall, the chapter is a good example of a system that allows the development of distributed, multithreaded games while providing a simple programming model.