Tuesday, November 10, 2009

Branch and Bound etc.

The three patterns in this set are Branch and Bound, Graphical Models and Structured Grids. First, Branch and Bound. Amongst the four operations of branching, evaluation, bounding and pruning, I feel that the most important and valuable one is the application of the objective function to a point in the search space. Evaluation becomes harder as the search space grows. Pruning a particular branch using this strategy is fairly straightforward, since it is easy to determine when the minimum value attainable on a branch of a linear integer programming problem is greater than or equal to the current best solution (a toy sketch of this bound-and-prune step appears at the end of this post). I feel that this kind of optimization is also quite similar to the Monte Carlo pattern that we studied earlier in the class. The flowchart given in the description demonstrates this pattern very well.

Graphical Models is the next pattern in this set, though it does not have much of a description. In addition to the examples mentioned, I would like to add Bayesian inference as one of the possible algorithms using this pattern. Code snippets would have been helpful in understanding this pattern for people not familiar with the mechanisms described in the paper.

The last pattern is Structured Grids. This is one among many patterns applicable to problems in image processing and computer vision. The description mentions that problems following this pattern perform updates until a particular error threshold is reached, and it seems to me that this is closely linked to the granularity of the tasks to be performed. Overall, this set of patterns was more closely linked to problems that I could relate to in the real world, and was a very interesting read.
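Since the core of the pattern is evaluating the objective and pruning when a bound cannot beat the incumbent, here is a minimal sketch of that step. This is my own toy example (a 0/1 knapsack-style maximization with made-up values, items pre-sorted by value/weight ratio), not the formulation from the pattern description:

```java
// Minimal branch-and-bound for a 0/1 knapsack-style problem (maximization).
// Items are assumed to be sorted by value/weight ratio so that the greedy
// fractional relaxation gives a valid upper bound for pruning.
public class BranchAndBoundKnapsack {
    static double[] value = {60, 50, 40};   // hypothetical item values
    static double[] weight = {10, 10, 10};  // hypothetical item weights
    static double capacity = 20;
    static double best = 0;                 // best complete solution seen so far

    // Upper bound: take remaining items fractionally (linear relaxation).
    static double bound(int i, double w, double v) {
        double b = v;
        for (int k = i; k < value.length && w < capacity; k++) {
            double take = Math.min(weight[k], capacity - w);
            b += value[k] * (take / weight[k]);
            w += take;
        }
        return b;
    }

    // Branch on item i: either include it (if it fits) or exclude it.
    static void search(int i, double w, double v) {
        if (i == value.length) {            // leaf: evaluate the objective
            best = Math.max(best, v);
            return;
        }
        if (bound(i, w, v) <= best) return; // prune: bound cannot beat incumbent
        if (w + weight[i] <= capacity)      // branch 1: include item i
            search(i + 1, w + weight[i], v + value[i]);
        search(i + 1, w, v);                // branch 2: exclude item i
    }

    public static void main(String[] args) {
        search(0, 0, 0);
        System.out.println("Best value found: " + best);
    }
}
```

The pruning line is where all the savings come from: whole subtrees are skipped as soon as their bound falls below the best solution found so far.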

Shared Queue, Speculation and Digital Circuits

The first thing that I would like to talk about is the shared queue, starting with the context in which it could occur. I have come across many graph mining problems which run serially on single units of execution, and parallelizing them using a shared queue could be an interesting research area to look into. I completely agree with the notion that more complicated constructs can increase the error rate of the code. In the solution, the most important point is defining the shared ADT, since the rest of the process depends on it (a minimal sketch of such an ADT follows below). Again, as Prof. Johnson mentioned in today's class, this paper has a lot of code and seems to be better written than some of the other patterns that we read earlier. It will be interesting to read more about the choice of nested locks here. The second paper is about speculation, which seemed to me to be the most interesting of the three patterns. The commit/recovery mechanism, which depends on the status of a predicate, seems to be the most complicated of the tasks mentioned in the paper. I would love to see more elaboration on recovery, especially on the more intelligent recovery mechanisms mentioned in the paper. I searched for more information and examples regarding this pattern but I couldn't find that many.
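Going back to the shared queue for a moment, here is a minimal monitor-style sketch of the ADT idea; it is my own illustration, not the code from the pattern paper:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// A minimal monitor-style shared queue ADT. Producers call put(), consumers
// call take(), and the intrinsic lock serializes access to the deque.
public class SharedQueue<T> {
    private final Deque<T> items = new ArrayDeque<>();

    public synchronized void put(T item) {
        items.addLast(item);
        notifyAll();                 // wake up any consumer waiting on an empty queue
    }

    public synchronized T take() throws InterruptedException {
        while (items.isEmpty()) {    // guard against spurious wakeups
            wait();
        }
        return items.removeFirst();
    }
}
```

In practice the java.util.concurrent BlockingQueue implementations (LinkedBlockingQueue, for example) already provide this behaviour with finer-grained locking.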

The last paper is about the digital circuits pattern, and it seemed to be the one requiring the most introspection among the three. Among the examples given, I could relate most to the database example because I have been working on this, though not on the scale mentioned in the paper. For most bit-level operations that I have encountered in C, I have had to use the remaining bits in a word intelligently so as to not waste memory. Could we say that bit-level parallelism is an 'embarrassingly parallel' problem? Among the examples presented in the paper, I found the description of finding the nth Gray code to be very interesting; fitting the data entirely within a word is the key aspect there (a one-line sketch follows below). Overall, I think the three patterns in this set are some of the most introspective that I have read this entire semester.
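For reference, the standard formula for the nth reflected binary Gray code really is a single word-level bit operation, which is exactly why keeping the data inside one word matters:

```java
// nth reflected binary Gray code: XOR each bit with the next higher bit.
public class GrayCode {
    static long gray(long n) {
        return n ^ (n >>> 1);
    }

    public static void main(String[] args) {
        for (long n = 0; n < 8; n++) {
            System.out.println(n + " -> " + Long.toBinaryString(gray(n)));
        }
    }
}
```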

Thursday, November 5, 2009

Chp 6

This chapter discusses building applications, specifically the implementation of supervisors in OTP. Amongst the five behaviors mentioned in the chapter, I think the supervisor behavior is the most important. The hierarchical manner in which the system is built makes it easier to develop new releases. The supervision of worker nodes by the supervisor holds some parallels with other parallel programming languages that I have worked with. The generic server API and the server example presented later in the chapter make it easier for us to understand the initial process, especially how exactly the arguments and function calls are used to implement the gen_server. The event manager principles are interesting, and I especially liked the description of how the event manager is like a generalized finite state machine. The application developed to illustrate the finite state machine API is useful for understanding the uses of the packet assembler. The application API has been described in a clear and concise fashion, and the example serves to help us understand the primitive behaviors. From the discussion it is clear that systems built using OTP behaviors have a systematic structure. Overall this chapter was a very interesting read, and gave insights into how such systems can be put into practice.

Part 1 - Pipelining and parallelism

This post is about data parallelism, pipelining and geometric decomposition. First, pipelining. One of the interesting aspects is the throughput of the data processed by the pipeline. A real-world analogy comes from my personal experience working on video processing: to render a 3D video from multiple stereoscopic cameras, the output is obtained smoothly only if the video streams from each camera are combined to achieve the required frame rate. Although that was not strictly a parallel programming effort, I can see some similarities here. Another interesting aspect of this paper is the point made about the slowest computation creating a bottleneck in the entire concurrent pipeline (illustrated in the sketch below). Plugging objects or frameworks into the pipeline is another interesting aspect mentioned in the paper. The second part is about data parallelism. I came across this article which talks about data parallelism; sometimes this method of processing data utilizes the capacity of a multi-core machine to the maximum extent. At the end of the post, they talk about a real-life implementation of this on an advanced German nuclear fusion platform, the ASDEX tokamak. The last part is about the Geometric Decomposition pattern. I feel that one of the most important elements of the solution is the data decomposition part, especially the granularity of the decomposition. Another important aspect is the need to balance decomposition against code reuse, especially in larger systems. Though I have actually worked with this kind of pattern, I would love to hear about more instances of it through other real-world examples.
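To make the pipelining discussion concrete, here is a tiny two-stage sketch of my own (a hypothetical "decode" stage feeding a "render" stage), not an example taken from the pattern text. The bounded queue between the stages provides back-pressure, and throughput is set by the slower stage:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// A minimal two-stage pipeline: stage 1 "decodes" frames, stage 2 "renders" them.
public class Pipeline {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> decoded = new ArrayBlockingQueue<>(4);

        Thread decoder = new Thread(() -> {
            try {
                for (int frame = 0; frame < 10; frame++) {
                    decoded.put("frame-" + frame);   // blocks if the renderer falls behind
                }
                decoded.put("EOF");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread renderer = new Thread(() -> {
            try {
                String f;
                while (!(f = decoded.take()).equals("EOF")) {
                    System.out.println("rendered " + f);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        decoder.start();
        renderer.start();
        decoder.join();
        renderer.join();
    }
}
```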

Thursday, October 22, 2009

Chp 5

To add to the definitions of a fault-tolerant system (as mentioned in the introduction of this chapter), I was reminded of a research paper I came across a few days back which talked about all software faults resulting from design errors, and of the changing paradigm presented in this paper. I like the intuitive way the hierarchy of tasks has been organized and described, and in many ways it parallels most computing problems (or even problems we come across in our daily life). The distinction between errors, exceptions etc. has been described very well, and I am somehow reminded of the programming model of Java (which I am hugely familiar with), which is different from that of Erlang. The division into supervisor and worker processes seems similar to concepts in distributed computing where the goal is to divide a task among n processors. Overall, this chapter has been written in a simple and concise manner, and among other things, the description of well-behaved functions stands out in this work.

Summary of Part 1

This post gives a review of the dense linear algebra, Monte Carlo and graph algorithms patterns. The first thing to talk about is the dense linear algebra pattern. The problem of maximizing performance while balancing data movement is an interesting one, and I am sure it is similar to problems encountered in many different areas of Computer Science. Being a student with a research interest in Data Mining, I come across many of these problems frequently. One of the parallels drawn in my research is minimizing the generation of candidate sets (or avoiding them completely), and some of the best algorithms are the ones which are able to utilize the available computing power and deliver maximum performance without imposing too much overhead on the system. The question of an algorithm being implementation-dependent (or problem-dependent) is also an interesting aspect, and as such many algorithms must be able to generalize over a wider range of datasets. The paper is interesting as it gives an overview of how problems can be modeled as standard equations (or problems) in linear algebra and can be solved through the use of libraries.

The second paper talks about the graph algorithms pattern. I can relate to this because I encounter it frequently when solving graph-mining problems. The problem of modeling a data mining problem as a graph-mining problem utilizing graph algorithms is interesting, and one of the useful aspects (mentioned in the writeup) is determining theoretical bounds on the performance of different algorithms. For example, one interesting problem which I thought of (on reading the various examples mentioned) is determining dense clusters of friends in a social network graph. I also feel that ignoring the special properties of the graphs (in the context of the problem that we are studying) does not help us develop efficient algorithms which take advantage of those properties. The writeup gives a nice overview of recognizing different types of graph structures (and I think I will refer back to this page when I want to refresh concepts in graph algorithms :)). The last paper is about the Monte Carlo pattern. I first came across these methods as an undergraduate student when I was reading papers on simulated annealing and particle swarm optimization. I feel that Monte Carlo methods are very useful for solving problems where no known optimal solution exists.
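For readers who have not seen the Monte Carlo pattern before, the classic pi-estimation example captures the idea of sampling the search space instead of enumerating it. This is a standard toy illustration of my own, not one from the write-up:

```java
import java.util.concurrent.ThreadLocalRandom;

// Estimate pi by sampling random points in the unit square and counting
// how many fall inside the quarter circle.
public class MonteCarloPi {
    public static void main(String[] args) {
        long samples = 1_000_000, inside = 0;
        ThreadLocalRandom rnd = ThreadLocalRandom.current();
        for (long i = 0; i < samples; i++) {
            double x = rnd.nextDouble(), y = rnd.nextDouble();
            if (x * x + y * y <= 1.0) inside++;
        }
        System.out.println("pi ~= " + 4.0 * inside / samples);
    }
}
```

The samples are independent, which is also why the pattern parallelizes so naturally.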

Tuesday, October 20, 2009

Review of Map-Reduce

This writeup gives an overview of the Map-Reduce pattern and its various uses. It briefly covers the two phases involved in Map-Reduce: the computation (map) phase and the reduction phase.
The issues that occur while distributing the smaller computations to the worker nodes, as well as the inherent problems involved in reduction, are explained clearly. The MapReduce concept is easy to understand from the logical viewpoint of key/value pairs, where the input is taken in the form of data in one domain and the output is produced in a different domain. Other concepts of MapReduce, such as the distribution of tasks and the reliability of results, are equally important, and even though they are not covered in this writeup, they are worth mentioning. Of great importance is seeing the MapReduce implementation of the PageRank algorithm. Earlier, during the initial stages of MapReduce, there was a certain amount of backlash from the database community because it promoted concepts like brute force instead of indexing and did not rely on many of the DBMS tools that people have used over the years.
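The logical flow of key/value pairs between the two phases is easier to see in code than in prose. The sketch below is a single-machine toy in Java, not Google's implementation; the real framework distributes, shuffles and re-executes these steps across worker nodes:

```java
import java.util.*;

// Word counting as map + reduce: map emits (word, 1) pairs, reduce sums the
// values grouped by key.
public class WordCountSketch {
    public static void main(String[] args) {
        List<String> documents = Arrays.asList("the cat sat", "the cat ran");

        // Map phase: emit intermediate (word, 1) pairs, grouped by key.
        Map<String, List<Integer>> intermediate = new HashMap<>();
        for (String doc : documents) {
            for (String word : doc.split("\\s+")) {
                intermediate.computeIfAbsent(word, k -> new ArrayList<>()).add(1);
            }
        }

        // Reduce phase: sum the counts for each word.
        Map<String, Integer> counts = new HashMap<>();
        for (Map.Entry<String, List<Integer>> e : intermediate.entrySet()) {
            int sum = 0;
            for (int c : e.getValue()) sum += c;
            counts.put(e.getKey(), sum);
        }
        System.out.println(counts);   // e.g. {the=2, cat=2, sat=1, ran=1}
    }
}
```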

Thursday, October 15, 2009

Review of Chess

This paper presents a tool for finding Heisenbugs in concurrent programs. Systematic exploration of program behavior is one of the important features of Chess, enabling it to find bugs in a short time. One of the first things that strikes me about this paper is that the authors describe two real bugs that were found using Chess, and also how they were subsequently fixed. Another smart aspect of the tool is its control over thread execution. Even though the choice of algorithms for the search phase is a challenging problem, I think Chess handles this phase well.

The goal of replaying a given concurrent execution in an appropriate manner is described very well in this paper. By providing a kind of 'restart' through its cleanup functions, Chess ensures that every run of the program is handled well. There is also a subtle focus on leaving some choices to the user, especially in terms of delivering deterministic inputs, which is one of the interesting aspects of this work.

Tuesday, October 13, 2009

BA Chapter 14

This chapter gives a rather nostalgic overview of the basic architectural ideas behind the design of Smalltalk. The question of inheritance and its good/bad uses is addressed initially, and its support comes from the description of Smalltalk itself. Earlier, I came across a post describing how Smalltalk might make a comeback. The author of that post says that Smalltalk was ahead of its time, and now that OO concepts are firmly implanted in the heads of every beginning programmer, Smalltalk might be an option to consider. Many of the benefits espoused by languages such as Python and Ruby were around earlier in Smalltalk, and I think many people (who are unfamiliar with it) overlook this aspect. People who like concepts such as metaprogramming and dynamic typing will find that Smalltalk suits their needs. Then again, the regrowth in popularity of Smalltalk might be just a temporary fad, as many of the earlier "problems" have been solved by languages such as Java, by the growth and advancement of hardware, and by software which could deliver "executables". With the recent growth in web applications, I think Smalltalk frameworks such as Seaside should become popular as well.

Wednesday, October 7, 2009

Review of ReLooper

This paper describes the ways in which an Eclipse-based tool called ReLooper helps programmers parallelize their programs. It makes use of the ParallelArray data structure for arrays. The paper places emphasis on determining when parallelizing a program would be unsafe (thread-unsafe). A good deal of user interactivity is, I think, one of the important features of ReLooper. The evaluation for determining whether the refactored programs have conflicting memory accesses is useful. From the evaluation it seems clear that reporting all possible race conditions (i.e., not having false negatives) is one of the several benefits of this work. There also seems to be a good deal of speedup achieved (on both 1 and 2 cores) compared to the original code. The evaluation methodology, while by no means comprehensive, looks sufficient to show the inherent advantages of using ReLooper. I tend to disagree with some of the evaluation results on the time taken (especially for the machine learning algorithms); it is hard to compare in the absence of other techniques for comparison purposes. However, with all the inherent advantages of ReLooper, I do not see a lot of practical applications for it, as many real-life programs are much more complicated, and it remains to be seen how research would progress from this stage to the point where many different types of code could be refactored in a similar manner.
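To show the kind of loop ReLooper targets, here is an analogous before/after sketch. This is not ReLooper's actual output (the tool generates ParallelArray code); I am using java.util.stream purely for illustration, and the safety argument is the same: the iterations touch disjoint elements:

```java
import java.util.stream.IntStream;

// Before/after sketch of refactoring an independent-iteration loop.
public class LoopRefactorSketch {
    public static void main(String[] args) {
        double[] data = new double[1_000_000];

        // Sequential original: independent per-element updates.
        for (int i = 0; i < data.length; i++) {
            data[i] = Math.sqrt(i);
        }

        // Parallel version: safe only because iterations write disjoint elements.
        IntStream.range(0, data.length)
                 .parallel()
                 .forEach(i -> data[i] = Math.sqrt(i));

        System.out.println(data[9]);   // 3.0 either way
    }
}
```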

Tuesday, October 6, 2009

Software Architecture: OO Vs Func.

This chapter provides an interesting study of the benefits and disadvantages of functional programming versus object-oriented design. Before delving into the details presented in this paper, I would like to talk about a similar article that I came across a few days back. That article provided motivation to take up programming in Haskell for a programmer who is used to OO languages like C++, Java, C# and so on. The author makes the case for Haskell by pointing to immutable objects, higher-order functions and inclusion polymorphism, and to the fact that functional programmers are typically concerned with how data is constructed rather than what to do with the data (which is the concern of OO programmers). Another article along similar lines makes the case for the Haskell programming language.

One of the things that I would like to point out on reading the chapter is that the same metrics cannot be used for comparing the two different programming styles. The argument for considering modularity of code (as a criterion for comparing programming styles) is valid, especially while developing large software systems. Sticking functions together with "glue" is an interesting aspect, and even though the specifics are clearly mentioned, the details could have been treated a little more subjectively.
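As a small illustration of the "glue" idea (written in Java, since that is what I use most, rather than in the chapter's functional languages): small functions are composed with higher-order operations instead of being welded together inside one loop.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class GlueExample {
    public static void main(String[] args) {
        Function<Integer, Integer> square = x -> x * x;
        Function<Integer, Integer> addOne = x -> x + 1;
        Function<Integer, Integer> glued = square.andThen(addOne);   // compose, don't rewrite

        List<Integer> result = Arrays.asList(1, 2, 3).stream()
                                     .map(glued)
                                     .collect(Collectors.toList());
        System.out.println(result);   // [2, 5, 10]
    }
}
```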

Refactoring Sequential Java Code for Concurrency via Concurrent Libraries

This paper presents a way of restructuring sequential code into parallel code using concurrent utilities. It makes use of the java.util.concurrent framework in Java 5 and the ForkJoinTask framework in Java 7. I liked the argument that programming with locks is error-prone; though I believe this is a contentious topic, I tend to agree with the authors' comments on it. The frameworks mentioned in the paper address the research issues of usefulness, of making existing code thread-safe, and finally of efficiency. The evaluation methodology used for Concurrencer is fairly comprehensive, and I doubt that many questions could be raised about it.

In the implementation, I liked the concept of ConcurrentHashMap, which avoids locking the entire map. The number of ways in which refactoring is supported is pretty impressive and allows for parallelizing different kinds of sequential code patterns. Another important aspect of this paper is that the authors have accounted for human intervention while retrofitting parallelism into sequential code. In this way, I believe they have accounted for future changes in the code (especially if drastic changes have to be made). It is also interesting to note that among the three conversions studied in this paper, only the ConcurrentHashMap conversion requires a degree of human intervention.
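On the ConcurrentHashMap point, a small sketch of my own (not code from the Concurrencer paper) shows why it avoids a global lock: updates go through atomic per-key operations rather than synchronizing the whole map.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Thread-safe word-frequency counter without a global lock.
public class WordFrequency {
    private final ConcurrentMap<String, Integer> counts = new ConcurrentHashMap<>();

    // Safe to call from many threads at once; merge() updates one key atomically.
    public void record(String word) {
        counts.merge(word, 1, Integer::sum);
    }

    public int count(String word) {
        return counts.getOrDefault(word, 0);
    }
}
```

(merge() is the Java 8 spelling; in the Java 5 era one would loop on putIfAbsent/replace to get the same effect.)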

When the bazaar sets out to build cathedrals

In this chapter, the authors describe the inter-relationships of ThreadWeaver and Akonadi with the KDE project, and the development process in large-scale open-source projects. One of the first things that struck me, even before the authors described these two KDE projects, was the focus on developing and maintaining quality code within open-source projects. The authors hit the nail on the head when they allude to the fact that open-source code is mostly of a higher quality than proprietary code. This can also be seen from the huge success of community open-source development programs like the Google Summer of Code. The motivation of just "reaching the finish line", rather than money or fame, seems to work really well.

The Akonadi project by itself has several benefits, and one of the interesting things that I noticed in its design was the focus on maintaining the stability of the overall system by providing separate processes for components requiring access to a certain kind of storage backend. Linking components with third-party libraries without compromising the stability of the overall system is another aspect which seemed advantageous in the initial design of the Akonadi architecture. The authors mention a series of code optimizations which seemed potentially useful, and even though I have not done much research into whether these have been implemented, I am interested in knowing the outcome. Does anyone know more about these optimizations?

Tuesday, September 29, 2009

Java Fork/Join Framework

In this paper, the authors demonstrate the feasibility of developing a pure Java implementation of a scalable, parallel processing framework. It is expected that JDK 7 will include this framework in its functionality. Fork/join algorithms are similar to divide-and-conquer algorithms and involve a series of steps in which a goal is broken down into smaller components, the individual components are computed, and finally the results are merged.
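The canonical fork/join shape looks like the sketch below: split the range, fork one half, compute the other, then join and merge. This is a minimal example of my own using the API as it eventually shipped in the JDK; the library described in the paper differs in its details.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Parallel array sum via recursive splitting.
public class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 10_000;
    private final long[] data;
    private final int lo, hi;

    SumTask(long[] data, int lo, int hi) {
        this.data = data; this.lo = lo; this.hi = hi;
    }

    @Override
    protected Long compute() {
        if (hi - lo <= THRESHOLD) {              // small enough: compute directly
            long sum = 0;
            for (int i = lo; i < hi; i++) sum += data[i];
            return sum;
        }
        int mid = (lo + hi) >>> 1;
        SumTask left = new SumTask(data, lo, mid);
        SumTask right = new SumTask(data, mid, hi);
        left.fork();                             // run the left half asynchronously
        return right.compute() + left.join();    // compute right, then merge
    }

    public static void main(String[] args) {
        long[] data = new long[1_000_000];
        for (int i = 0; i < data.length; i++) data[i] = 1;
        long total = new ForkJoinPool().invoke(new SumTask(data, 0, data.length));
        System.out.println(total);               // 1000000
    }
}
```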

The experimental results mentioned in the paper regarding the performance of Java fork/join versus other similar techniques yield some observations which are a bit unclear to me, for example the relatively similar speedups in spite of the increase in the number of threads. Even though application-specific reasons are mentioned by the authors, it is likely that there are other fundamental reasons for this.
When this framework was announced for JDK 7, there was a great deal of excitement in the developer community. I am not sure if the framework still commands as much enthusiasm as it did earlier, especially now that the JDK early access downloads are out. Is the framework incorporated in this release? Can someone check this?

GNU/Emacs

Jim Blandy explains Emacs in a way which makes one feel as if we have been handed a multipurpose knife, given the range of components and functionality present. Although I have never been much of an Emacs user (I use vi instead), I understand the significance of the growth and development of Emacs over the years. The development of Emacs has (as the author mentions) followed a process similar to that of an operating system, and this has been very beneficial. With all the latest development tools around, and with the tremendous number of features they provide, Emacs still remains a popular choice amongst programmers.

The maintainability of a text-editor package is sometimes termed "editability", and this article shows the apparent advantages of Emacs in this area. I have personally preferred Eclipse as a programming IDE (simply because I started learning it very early). However, with all the benefits provided by Emacs (and the latest release, which looks very cool), I plan to switch allegiance very soon.

Our Pattern Language

One of the most prominent aspects of this paper is the stacked arrangement (in layers) of the language itself. It also gives a reasonably descriptive overview of each aspect of the language (and each layer), and I think it serves as a primer on the development of similar languages. By breaking the system down into five different layers, it gives readers a way to understand the language better. It will be interesting to see how well existing parallel programming software systems map onto the patterns specified by this language.

The authors have clearly defined the roles played by different kinds of programmers, and from my own personal experience I find this to be a very accurate representation. Even though I am only now beginning to work on parallel software systems, I can relate to the roles played by parallel software developers, especially regarding the handling of concurrency issues and the load-balancing (between processors) aspect of parallel programming. I feel that application development (on existing software systems) is one of the main drivers of the growth of parallel programming, and the authors have given enough focus to this aspect in the paper. Overall, OPL caters to the needs of different types of programmers (not restricted to parallel programming) and also accounts for system changes, making it a somewhat flexible pattern language.

Metacircular virtual machines

This chapter is, I believe, the most detailed of the virtual machine descriptions that we have seen so far in the course. More than before, I am able to appreciate the practical benefits of using Java and its features like modularity, the availability of several user-contributed libraries, and the supporting tools. Reading about Jikes, a virtual machine written in Java and hosted in Java's own runtime environment, was an eye-opener in itself, and it is quite different from the other VMs we have encountered earlier in class. One of the features of Jikes which I liked is its extensibility. Among the several other benefits of Jikes mentioned in passing, the one which caught my eye was the creation of an entire operating system based on the Jikes virtual machine. I am not sure how feasible this is, and would love to know more about it. Also, beyond being an academic experiment, there do not seem to be many other advancements in Jikes (if I am not wrong).

I also came across this paper which describes the use of debuggers for meta-circular VMs. Since Jikes is written in Java and is platform-independent, the authors argue that hybrid debuggers, which make use of high-level platform-independent details along with access to platform-specific code, would be the ideal debuggers for such VMs. Overall, this was a nice read, though we could have skipped some of the implementation specifics and focused on the architecture as a whole.

Wednesday, September 23, 2009

JPC

Up until the development of JPC, there were no pure Java emulators for the PC available in the market. The feat achieved by the developers of JPC is incredible, considering the amazing number of changes the x86 PC has undergone over the past several years. I find the speed achieved by JPC all the more remarkable considering the many optimizations they have had to incorporate. The fact that JPC runs at such a fantastic speed in spite of having to boot up through the 16-bit real mode of the operating system is simply outstanding. I am not sure if the capability to boot a fully graphical Linux will be supported by JPC in the coming months, but I'd sure be amongst the first to test it when it does get released.

JPC makes an ideal platform for doing security research, considering the number of security features that are built in. Expect anti-virus experts to come up with solutions for more and more viruses, worms and other malicious attacks on a PC without the fear of losing their own machines. With the cross-platform capability built in (by virtue of running in the Java sandbox), I think JPC should become very popular very soon. I haven't actually checked out the built-in DOS games that are available along with the emulator. Maybe I should do that pretty soon. :)

The Adaptive Object-Model Architectural Style

AOM provides a novel way to develop dynamically configurable applications and software systems which emphasize flexibility and adaptability to changing requirements. In spite of the several advantages foreseen for such an architecture, there is also the question of being able to understand so many layers of abstraction, which I believe is a disadvantage. I studied some of the techniques related to AOM, such as agile development and meta-modeling, and I find AOM to be a pretty solid architectural style.

The concept of all properties being stored in the same way in an AOM, as compared to more traditional styles which have properties indexed in files and so on, was a question put to Joseph Yoder in an interview, and I could not quite follow how the performance of AOM is similar to that of the styles mentioned by the interviewer. Maybe someone could clarify this? In the same interview, there is an interesting discussion on managing the different versions of the object model.
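To make the "all properties are stored the same way" idea concrete, here is a stripped-down sketch: an entity holds a generic property map whose legal keys and types are described by metadata, rather than by fields compiled into a class. The names here are hypothetical and chosen just for illustration, not taken from the paper.

```java
import java.util.HashMap;
import java.util.Map;

// Metadata: which properties an entity type allows, and of what type.
class EntityType {
    final Map<String, Class<?>> propertyTypes = new HashMap<>();
}

// Instance: every property is stored uniformly in one map, checked against metadata.
class Entity {
    final EntityType type;
    final Map<String, Object> properties = new HashMap<>();

    Entity(EntityType type) { this.type = type; }

    void set(String name, Object value) {
        Class<?> expected = type.propertyTypes.get(name);
        if (expected == null || !expected.isInstance(value)) {
            throw new IllegalArgumentException("bad property: " + name);
        }
        properties.put(name, value);
    }
}
```

The interviewer's performance question presumably turns on this map lookup being slower, in principle, than a compiled field access.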

On a more tangential note, there is this blog by Jim Alateras, which talks about several practical applications of AOM that I did not know of earlier. He mentions openEHR, which develops open-source software in the domain of clinical implementation, healthcare education and so on. Do check it out.

Tuesday, September 22, 2009

Guardian: a fault-tolerant operating system

This chapter is a very nostalgic trip down memory lane, and has much more hardware-themed content than the earlier papers we have come across so far. It is interesting to note how Tandem handled the limitations of address space, though I am not clear as to why later versions of Tandem considered duplication of memory as an option. Also, now that we are on the subject of building reliable systems, it seems to me that all kinds of inefficient solutions for providing component dependability, such as freezing a CPU in cases of failure, had been under consideration. In a later paper by Lee et al., the causes of failure in Guardian are analysed in detail; the authors found that 77% of the failures were caused by software problems themselves. It looks to me as if the single-failure tolerance of the Guardian system is not actually beneficial. Another study found that memory management is the main source of software problems in Guardian. Guardian seems to have performed better in terms of the number of faults, as compared to the pre-existing machines of that time, mainly IBM and VAX. Given the detail to which this chapter analyses the strengths and weaknesses in the design of Guardian, I think it should still serve as a case study (in spite of the era in which it was developed) in the design of similar systems now.

Big Ball of Mud

One of the major themes in this paper is the fact that the big ball of mud pattern may not always be detrimental, at least during the initial stages of a piece of software's development. For a variety of reasons, such as lack of time or resources, it may indeed be the best way to go, and it is prevalent in many software designs. I like the "teachable moment" kind of advice given in the paper: that these systems must be studied in detail to understand what they accomplish, so that we can learn lessons and incorporate the changes in the design of new systems. Technology shifts, for this reason, are good drivers of change for such systems.

As a student programmer without much experience in developing really large software systems, I always find it comforting to know that the system I am developing my code on is well documented and clean (especially if I am starting work on it for the first time), rather than being in the presence of a big ball of mud. In this context, I read some of the classic design mistakes posted by Steve McConnell, and I could put what the authors are saying in this paper into some perspective. Among the forces that the authors claim drive the development of the big ball of mud, I find the cost factor a bit confusing. The question of a quick-and-dirty project versus a well-planned, expensively designed one is something I can never really resolve unless I am involved in a larger, more-at-stake kind of work environment. I feel that for small companies the idea of a quick-market-entry, quick-exit strategy seems more appealing as a way to cut costs, stay afloat and maybe even rake in some initial profit. In that scenario, this idea is not too bad.

Thursday, September 17, 2009

Xen and the Beauty of Virtualization

Xen is an open-source software project that provides high-performance virtualization, allowing multiple operating systems to run concurrently on a single physical computer. Part of the reason for Xen's popularity is the number of different operating systems it supports and the wide variety of applications that can be run on it. Some of the benefits of Xen which I feel are important are its support for application mobility, its security and privacy features, and so on. Among the most useful features in version 3.0 are the support for hardware virtual machines and the reuse of other open-source projects such as the emulated BIOS.

I am not much of a virtualization groupie, but I have used VMware before, and it was very useful when I wanted to switch between Windows and Linux applications and didn't want to run Cygwin on my lab machine. I haven't tried Xen, but I will do so now that I am very impressed by the inherent advantages it possesses, like the architecture built on the concept of distrust and so on. The second point is that Xen has benefited widely from its use of the Linux operating system (especially in its early days), and with support from major players in the market, it has only grown in stature.

Layered Architectures

I like the description of the layered architecture provided in this chapter, and especially the parallels drawn with networking protocol stacks. Amongst the benefits provided by this type of architecture, I feel that the most important one is the ability to shield higher layers from changes in lower layers, provided that only the lowest levels, typically the hardware, are subject to change (a toy sketch of this shielding idea follows at the end of this post). This type of architecture is well suited to a stable networking system. As Steven says, this design pattern provides a way to decompose large systems into collaborating objects and to consolidate segmented interfaces.
Reuse of functionality is also one of the benefits which I feel is important and representative of this architecture. However, the choice of implementing this architectural pattern is entirely up to the design team; it can be easily explained to them, and it is also easy to demonstrate how each layer fits into the overall picture. In theory all of this seems very good, but it would require some effort on the part of the organization to get it right the first time. To quote Eric Evans, "If an unsophisticated team with a simple project decides to try a model-driven design with layered architecture, it will face a difficult learning curve". In a related paper, a study of the usage of layered architectures in industry was conducted, and the authors concluded that any number of layering diagrams were possible for a particular software architecture; they also argue that none of the architectures under study made use of multiple layering criteria in their design. Overall, I think the status and meaning of the term "layered architecture" has undergone a lot of changes since the days of the OSI 7-layer model and is still evolving rapidly today.
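Here is the shielding sketch mentioned above: the higher layer depends only on an interface, so the lowest layer's implementation can change without the change rippling upward. All interface and class names here are hypothetical, for illustration only.

```java
// Boundary between layers.
interface Storage {
    String read(String key);
}

// Lowest layer: free to change (file, database, network...) without affecting callers.
class FileStorage implements Storage {
    public String read(String key) { return "value-for-" + key; }
}

// Higher layer: sees only the Storage interface, never the concrete class.
class UserService {
    private final Storage storage;
    UserService(Storage storage) { this.storage = storage; }

    String describeUser(String id) {
        return "user " + id + ": " + storage.read(id);
    }
}

public class LayersDemo {
    public static void main(String[] args) {
        UserService service = new UserService(new FileStorage());
        System.out.println(service.describeUser("42"));
    }
}
```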

Tuesday, September 15, 2009

Data Grows Up: Facebook

In the context of this chapter, I was reading an article by Facebook VP of Technical Operations Jonathan Heiliger on how to manage the growth of users and data while rolling out features on a regular basis. He mentions that for any organization to be successful, it needs to embrace change rapidly. He also mentions that, unlike other organizations, Facebook does not have QA; they simply have a "cradle-to-grave" lifecycle for their code. Coming back to the chapter, I would like to comment on the application platform and the churn cycle of applications developed for Facebook. In a recent presentation, it was mentioned that nine of the top fifteen applications for Facebook are new. But on the upside, there is a decline in the churn trends for the top few spots due to Facebook's efforts to control application spam.

Statistically, the Facebook API has been extremely popular, with nearly 12000 applications produced since its launch. I think the main reasons for its success are its openness (in deciding what customers want), its targeted audience, the lure of potential riches for developers who can decide how to turn their application into a good business model without much interference, its many features which drive "viral" growth, and so on. In spite of the privacy concerns (which are anyway present in most social networks), the API has proven itself to be a strong driver of the growth of Facebook over the last year. Among the several Facebook features mentioned in this chapter, I would like to single out FQL for the advantages it possesses: reducing response size, providing a common syntax for all methods, and condensing Facebook queries.

Thursday, September 10, 2009

Resource-Oriented Architectures

This chapter discusses resource-oriented architectures as an effective way to wrap reused code, services and so on with named interfaces which help prevent the leaking of implementation details. Among the innovations proposed in the REST style is the way requests are processed. Earlier, requests were processed by putting them into a structured package which could be forwarded to and processed by participants in a workflow; this was like a contractual obligation, which contrasts sharply with the REST style. One important aspect of resource-oriented architectures is their focus on the states of a data model. Among the several features mentioned, the idea of logically separating concerns such as the nouns, verbs and representations appealed to me a lot (a tiny sketch of this separation follows below). In addition to the characteristics mentioned, aspects like keeping multiple copies of the resource data, using publish/subscribe messaging systems and so on deserve more attention as well. Among the other benefits of using ROA, there are several things which I liked, the major ones being the ability to extend by accounting for new representations and URIs, the reliance on widespread web standards, and so on. While browsing for other resources, I came across this interesting presentation which described a board game (Chess) remade using ROA. I do not think this is novel, but the presentation is excellent. Do check it out.
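Here is the noun/verb/representation separation in miniature: the URI names the resource (noun), the HTTP method is the verb, and the Accept header asks for a particular representation. The URL below is a placeholder of my own, not a service from the chapter.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class ResourceClient {
    public static void main(String[] args) throws Exception {
        URL resource = new URL("http://example.com/games/42");       // noun
        HttpURLConnection conn = (HttpURLConnection) resource.openConnection();
        conn.setRequestMethod("GET");                                // verb
        conn.setRequestProperty("Accept", "application/xml");        // representation

        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```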

Tuesday, September 8, 2009

ArchJava

This paper describes ArchJava, a backwards-compatible extension to Java for integrating software architecture specifications into Java implementation code. Communication integrity and the seamless integration of architecture and implementation are explained to be the main contributions of this work. I feel that one of the most important aspects of this work is the ability to ensure that architecture and code are kept consistent as they evolve. A reasonably good attempt at program understanding has been made in this paper, even with all the limitations specified.

The paper considers the length of code to be a good indicator of the complexity of the underlying program, and even though there could be other factors as well, the methodology adopted to evaluate the effectiveness of ArchJava is sound. As I am unfamiliar with this topic, I cannot comment on whether ArchJava has been successfully applied to larger pieces of code. Among the advantages, I feel the most important is the ability to explicitly list the method-call communication between components, which makes the architecture complete. On the other end of the scale, even though the task of rewriting a program to make use of ArchJava is not shown to be complicated for the system that the authors considered, this cannot be generalized everywhere. I feel that rather than working across a wide variety of programs, it is too restrictive and focuses on projects that adhere to a standard set of specifications.

Making memories

Before anything else, I would like to say that I was lost in the way the discussion segued from the workflow of LPS to the author's frustration with the directory structure of different deployment scenarios. In the same vein, it is not clear why the development had to be tied to agile software development. Even though I am not an expert with Spring, there is a general consensus that Spring is useful when patterns are present: if a particular problem fits a pattern then we can use it, otherwise it is not advisable to look for ways to apply every possible pattern. To comment on the overall theme of this chapter, I like the way the entire description unfolds from a discussion of the fundamental forces governing the Creation Center architecture, such as the business and its context.

I also like the philosophy of "Fail Fast, Fail Loudly" that has been applied in the render engine. In the UI design, I think the policy of separating the visual appearance of a screen from the logical manipulation of its properties was a good implementation decision. In addition, the ability of forms to capture not only type-value relationships but also metadata is an important factor contributing to the clean model of the user interface. I also agree with the author's assessment that the property-binding architecture was one of the significant aspects of the work. Finally, I believe that the selling point of the experience was the ability to "sell" the software to studio executives with much of the underlying complexity, or "plumbing" as the author puts it, removed.

4+1 Views

This is a unique paper which presents a way to describe the architecture of software-intensive systems. One of the important aspects of this work is its use on several large projects, and the ability it gives various stakeholders, such as system architects, to look at a system from different perspectives. The focus on addressing large projects, where a single blueprint may not suffice, is a selling point of this paper. Given the seminal nature of this work and its application over a long period of time, I can dig out practically no demerits in it.

I did feel that the example for the logical view in Figure 3a, showing the logical blueprint of the Telic PABX, was a bit unclear. On the other hand, the process view seems very well suited to taking into account the non-functional requirements, which are an important aspect of software-intensive systems. A systematic understanding of the system is obtained by intentionally suppressing or displaying specific information in each view. Practical aspects of developing software-intensive systems are also taken into account using the collaboration diagrams, and overall I really have nothing but a deep sense of admiration and praise for this work.

Architecting for Scale

This chapter of the book describes the experience of developing Project DarkStar as a response to the changing needs and requirements of massively multiplayer games in terms of scaling and performance. I like the way the paradigm shift from programming for standalone PCs to programming for a large number of servers and clients has been explained in a retrospective kind of way. Among the various services provided in the DarkStar protocol stack, I liked the way the communication services have been designed and developed. By not revealing the actual endpoints of the communication between the client and the server (a practice common in many peer-to-peer systems), the issues of scalability and of changing system components are handled well.

The Data Service has also been defined well, and the requirements of the Data Store, though different from those of a regular database, are clearly stated. Moving on to the question of testing the performance of such a system, I think the paper "Marios Assiotis, Velin Tzanov: A distributed architecture for MMORPG. Netgames '06: page 4" describes one method. Overall, the chapter is a good example of a system which allows the development of distributed, threaded games while providing a simple programming model.