Plat_Forms 2012 rev. 2 announcement
"Plat_Forms" is a competition in which top-class teams of three programmers compete to implement the same requirements for a web-based system within two days, each team using a different technology platform (say, Java EE, .NET, PHP, Perl, Python or Ruby). The results will provide new insights into the real (rather than purported) pros, cons, and emergent properties of each platform. The evaluation will analyze many aspects of each solution, both external (usability, functionality, reliability, performance, etc.) and internal (structure, understandability, flexibility, etc.).
- How the contest will proceed
- Do we really need another contest?
- How to apply
- The task
- Rules of behavior
- Semi-public review
- Results hand-over
- Explaining what you did
- Evaluation and "winning"
- What's in for the teams and their home organizations?
Software development platforms for web applications (such as Java EE, .NET, PHP, Perl, Python, or Ruby) are a critical factor of development productivity today. The pros and cons of the various platforms are by and large known in principle, but how the pros trade off against the cons within one platform, and how that compares to another platform, is the topic of quasi-religious wars only, not a subject of objective analysis, because almost no data is available that allows such a direct comparison. Also, the importance of web application development frameworks has increased in recent years, so that it is no longer appropriate to treat the programming language as the sole platform discriminator.
Plat_Forms is a contest that will change this. It will have top-class (and hence comparable) teams of 3 programmers implement the same specification of a web-based application under the same circumstances and thus generate a basis for objective comparison of the various characteristics that the platforms generate.
In just 2 days, the teams will implement as much of the requested functionality as they can and at the same time optimize the usefulness of the resulting system (functionality, usability, reliability, etc.), the understandability of the code, the modifiability of the system design, and the system's efficiency and scalability.
The contest will be conducted on October 9-10, 2012. At the end of the 2 days, the teams hand over their source code and a turnkey-runnable VMware configuration of their system.
These items will then be subject to a thorough evaluation according to scientific standards with respect to the criteria mentioned above. As most of the results cannot be quantified and many cannot even be ranked in a neutral fashion, there will be no one-dimensional result ranking of the systems. Rather, there will be an extensive report describing the findings. Depending on the results, the organizers may or may not declare one or a few of the systems and teams winners with respect to one particular criterion (and do so for some or all of the criteria).
- Fri 2012-09-07 / Fri 2012-09-14: Teams apply for participation in the contest as described under "How to apply" below.
- Fri 2012-09-14: Teams are notified whether they will be admitted to the contest. At most four teams per platform will be admitted and at most six platforms; up to 24 teams overall.
- Mon 2012-10-08: Teams set up their development environments at the contest site. For details, see "Infrastructure" below.
- 2012-10-09, 9:00: The contest starts. The organizers will explain (in a presentation format) the requirements of the system to be developed, will hand out a short document containing a few more details, and will answer any immediate questions that may arise. For details, see "The task" below.
- 2012-10-09, 10:00: The teams start developing the software using their favorite platform and tools. Reusing existing software and reading the Web is allowed; getting external help is not. For details, see "Rules of behavior" below. The teams are asked to make intermediate versions accessible for user feedback; see "Semi-public preview" below.
- 2012-10-10, 18:00: The teams stop developing software and hand over their result in the form of a VMware image of a server machine. For details, see "Results hand-over" below. Teams that believe they have reached the best cost-benefit ratio of their development before the allotted time is over are allowed to hand over their results earlier and will have a shorter work time recorded.
- 2012-10-12, 16:00 (that is, 15:00 UTC): The teams submit post-hoc design documentation. For details, see "Explaining what you did" below.
- 2012-10-11: Evaluation of the systems starts. It will investigate all categories of quality criteria, both internal and external. For details, see "Evaluation and winning" below.
- 2012-06: Results of the contest will be presented. The details of when, where, and how are still to be determined. See "What's in for the teams and their organizations?" for what is known already.
Do we really need another contest? Absolutely. But it is not "another" contest; it is the only one of its kind.
Every year, several hundred million dollars are spent for building the type of application mentioned above, yet nobody can be quite sure in which cases which platform or technology is the best choice. Quasi-religious wars prevail.
Some platforms are often claimed to yield better performance than others, but nobody can be quite sure how big the difference actually is.
Some platforms are often claimed to yield higher productivity in initial development than others, but nobody can be quite sure how big the difference actually is.
Some platforms are often claimed to yield better modifiability during maintenance than others, but nobody can be quite sure how big the difference actually is -- or if it really exists at all.
So as a program manager, one can almost consider oneself lucky if a company standard prescribes one platform (or if expertise is available for only one) so that the difficult choice need not be made.
However, that means that many (if not most) projects may use a sub-optimal platform -- which hardly sounds acceptable for an industry that claims to be based on hard knowledge and provable facts.
What we need is a direct comparison of the platforms under realistic constraints: a task that is not trivial, constrained development time, and the need to balance all of the various quality attributes in a sensible way.
So if such a comparison is so important, why is nobody else doing it?
Because it is difficult. To do it, you need:
- participant teams rather than individuals, or else the setting will not be realistic;
- top-class participants, or else you will compare them rather than the platforms;
- a development task that is reasonably typical, or else the result may not generalize;
- a development task that is not too typical, or else you will merely measure which of the participants happened to have a well-fitting previous implementation at hand;
- participant teams that take the challenge of implementing something on the spot that they do not know in advance;
- the infrastructure and staff to accommodate and supervise a significant number of such teams at once;
- an evaluation team that is capable of handling a heterogeneous set of technologies (which is a nightmare);
- an evaluation team that dares to compare these fairly different technologies in a sensible yet neutral way.
For these reasons, all previous platform comparisons were very restricted. Several, such as the c't Database Contest or the SPEC WEB2005, concentrate on one quality dimension only (typically performance) and also provide participants with unlimited time for preparing their submission. Others, such as the language comparison study by Prechelt, are broader in what they look at and may even consider development time, but use tasks too small to be relevant.
In 2007 we carried out a first instance of the Plat_Forms contest which was a huge success. Details on the results, including the complete technical report, can be looked up on the Plat_Forms 2007 results page. The 2011 execution of Plat_Forms went from three platforms to six, added a number of evaluation dimensions, and otherwise strove to corroborate the previous findings. As the technology landscape is moving all the time, some findings also changed, especially the notion of what constitutes a platform. See the Plat_Forms 2011 results page for details.
In 2012 we want to focus on the complete development lifecycle of a small web application. Plat_Forms 2012 will not only be about finishing as much functionality as possible in the given time frame but also about good software craftsmanship, quality assurance, and proper deployment. In our analysis we will also group platforms not only by the programming language used but also by the similarity of the technology stack employed.
At most four teams per platform will be admitted to the contest. It is not the purpose of the contest to compare the competence of the teams; we will therefore strive to get the best possible teams for each platform to make it more likely that significant differences observed in the final systems can be attributed to the technology rather than the people.
Teams interested in participating please apply by filling out our Application Form. Multiple teams from the same organization are allowed if (and only if) they apply for different platforms.
Teams must agree that the result of their work (but not frameworks etc. that they bring along) will be released under the GNU General Public License, version 2 (GPL), the Apache License 2.0 (AL), or the modified BSD license.
From among the applications, teams will be selected so as to maximize their expected performance. So please include in your application information that is useful for us for judging what performance we can expect from you -- but do not exaggerate or you may publicly embarrass yourself. The selection process is performed with the help of a contest committee for which we will invite an expert representative from each platform.
The following information is preliminary. We will provide an update with possible modifications of some details two weeks before the contest.
At the contest, each team will be provided with roughly the following infrastructure:
- electrical energy (235 V, 5 A max., separately fused, German-style Schuko socket)
- chairs and tables
- an internet connection, via an RJ45 connector serving Ethernet. The available bandwidth is not yet known (we hope for 10 Mbit/s overall); bandwidth management will probably be on a best-effort basis.
- sufficient food and drink
All teams will work in one single large room or two large rooms.
Things you need to bring yourself:
- computers, monitors, keyboards, mice,
- an optional server computer
  - must be operable in stand-alone mode, without network connection,
  - must host at least the turnkey configuration of your final system under VMware
  - alternatively we can provide you with a virtual machine in our existing infrastructure
- some medium (USB stick, DVD±R, etc.) on which to hand over your results (see "Results hand-over" below)
- network cables, network switch/hub,
- printer, printer paper,
- pens, markers, scissors, adhesive tape,
- perhaps desk lamp, pillow, inflatable armchair etc.
- coffee mug,
- backup coffee mug.
We will obviously not tell you right now what the development task will be. However, here are some considerations that guide our choice of task:
- It will be a web-based application with both a browser-based front end and (probably) a simple RESTful web service interface. The browser interface must be compatible with all major browsers (IE, Firefox, Opera, Safari). We will give you HTML page prototypes to start from, to save you some time and to standardize the solutions' look-and-feel somewhat.
- It will require persistent storage of data.
- It may require integration with external systems or data sources, but using only simple and standard kinds of mechanisms (such as HTTP/REST).
- It will neither be a highly standard type of application (say, a web shop or static-content management) nor something entirely exotic and unprecedented. The task will be chosen such as to make reuse of large portions of existing systems unlikely, but reuse of smaller pieces possible.
- In order to allow for a broad assessment of a platform's characteristics, the task will not be a create-read-update-delete database application only, but will involve other aspects as well. Examples might include algorithmic processing, data-driven graphics, audio handling, etc. The complexity of these requirements will be modest, so that they can be solved without specialist knowledge.
In your solution you should strive for a good balance of all quality attributes, including usability and robustness.
Going beyond previous installments of Plat_Forms, we want to focus on the complete development lifecycle of a small web application. Plat_Forms 2012 will not only be about finishing as much functionality as possible in the given time frame but also about good software craftsmanship, quality assurance, and proper deployment. All that will be taken into account when analysing the solutions.
In our analysis we will also group platforms not only by the programming language used but also by the similarity of the technology stack employed. Our results have shown that in some cases the similarity of two different web frameworks plays a bigger role than the programming language used.
During the contest you may:
- Use any language, tool, middleware, library, framework, and other software you find helpful (just please mention as many of these as you can foresee in your application).
- Reuse any piece of any pre-existing application or any other helpful information you have yourself or can find on the web yourself. Anything that already existed the day before the contest started is acceptable.
- Use any development process you deem useful.
- Ask the organizer (who is acting like a customer) any question you like regarding the requirements and priorities.
During the contest you may not:
- Disturb other teams in their work.
- Send contest-related email to people not on your team or transfer the requirements description (or parts thereof) to people not on your team.
- Have people from outside of your team help you or "reuse" work products from other teams. There are two exceptions to this rule: you may use answers of the customer and user-level preview feedback as described below.
During the contest, teams will be able to obtain feedback from the internet public if they wish to do so. For this purpose, the team should open their test system on their VMware team server for public access.
The organizers will put up a blog where the teams can announce their release plan (if any), releases, and access URLs, and where anybody can comment on the prototype systems regarding functionality, defects, usability etc. The teams are allowed to use this user-level feedback for improving their system. They are not allowed to take or use code-level information.
The technology used for building the systems in the contest will be very heterogeneous. It would therefore be impractical for the contest organizers to try to execute them from source code alone, not to speak of obtaining similar behavior in a performance test.
We thus require each team to deploy their solution as follows:
- It absolutely must be a virtual machine that can be imported into VMware vSphere without conversion (see www.vmware.com). If you want, we can provide you with a virtual machine in our existing infrastructure.
- The virtual network card should be of Intel E1000 type.
- The virtual machine must be configured to use DHCP for acquiring its IP address.
- Its virtual hard disk must be set to grow when required (thin provisioning, as opposed to acquiring the full virtual disk space upon creation) and must not exceed 30 GB.
- We also need the username and password of the administrative user for your machine, as well as the type (including the distribution name when running Linux) and version of the operating system the virtual machine is running.
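Teams may find it handy to check these requirements mechanically before hand-over. The following is only a minimal sketch in Python: `VmSpec` and `check_vm_spec` are hypothetical names introduced for illustration, the values are typed in by hand, and nothing here reads an actual VMware configuration.

```python
from dataclasses import dataclass
from typing import List

MAX_DISK_GB = 30  # disk size limit taken from the requirements above


@dataclass
class VmSpec:
    """Hypothetical hand-written summary of a VM; not a VMware data structure."""
    nic_type: str            # e.g. "e1000"
    uses_dhcp: bool
    thin_provisioned: bool
    max_disk_gb: float
    admin_user: str
    admin_password: str
    os_name: str             # include the distribution name when running Linux
    os_version: str


def check_vm_spec(spec: VmSpec) -> List[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if spec.nic_type.strip().lower() != "e1000":
        problems.append("virtual network card should be of Intel E1000 type")
    if not spec.uses_dhcp:
        problems.append("machine must acquire its IP address via DHCP")
    if not spec.thin_provisioned:
        problems.append("virtual disk must grow on demand (thin provisioning)")
    if spec.max_disk_gb > MAX_DISK_GB:
        problems.append(f"virtual disk must not exceed {MAX_DISK_GB} GB")
    if not (spec.admin_user and spec.admin_password):
        problems.append("administrative username and password are required")
    if not (spec.os_name and spec.os_version):
        problems.append("operating system type and version are required")
    return problems


if __name__ == "__main__":
    # Example values only; replace them with the settings of your own machine.
    spec = VmSpec("e1000", True, True, 20, "admin", "example-password",
                  "Debian GNU/Linux", "6.0")
    print(check_vm_spec(spec))  # prints []
```

Returning a list of messages rather than raising on the first failure lets a team see all open issues at once.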
The image file of this virtual server will be handed over at the end of the contest by means of some medium, preferably a single DVD-R (a USB stick with a FAT file system may be fine, too) created by the respective team themselves. In the case of a DVD-R, this means the image must be smaller than 4.7 GB, which should easily be possible, because the virtual server does not need to have any application software installed beyond your contest solution and the infrastructure software that it uses.
Beyond the image file, the medium needs to contain a second file that is an archive (zip or tar.gz) containing a snapshot of all source artifacts (source code, build files, database initialization scripts, configuration files, CSS files, etc.) that are part of the solution. The contents of this archive must be sufficient in principle to recreate your solution from scratch, given the infrastructure software (such as operating system, build tools, DBMS, application server, etc.). Furthermore, a third file must contain your whole source code version archive so the organizers can analyze some aspects of the development process.
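As an illustration of the second file, here is a minimal sketch that packs a source directory into a tar.gz archive using Python's standard `tarfile` module. The function name `make_source_archive` and the example paths are assumptions made for this sketch; the announcement does not prescribe any particular tool.

```python
import tarfile
from pathlib import Path
from typing import List


def make_source_archive(source_dir: str, archive_path: str) -> List[str]:
    """Pack source_dir into a gzip-compressed tar file and return its member names.

    The archive should be written outside source_dir so that it does not
    end up containing itself.
    """
    src = Path(source_dir).resolve()
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the top-level folder.
        tar.add(str(src), arcname=src.name)
    # Re-open the archive to list what was actually stored.
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()


if __name__ == "__main__":
    # Hypothetical paths; adjust them to your own project layout.
    for name in make_source_archive("solution", "solution-sources.tar.gz"):
        print(name)
```

Listing the members after writing is a cheap way to confirm that build files, database scripts, and configuration files really made it into the snapshot.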
At the time of the result handover, the teams will also send a cryptographic fingerprint of the image file and of the archive files to the organizers by email, so that a replacement medium can be accepted should the original medium fail to be readable (please keep your original image file around!).
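The announcement does not say which fingerprint algorithm to use. The sketch below assumes SHA-256 and computes it with Python's standard `hashlib` module, reading the file in chunks so that a multi-gigabyte image does not have to fit in memory. The function name `file_fingerprint` and the file names in the example are hypothetical.

```python
import hashlib


def file_fingerprint(path: str, algorithm: str = "sha256",
                     chunk_size: int = 1024 * 1024) -> str:
    """Return the hex digest of the file at path (SHA-256 is an assumed default)."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Hypothetical file names; use the files you actually hand over.
    for name in ("team-server-image.ova", "solution-sources.tar.gz"):
        print(name, file_fingerprint(name))
```

If a replacement medium is ever needed, recomputing the digest of the replacement file and comparing it with the emailed value shows whether the two files are identical.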
Both source code and build/configuration/deployment of your system are fixed at server hand-over time. However, you will be able to prepare and submit a document afterwards that briefly explains the following points:
- the architecture of your system
- your approach to development (priorities, implementation orders etc.)
- the rationale of each important design decision you have identified
This document will be an important contribution towards a fair and thorough evaluation of your system, because without it the evaluation team will have a hard time judging many of the things it will get to see.
We may ask you to fill out this questionnaire during the contest.
We will attempt to evaluate all of the following aspects of the system and its development:
- External product characteristics: functionality, ease-of-use, resource usage, scalability, reliability, availability, security, robustness/error checking, etc.
- Internal product characteristics: structure, modularity, understandability, modifiability (against a number of fixed, pre-determined scenarios), etc.
- Development process characteristics: Progress over time, order and nature of priority decisions, techniques used, etc.
The details of this evaluation will be determined once we get to see the systems that you built. The evaluation will be performed by the research group of Professor Lutz Prechelt, Freie Universität Berlin.
We will not compare all systems in one single ranking by some silly universal grading scheme. Rather, we will describe and compare the systems according to each aspect individually and also analyze how the aspects appear to influence each other.
Therefore, there may be "winners" of the contest with respect to individual aspects (or small groups of related aspects) where we find salient differences between the platforms or the teams. However, there will not be a single overall winner of the contest.
So why should you participate in the contest if you cannot win it?
1. Category "riches and beauty": We will probably award (modest) monetary prizes, just not across platforms. However, we will nominate a best solution among the solutions on each individual platform.
2. Category "eternal fame": The detailed evaluation will provide the organizations of the well-performing teams and platforms with some of the most impressive marketing material one can think of: concrete, detailed, neutral, and believable.
- c't Database Contest. (German call for participation, English call for participation). Entscheidende Maßnahme: c't 13/06, (German description of results).
- SPEC WEB2005 benchmark.
- Lutz Prechelt. An empirical comparison of C, C++, Java, Perl, Python, Rexx, and Tcl for a search/string-processing program. Technical Report 2000-5, 34 pages, Universität Karlsruhe, Fakultät für Informatik, Germany, March 2000. (The detailed evaluation of the previous study mentioned above).
- Lutz Prechelt. An empirical comparison of seven programming languages. IEEE Computer 33(10):23-29, October 2000. (A short summary of the technical report above).