Room 101

A place to be (re)educated in Newspeak

Friday, November 28, 2014

A DSL with a View

In a previous post, I promised to explain how one might define UIs using an internal DSL. Using an internal DSL would allow us to capitalize on the full power of a general purpose programming language and avoid having to reinvent everything from if-statements to imports to inheritance.

Hopscotch, Newspeak's UI framework, employs such an internal DSL.  Hopscotch has been discussed in a paper and in a talk.  It will take more than one post to describe Hopscotch; here we will focus on its DSL, which is based on the notion of fragment combinators.

Fragments describe pieces of the UI; they may describe individual widgets, or views constructed from multiple pieces, each of which is in turn a fragment. A fragment combinator is then a method that produces a fragment, possibly from other fragments.

One of the simplest fragment combinators would be label:, which takes a string. The expression

label: 'Hello Brave New World' 

would be used to put up the string "Hello Brave New World" on the screen.  Other examples might be

button: 'Press me' action: [shrink]

which will display a button



that will call the method shrink when invoked.  The combinator button:action: takes two arguments - the first being a string that serves as the label of the button, and the second being a closure that defines the action taken when the button is pressed. Closures in Newspeak are delimited with square brackets, and need not provide a parameter list if no parameters are required.  This is the most lightweight syntax for literal functions you will find. Along with the method invocation syntax, where method names embed colons to indicate where arguments should be placed, this gives a very readable notation for many DSLs.

Further examples:

row: {
  button: 'Press me' action: [shrink]. 
  button: 'No, press me' action: [grow].
}

The row: combinator takes a tuple of fragments (tuples in Newspeak are delimited by curly braces, and their elements are separated by dots) as its argument and lays out the elements of the tuple horizontally:


the column: combinator is similar, except that it lays things out vertically

column: {
  button: 'Press me' action: [shrink]. 
  button: 'No, press me' action: [grow]
}

produces:



In mainstream syntax (Dart, in this case) the example could be written as

column(
  [button('Press me', ()=> shrink), 
   button('No, press me', () => grow)]
 )

The Newspeak syntax is remarkably readable though, and its advantage over mainstream notation becomes more pronounced as examples grow. Of course none of this works at all if your language doesn't support both closures and literal lists/arrays.

So far, this is very standard stuff, much like building a tree of views in most systems.  In most UI frameworks, we'd write something like

new Column([new Button('Press me', ()=> shrink), 
            new Button('No, press me', () => grow)]
    )

which is less readable and more verbose.  Since allocating an instance is more verbose than calling a method in most languages, the fact that fragment combinators are represented via methods, which act as factories for various kinds of views, helps make things more concise. It's a simple trick, but worth noting.

The advantage of thinking in terms of fragments becomes clearer once you consider  less obvious fragments such as draggable:subject:image:, which takes a fragment and allows it to be dragged and dropped. Along with the fragment, it takes a subject (what you might call a controller) and an image to use during the drag. Making drag-and-drop a combinator means everything is potentially draggable. Conventional designs would make this a special attribute of certain things only, losing the compositionality that combinators provide.

Presenters are a a specific kind of fragment that is especially important. Presenters provide user-defined views in the Model-View-Controller sense.  To define your own view, you subclass Presenter. Because presenters are fragments, any user defined view can be part of a predefined compound fragment like column: or draggable:subject:image:.

A presenter has a method definition which computes a fragment tree which is used to render the presenter. The fragment DSL we discussed is used inside of presenters. All the combinators are methods of Presenter, so they are inherited by any class implementing a view, and are therefore in scope inside a presenter class.  In particular, combinators are available in the definition method.

To see how all this works, imagine implementing the well known todoMVC example.  We'll define a subclass of Presenter called todoMVCPresenter  to represent the todoMVC UI.  The UI is supposed to present a list of todo items. It consists of a column with:


  1. A header in large text saying "todos" 
  2. An input  zone where new todos are added.
  3. A list of todos, which can be filtered using controls given in (4). 
  4. A footer, that is empty if there are no todos at all. It materializes as a set of controls once there are todos.

We can translate these requirements directly:

definition = (
     ^column: {
      (label: 'todos') hugeFont.
      inputZone.
      todoList.
      footer.
     }
)

More notes on syntax:  methods are defined by following their header with an equal sign and a body delimited by parentheses; ^ signifies return; method invocations that take no parameters list no arguments, e.g., inputZone, not inputZone(); chained method invocation does not require a dot - so it's 
(label: 'todos') hugeFont rather than label('todos').hugeFont.

We haven't yet specified what inputZone, todoList and footer do. They are all going to be defined as methods of todoMVCPresenter.  We can define the UI in such a top down fashion  because we are working with a language that supports procedural abstraction. You get it for free in an internal DSL.  

We can then define the pieces, such as

footer = (
 ^subject todos isEmpty 
 ifTrue: [nothing]
 ifFalse: [controls]
)

Here, we use conditionals to determine what view to produce, depending on the state of the application.  The application logic is embodied in the controller, subject, which we query for the todos list. The nothing combinator does exactly what it says; controls is a method we would have to define in todoMVCPresenter, detailing what should appear in the footer if it is visible.  Again, the code corresponds closely to the natural language description in bullet (4) above.

To elaborate todoList we'll need a loop or recursion or something of that nature; in fact, we'll use the higher order method collect:, which is Newspeak's version of map

todoList = (^list:[subject todos collect: [:todo | todo presenter]])

The list: combinator packages a list of fragments into a list view. We pass list: a closure that computes the list to todo items. 

Aside: We could have passed it the list itself, computed eagerly. Often, fragment combinators take either suitable fragment(s) a closure that would compute them.

To compute the fragment list we compute a presenter for each individual todo item by mapping over the original list of todos.
The closure we pass to collect: takes a single parameter, todo. Formal parameters to closures are introduced prefixed by a colon, and separated from the closure body by a vertical bar.

What are the odds that higher order functions (HOFs) were part of your external DSL? Even if they were, one would have to define a suite of useful  HOFs. One should factor the cost of defining useful libraries into any comparison of internal and external DSLs.

The Hopscotch DSL has other potential advantages. Because fragment combinators are methods, you can override them to customize their behavior.
We believe we can leverage this  to customize the appearance of things, a bit like CSS. To make this systematic, we expect to define whole groups of overrides in mixins. I'm not showing examples where Hopscotch is used this way because we have done very little in that space (and this post is already too long anyway). And we haven't spoken about the other advantages of Hopscotch.
such as its navigation model, lack of modality and very clean embodiment of MVC.

Ok, now it's time for the caveats.


  1. First and foremost,  Hopscotch currently lacks a good story for reactive binding.  In our example, that means you'd have to put explicit logic to refresh the display in some of the controls.  This makes things less declarative and harder to use. We always planned to solve that problem; I hope to address it in a later post.  But the high order bit is that we have code in a general purpose language that gives a very readable, declarative description of the UI. It corresponds directly to the natural language description of the requirements.
  2. Hopscotch lacks functionality in order to support richer UIs, but the design is naturally extensible: one adds more fragment combinators.
  3. We also want more ports, especially to mobile/touch platforms. However, Hopscotch has already proven quite portable: it runs on native Win32, on Squeak's Morphic and on HTML (the latter port is still partial, but that is just an issue of engineering resources).   More ports would help us deal with another controversial goal - defining a UI platform that works well across OS's and devices.


Regardless of the current engineering limitations, the point here is simply to show the advantages of a well-designed internal DSL for UI.  The lessons of Newspeak and Hopscotch apply to other languages and systems, albeit in an attenuated fashion.










Monday, September 29, 2014

A DOMain of Shadows

One of the advantages of an internal DSL  over an external one is that you can leverage the full power of a general purpose programming language. If you create an external DSL, you may need to reinvent a slew of mechanisms that a good general purpose language would have provided you: things like modularity, inheritance, control flow and procedural abstraction.

In practice, it is unlikely that the designer of the DSL has the resources or the expertise to reinvent and reimplement all these, so the DSL is likely to be somewhat lobotomized. It may lack the facilities above entirely, or it may have very restricted versions of some of them. These restricted versions are mere shadows of the real thing; you could say that the DSL designer has created a shadow world.
I discussed this phenomenon as part of a talk I gave at Onward in 2013. This post focuses on a small part of that talk.

Here are three examples that might not always be thought of as DSLs at all, but definitely introduce a shadow world.

Shadow World 1: The module system of Standard ML.

ML modules contain type definitions. To avoid the undecidable horrors of a type of types, ML is stratified.  There is the strata of values, which is essentially a sugared lambda calculus. Then there is the stratum of modules and types. Modules are called structures, and are just records of  values and types. They are really shadow records, because at this level, by design, you can no longer perform general purpose computation. Of course, being a statically typed language, one wants to describe the types of structures. ML defines signatures for this purpose. These are shadow record types. You cannot use them to describe the types of ordinary variables.

It turns out one still wants to abstract over structures, much as one would over ordinary values. This is necessary when one wants to define parameterized modules.  However, you can’t do that with ordinary functions. ML addresses this by introducing functors, which are shadow functions. Functors can take and return structures, typed as signatures. However, functors cannot take or return functors, nor can they be recursive, directly or indirectly (otherwise we’d back to the potentially non-terminating compiler the designers of ML were trying so hard to avoid in the first place).

This means that modules can never be mutually recursive, which is unfortunate since this turns out to be a primary requirement for modularity. It isn’t a coincidence that we use circuits for electrical systems and communication systems, to name two prominent examples.  

It also means that we can’t use the power of higher order functions to structure our modules. Given that the whole language is predicated on higher order functions as the main structuring device, this is oddly ironic.

There is a lot of published research on overcoming these limitations. There are papers about supporting restricted forms of mutual recursion among ML modules.  There are papers about allowing higher-order functors. There are papers about combining them. These papers are extremely ingenious and the people who wrote them are absolutely brilliant. But these papers are also mind-bogglingly complex.  

I believe it would be much better to simply treat modules as ordinary values. Then, either forego types as module elements entirely (as in Newspeak)  or live with the potential of an infinite loop in the compiler. As a practical matter, you can set a time or depth limit in the compiler rather than insist on decidability.  I see this as a pretty clear cut case for first class values rather than shadow worlds.

Shadow World 2: Polymer

Polymer is an emerging web standard that aims to bring a modicum of solace to those poor mistreated souls known as web programmers. In particular, it aims to allow them to use component based UIs in a standardized way.

In the Polymer world, one can follow a clean MVC style separation for views from controllers. The views are defined in HTML, while the controllers are defined in an actual programming language - typically Javascript, but one can also use Dart and there will no doubt be others. All this represents a big step forward for HTML, but it remains deeply unsatisfactory from a programming language viewpoint.

The thing is, you can’t really write arbitrary views in HTML. For example, maybe your view has to decide whether to show a UI element based on program logic or state. Hence you need a conditional construct. You may have heard of these: things like if statements or the ?: operator. So we have to add shadow conditionals.

<template if="{{usingForm}}">

is how you’d express  

if (usingForm) someComponent;

In a world where programmers cry havoc over having to type a semicolon, it’s interesting how people accept this. However, it isn’t the verbose, noisy syntax that is the main issue.

The conditional construct doesn’t come with an else of elsif clause, nor is their a switch or case. So if you have a series of alternatives such as

if (cond1) {ui1}
else if (cond2) {ui2}
        else {ui3}

You have to write


<template if = "{{cond1}}">
<ui1>
</template>
<template if = "{{cond2 && !cond1}}">
<ui2>
</template>
<template if = "{{cond3 && !cond2 && !cond3}"}>
<ui3>

</template>


A UI might have to display a varying number of elements, depending on the size of a data structure in the underlying program. Maybe it needs to repeat the display of a row in a database N times, depending on the amount of data. We use loops for this in real programming. So we now need shadow loops.


<template repeat = "{{task in current}}">

There’s also a for loop


<template repeat= "{{ foo, i in foos }}">

Of course one needs to access the underlying data from the controller or model, and so we need a way to reference variables. So we have shadow variables like

{{usingForm}} 

and shadow property access.

{{current.length}}

Given that we are building components, we need to use components built by others, and the conventional solution to this is imports. And so we add shadow imports.


<link rel = "import” href = "...">

UI components are a classic use case for inheritance, and polymer components support  can be derived from each other, starting with the predefined elements of the DOM, via shadow inheritance.  It is only a matter of time before someone realizes they would like to reuse properties from other components in different hierarchies via shadow mixins.

By now we’ve defined a whole shadow language, represented as a series of ad hoc constructions embedded in string-valued attributes of HTML.  A key strength of HTML is supposed to be ease-of-use for non-programmers (this is often described by the meaningless phrase declarative). Once you have added all this machinery, you’ve lost that alleged ease of use - but you don’t have a real programming language either. 

Shadow World 3: Imports

Imports themselves are a kind of shadow language even in a real programming language. Of course imports have other flaws, as I’ve discussed  here and here, but that is not my focus today. Whenever you have imports, you find demands for conditional imports, for an aliasing mechanism (import-as) for a form of iteration (wildcards).  All these mechanisms already exist in the underlying language and yet they are typically unavailable because imports are second-class constructs.

Beyond Criticism

It is very easy to criticize other people’s work. To quote Mark Twain:

I believe that the trade of critic, in literature, music, and the drama, is the most degraded of all trades, and that it has no real value 

So I had better offer some constructive alternative to these shadow languages. With respect to modularity, Newspeak is my answer. With respect to UI, something along the lines of the Hopscotch UI framework is how I’d like to tackle the problem. In that area, we still have significant work to do on data binding, which is one of the greatest strengths of polymer. In any case, I plan to devote a separate post to show how one can build an internal DSL for UI inside a clean programming language. 

The point of this post is to highlight the inherent cost of going the shadow route. Shadow worlds come in to being in various ways. One way is when we introduce second class constructs because we are reluctant to face up to the price of making something a real value. This is the case in the module and import scenarios above. Another way is when one defines an external DSL (as in the HTML/Polymer example). In all these cases, one will always find that the shadows are lacking. 


Let’s try and do better.

Friday, August 29, 2014

Taking it to Th’emacs

Emacs is my preferred text editor. I don’t use old-fashioned text editors as much as I used to, because I often need more specialized tools.  I use IDEs for various programming languages and other things when producing high quality documents. And yet, I often wish I could subsume these with a tool that had the basic goodness of emacs.

What makes emacs interesting all these decades after its inception is not what keyboard shortcuts it supports or what its basic editing functionality is. Rather, what matters are some of its underlying design principles.
  • Emacs has built-in scripting language. 
  • That language (elisp), while it has many flaws, is a flexible, dynamic, and rather general purpose language. 
  • All of the editor’s functionality is exposed via APIs written in the scripting language.
As a consequence, you can control everything emacs does programmatically. This makes emacs extensible in a way that is far deeper and more powerful than a plug-in architecture. A plug-in architecture must anticipate every desired extension to the system. Since it never can, it always disappoints in the end. Along the way it grows ever more complex and bloated in its futile attempt to foresee every possible need. With a language, you can code any extension you need.

If the scripting language is truly dynamic, and allows you to not only extend but also modify the running system, the possibilities are truly unlimited.

The points above are not limited to editors. They are fundamentals of system design.

There was a time when even Microsoft recognized this, making apps that could be programmed via VisualBasic. Sadly, they concocted a security nightmare, because malware can also control your application. Which is why being able to secure your application’s scripting language is critical as well. 

Last June, I spoke at QCon NY, and demonstrated a number of interesting web based systems that had some these properties (as well as a couple that did not, but were interesting for other reasons):
  • Leisure is a presentation manager that incorporates a lazy functional programming language. Modulo some person-years of engineering, it is to PowerPoint what emacs is to NotePad. 
  • The Lively Kernel could be thought of as a GUI builder scripted via Javascript; this an insult to Lively of course, because it is much more than that.  In Lively, the GUI is the GUI builder; every GUI you make is extensible and modifiable in the same way.
  • Lastly I showed Minibrowser, a prototype web based IDE for Newspeak. Like all IDEs in the Smalltalk tradition, it can be extended and modified from within itself.
We really need an Emacs for the modern age.  An editor, surely, but one that lets you edit rich text, images, audio and video. In fact, you should be able to embed arbitrary widgets. And of course it needs to be scriptable I just explained. So you might evaluate code that creates a UI element and inserts into the editor.  

Now you can make the editor modify its own GUI. In fact, the editor can be extended into a general purpose GUI builder just like Lively. And every such GUI can modify itself if you wish; sometimes you may wish to modify it so it can no longer modify itself, and then you have a frozen application. Your editor has become an IDE. In fact, it is a live literate programming environment

If the editor’s scripting language interoperates well with the surrounding environment, it can be used to control the computer and everything the computer itself controls. You can check in to the environment and hardly ever leave. You can lead your cyber life in it.: email, social media, live chats, streaming audio and video can all be incorporated. Moreover they can all be controlled and customized by you, the lucky user. 

To a degree, Lively is such an editor. It’s biggest drawbacks are a lack of polish due to lack of engineering resources and that its scripting language is Javascript.

Now, imagine that the editor was polished and robust. Even more importantly the code you created in this environment was modular and secure and written in an elegant and principled language. Imagine you could deploy the same code either on the web, or natively on both desktop and mobile. Imagine that the applications built with the language support online and offline use out of the box, automatically synchronizing data and code between clients and servers. Imagine that they have built-in support for collaboration, either syncing in real time or merging offline as required. 

Of course, it is the vision of such a language and platform that has always motivated the Newspeak project.  I have discussed many of these points before. In particular, I’ve talked about the weaknesses of traditional IDEs (see for example this post and this one), and the need for a platform that supports synchronization over the net (here and again here) for a long time. Yet the message bears repeating. 


Thursday, April 04, 2013

Making Methods Live


A few months ago, I suggested that IDEs should ensure that code is always “live” in the sense that it is associated with runtime data so that any part of the code can be immediately executed. I proposed that tightly integrating editors and debuggers would be reasonable way to pursue the idea.

I’ve put together a prototype, an extension of the Newspeak IDE.  I demonstrated an early version at the WGLD meeting in December. Since then there have been some improvements, though a lot of work remains to be done. Nevertheless, the system is already usable, at least by an experienced Newspeaker.

In Newspeak, one typically edits individual methods in a class browser, as opposed to monolithic files. The prototype modifies the method browsers to present the method along with a view of a live stack frame. This is essentially the same view one sees in the debugger, where a series of activations are available; each such view shows a single stack frame along with the corresponding code

I plan to show the latest version at the upcoming Live 2013 workshop. I’ve prepared a short video illustrating some of the capabilities of the system.  The video is cut short because of time limitations of the workshop, but it shows something roughly similar to one of Brett Victor’s demos of editing and interacting with general purpose code.

It is possible to select any subexpression in the code and evaluate it. In Smalltalk, it is customary to insert code snippets that illustrate how to use an API inside comments. This is possible here as well, but unlike Smalltalk, the snippets can make use of local variables and instance methods. It is also possible to step through code as in a debugger (in the interest of full disclosure, that bit is a tad flakey at the moment; this is very much a work in progress).

There are many things that need improvement, some of which you can see in the demo. Combining the debugger and method editor brings challenges. If you hit return, is that just a newline, or do you evaluate the code, and/or save it? In a classic REPL, you are not editing permanent code and so neither formatting nor saving are a concern, and each return evaluates the current line and moves to a new one. In contrast, in an editor, return is just formatting, and saving is a distinct operation. Our current approach is to keep all three operations distinct. However, it would be convenient to have keyboard shortcuts for evaluation and for evaluate and return. That would make the kind of interaction shown in the demo smoother. So would maintaining the selection across evaluations.

There are also various nice features that aren’t illustrated in the demo due to lack of time. When an evaluation prints out its result the printout is a link to an object inspector on the result, where further evaluation can take place in the context of that object.  This is a feature inherited from the existing Newspeak object inspectors.

It is important to understand that you can do all this on any method of any class, whether you view it in a class browser, or in a list of senders or implementors etc. The goal of this effort is to completely eradicate any code view that does not support such live interaction. There are still some parts of the system where this has not yet been done, but a few more weekends and this will be addressed.

If you get the latest Newspeak VM and the experimental image you can play with the extension I’ve described, though you need to be comfortable with Newspeak. Otherwise, you will probably provoke some of the many bugs in the prototype.

The goal is to get this into the production Newspeak IDE in the not too distant future. There is a good deal of work before we get there, and huge potential for improvement. Issues include efficiency (each method browser is potentially a thread) and the quality of the exemplar data displayed. There are interesting ways to improve the quality, including bidirectional linkage with unit tests and type annotations. There is probably scope for a masters of PhD thesis depending how far one wants to take it all.

While it is a lot easier to do this sort of work in the Newspeak environment,  the lessons learned pertain to other systems as well. 

Saturday, January 19, 2013

Inheriting Class


I wanted to share a nice example of class hierarchy inheritance. Regular readers of this blog are no doubt familiar with the notion of virtual classes and class hierarchy inheritance. For those who aren’t, I’ll recap. Class hierarchy inheritance is exactly what it sounds like: the ability to inherit from an entire class hierarchy rather than a single class.   

Nested classes, as originally conceived in Beta, were treated as dynamically bound properties of instances, just like methods. If you do that, overriding classes (aka virtual classes) is an automatic consequence of nested classes.  If you miss this crucial point, you can easily end up with nested-classes design that has a great deal of complexity and almost no benefits (cf. Java).

The notion of class hierarchy inheritance has been around for many years, but has not yet caught on in the mainstream.  It is supported in a few languages: Beta, gBeta, Newspeak and several research projects.

Tangent:  Apologies to any languages I’ve not explicitly mentioned. This is a blog, not a scientific paper, and exhaustive citations should not be expected.

The research on this topic has too often been focused on typechecking, under the rubric of family polymorphism.  Fortunately, one can make excellent use of class hierarchy inheritance without worrying about complicated type systems.

Note:  Virtual classes are distinct from virtual types.

To be honest, while class nesting has proven to be useful on a daily basis, class hierarchy inheritance occurs pretty rarely. The biggest advantage of late binding nested classes is not class hierarchy inheritance, but polymorphism over classes (acting as instance factories, ergo constructors), which promotes modularity.   

Nevertheless, class hierarchy inheritance can be very useful at times. And since it comes for free, we might as well use it.

A classic example in the research literature is extending a class Graph that has nested classes like  Node and Edge. In Newspeak this would look roughly like this:
class Graph () (
   class Node ...
   class Edge ...
)

One can introduce a new subclass of GraphWeightedGraph, modifying Edge to hold the weight etc.

class WeightedGraph = Graph ()(
   class Edge = super Edge ( | weight ::= 0. | ) ()
)

In WeightedGraph, the class Edge inherits from the class Edge in the superclass GraphWeightedGraph’s Edge adds a slot (aka field), weight, initially set to zero.  Overriding of classes works just like overriding methods, so all the code that WeightedGraph inherited from Graph continues to work, except all uses of Edge refer to the weighted subclass.

In this post, I wanted to mention another nice example - one that arose in practice. Some former colleagues had implemented an external DSL on top of Newspeak, and naturally wanted to provide the nice IDE experience of Newspeak for their DSL as well. In particular, the debugger needed to support the DSL.

For the most part, the debugger is independent of the exact source language it is displaying. The system picks up the source code for each executing method and highlights the current call. However, difficulties arise because some methods created by the DSL implementation are synthetic. At run time, these methods have corresponding stack frames which should never be displayed. We need a small tweak to the debugger UI, so that these synthetic frames are filtered out.

The Newspeak debugger UI is implemented via a Newspeak module (a top level class) that contains classes responsible for the UI of the debugger, which in turn handles the UI of individual stack frames and of a stack as a whole. The debugger uses the Hopscotch UI framework; I’ll summarize salient characteristics here. Hopscotch applications consist of a presenter (the view of MVC), a subject of the presentation (roughly what some refer to as a ViewModel in MVVM) and model (a name everyone agrees on).  And so, our UI includes a ThreadPresenter and a ThreadSubject (whose model is the underlying thread) and a  number of ActivationPresenters and ActivationSubjects (whose model is the context object representing an individual stackframe). The presenters and subjects listed above are all declared within the Debugger class, which is nested in the top-level class (aka module definition) Debugging.

All we need then, is a slight change to ThreadSubject so it knows how to filter out the synthetic frames from the list of frames.  One might be able to engineer this in a more conventional setting by subclassing ThreadSubject and relying on dependency injection to weave the new subclass into the existing framework - assuming we had the foresight and stamina to use a DI framework in the first place. We’d also need to rebuild our system with two copies of the debugger code, and in general be in for a world of pain.

Fortunately,  the Newspeak IDE is written in Newspeak and not in a mainstream language, so these situations are handled easily.  Dependencies that are external to a module are always explicit, and internal ones can always be overridden via inheritance.

So you subclass the Debugging module and override the ThreadSubject class so that it filters its list of activations.

class FilteredDebugging = Debugging () (
   class Debugger = super Debugger () (
      class ThreadSubject = super ThreadSubject () (
        ... code to filter activations ...
     )
   )
)

You can define FilteredDebugging in the context of your complete DSL application.  Or you could define it is a mixin, and just apply it to Debugging in the context of the DSL application.

No DI frameworks, no copied code, no plugin architecture that nobody can understand, and no need to have foreseen this circumstance in advance. It really is quite simple.

Saturday, November 17, 2012

Debug Mode is the Only Mode


There has been a fair amount of discussion recently surrounding some of Bret Victor’s talks and blog posts. If you haven’t seen these, I recommend them highly - with a grain of salt.

These pieces make important points related to programming and programming environments, and are beautifully done. 

They also relate to education and other matters which I will not discuss here.

Because of their exquisite presentation, they’ve elicited far more attention than others making similar points have garnered in the past. This is a good thing.  However, beneath the elegant surface, troubling questions arise.

The demos Victor shows are spoiled by the disappointing realization that we are not seeing a general purpose programming environment that can actually work these miracles for us. Instead, they are hand crafted illustrations of how such a tool might behave. It is a vision of such an environment - but it is not the environment itself. Relatively little is said about how one might go about creating such a thing for the general case - but there are some hints.

We should take these ideas as inspiration and see what one might do in practice. I expect this is one of the things Victor intends to achieve with these presentations. 

Victor recognizes that many of his examples depend on graphical feedback and don’t necessarily apply to other kinds of programming.  However, his use of traces and timelines  is something we can use in general. In one segment, the state during a loop gets unrolled automatically by the programming environment  - morphing time into space so we can visualize the progress (or lack thereof) of the computation.

This specific example might be handled in existing debuggers using a tail recursive formulation of loops - without tail recursion elimination! Then the ordinary view of the stack in a debugger could be used - though the trace based view may have advantages in terms of screen real estate, since we need not repeat the code. Those advantages will apply to any recursive routine, so adding an unfolded view of a recursive call (or a clique of such calls) is a small concrete step one might want to investigate.

Traces that show all the relevant data are intrinsically connected to time traveling debugging, because we want more than selective printouts - we want to be able to explore the data at any point in the trace, following the object graph that existed at the traced point where ever it may lead us.

I firmly believe that a time traveling debugger is worth more than a boatload of language features (especially since most such boatloads have negative value anyway).  

When I first saw Bil Lewis’ Omniscient Debugger, I tried to convince my management to invest in this area. Needless to say, I got nowhere.  

The overall view is that a program is a model of some real or imagined world that is dynamic and evolving. We should be able to experiment on that model and observe and interact with any part of it. One should be able to query the model’s entire history, searching for events and situations that occurred in the past - and then travel back to the time they occurred - or to a time prior to the occurrence, so we can preempt the event and change history at will.

The query technology enabled by a back-in-time debugger could also help make the graphical demos a reality. You ask where in my code did I indirectly call the invocation that  wrote a given pixel. It’s a complex query, but fundamentally similar to asking when did a variable acquire a given value.

There is a modest amount of work in this area, some of it academic, some commercial (forgive me for not citing it all here), but it hasn’t really taken off. It is challenging, because programs generate enormous amounts of transient data, and recording it all is expensive. This gives a new interpretation to the phrase Big Data. Data is however central to much of what we do, and data about programs should not be the exception. 

A related theme is correlating data with code by associating actual values with program variables.  One simple example of the advantage of having values associated with variables is that we can do name completion without recourse to static type information. We get the connection between variables and their values in tools like workspaces, REPLs, object inspectors and (again!)  debuggers, but not when viewing program text in ordinary editors or even in class browsers.  

In Newspeak and Smalltalk, developers sometimes build up a program from an initial sketch using the debugger, precisely because while debugging they can see live data and design their code with that concrete information in mind. You’ll find an example of this sort of thing starting around 19:10 in Victor’s talk, where an error is detected as the code is being written based on runtime values.

The Self environment shows one way of achieving this integration of live data with code. Prototype based languages have a bit of an advantage here because code is tied to actual objects - but once we are dealing with methods that take parameters we are back to dealing with abstractions just like class or function based languages.  

You might even be tempted to say that Javascript, being a prototype based language, is a modern incarnation of Self. Please don’t drink and drive; at best this is a cautionary tale on the theme “be careful what you wish for”.

We need a process that makes it easier to go from initial sketches to stable production code. We’d like start from workspaces and being able to smoothly migrate to classes and unit tests. This is in line with the philosophy expressed in this paper for example.

It seems to me that the various tools such as editors, class browsers,  object inspectors, workspaces, REPLs and debuggers create distinct modes of operation. It would be great if these modes could be eliminated by integrating the tools more tightly.  There would always be a live instance of your scope associated with any code you are editing, with the ability to evaluate incrementally as you edit the code (as in a REPL) and step backwards and forwards as in a time traveling debugger.  The exact form of such a tool remains an unmet UI design challenge.

All of the above holds regardless if whether you are doing object-oriented or functional programming (a false dichotomy by the way) or logic programming for that matter.

Tangent: I'm aware that the notion of debugging in lazy functional languages is problematic. But the need for live data and interactive feedback remains. And once a interactive computation occurred, the timing has been fully determined. So while stepping forward may be meaningless, going back in time isn't.

We should stop thinking of programs as just code. The static view of code, divorced from its dynamic extent, has been massively overemphasized in the PL community. This needs to change, and it will.

Thursday, August 23, 2012

Newspeak on Dart

We've wanted to run Newspeak to run in a web browser for a long time - actually, since before the Newspeak project started.  The need for better programming languages for the internet platform has been evident for a while.  In a better world, the Newspeak project would be focused on that from day 1.  

However, this world is suboptimal.  It was only in 2010 that I spent some time sketching out the first version of NS2JS, the Newspeak-to-Javascript compiler. It was a toy, but in 2011, Vassili Bykov managed to bring it to the verge of self hosting. Performance, however, was abysmal.  Having since joined the Dart project and learned more and more about the challenges Javascript poses as a compilation target (or anything else) I guess this really should not have been a surprise.

We were able to reuse a part of that effort this summer when Ryan Macnak came to Google in Mountain View as an intern with the Dart team. His mission: NS2Dart, a Newspeak to Dart compiler. A detailed report is available at

The results are promising, though there is still a long way to go. Despite considerable losses due to impedance mismatches, on the Dart VM, NS2JS comes within a factor of 2 of the Squeak based Newspeak implementation (NS2Squeak). Given that the current version of NS2Squeak is twice the speed it used to be (thanks to Cog)  it looks like performance is tolerable already. And since Dart will get much faster (the numbers in the tech report are already out of date), the future looks bright.

Now, when someone says the future looks bright, you should be getting nervous. So, just sign the dotted line, and then I'll get to the caveats ...

The current version relies on access to Dart VM's embedding API, which means you cannot run it in the browser. Hopefully, over time enough functionality will get added in the way of mixins and mirror builders that the same results can be accomplished in pure Dart. That is not yet certain, but let's retain the uncharacteristically optimistic tone of this post, and assume that does happen.

Since most browsers are likely to run Dart via translation to Javascript for some time, we need to look at the NS-Dart-JS pipeline: compiling Newspeak via NS2Dart, and feeding the result to dart2js (The Dart to Javascript compiler) to obtain Javascript code that runs in any browser.

Early on, NS2Dart could run on top of dart2js (by cheating on things like dynamic mixin application) and results were much better than with NS2JS. This is to be expected: dart2js does the heavy lifting for us.  The dart2js compiler already has 4-5 times as much code as NS2JS and is being developed by a very skilled team who understand JS performance inside and out.  Furthermore, it has taken a while to get to this point and performance work is by no means done yet.  So I don't feel too bad about our efforts on NS2JS.

Nevertheless, there is a lot of uncertainty over how well Newspeak can run on top of current Javascript implementations. A key problem area are non-local returns (NLRs). These are essential for user defined control constructs. They are not supported in Dart precisely because it is not clear how efficiently they can be implemented on top of Javascript.

Mirror support on NS2Dart is very partial - the mapping to Dart's mirrors is pretty clean so far, but Dart mirrors are a work in progress, and so far only cover introspection. 

Then there remains the small matter of the UI. I'd love to see the Hopscotch GUI on the browser, but that is a ton of work as well.  We'll see what UI solutions shape up in that space - we can always just call out to whatever UI Dart exposes (Ryan also implemented a Dart Alien API). So altogether, it might be a tad premature to declare victory, but we are making progress.

The dream remains to get a fully functioning system, Hopscotch, IDE and all that works well in the browser. However, I can imagine that many applications could get by with a lot less.  

The details are in Ryan's report and the code is in the Newspeak repo, where you can also get the latest Newspeak images with which to view it (I recommend you do that rather than use the old release; a new release will be out very soon). Thanks are due  to Vassili (whose work on NS2JS made this possible)  and to my colleagues on the Dart team, who graciously supported this effort in various ways, and most of all to Ryan.

Altogether, getting a top quality programming experience on the web requires a major effort.   Dart is beginning to make this possible.