The structure of a Mathematica package

I like to build tools to make programming easier.

One of my latest tools is a useful function that plots the "call graph" of a Mathematica package: each vertex is a function, and each edge represents the fact that one function calls another. It's a neat way to understand the structure of a large body of code without "chasing down" function calls by hand.
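
Here is a minimal sketch of how the edges of such a graph might be recovered from a symbol's definitions. It is not the tool's actual code: the helpers rhsOf and callsQ and the three toy functions are names made up for illustration, and the sketch only looks at DownValues.

```mathematica
(* Toy definitions standing in for a package's functions (all hypothetical). *)
square[x_] := x^2
norm2[x_, y_] := Sqrt[square[x] + square[y]]
fact[0] = 1;
fact[n_] := n fact[n - 1]

(* Right-hand sides of f's definitions, wrapped in Hold so they stay unevaluated. *)
rhsOf[f_Symbol] := Cases[DownValues[f], (_ :> rhs_) :> Hold[rhs]]

(* f "calls" g if g appears anywhere in one of f's right-hand sides. *)
callsQ[f_Symbol, g_Symbol] := ! FreeQ[rhsOf[f], g]

fns = {square, norm2, fact};

(* One directed edge per caller/callee pair; a self-loop marks recursion. *)
edges = DirectedEdge @@@ Select[Tuples[fns, 2], callsQ @@ # &];

Graph[fns, edges, VertexLabels -> "Name"]
```

Only the right-hand sides are searched because a function's own name always appears on the left-hand side of its definitions, which would otherwise make every function look recursive.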

For example, the graph below* shows the structure of a machine learning package I'm writing. The size of each disk shows the function's code size -- bigger disks for more complex functions. Interestingly, Mathematica allows me to measure this directly, rather than using the usual proxy of the number of lines in the source code for a function. By taking LeafCount of the DownValues of a symbol, I get the number of indivisible parts (leaves) in the abstract syntax tree of a function.
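
As a small illustration of that measurement, here is a sketch on two toy definitions. The name codeSize and both functions are hypothetical, not taken from the package.

```mathematica
(* Two toy definitions (hypothetical, not from the package). *)
square[x_] := x^2
norm2[x_, y_] := Sqrt[square[x] + square[y]]

(* DownValues returns the stored rules; LeafCount counts the atoms in them. *)
codeSize[f_Symbol] := LeafCount[DownValues[f]]

{codeSize[square], codeSize[norm2]}
```

The count includes the pattern wrappers on the left-hand side of each rule, so it is best read as a relative measure: norm2 should score higher than square.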

Other visual elements also communicate important details: red disks are public functions -- functions that the package exports to external code. White disks, conversely, represent functions imported from other packages. And lastly, recursive functions (by definition) are those vertices with a circular edge that loops back to themselves.

*In case you're wondering, the star in the lower left corner is a set of functions that deal with measuring the error of a classifier on a given data set -- this functionality is independent enough that it doesn't need to call any of the other code in the package. The triplet on the lower right is a set of utility functions.

What if Shakespeare had been a composer?

At the NKS summer school, where I was an instructor, I helped students from diverse backgrounds to implement their ideas in Mathematica. One of my favorite projects was working with the artist Elizabeth Latta to visualize and computationally explore the famous play The Tempest, by William Shakespeare.

Among the many promising ideas we investigated, two in particular were interesting enough that I'd like to show them off. The beautiful drawings are Elizabeth's work, and I planned and wrote the code.

The first was a technique that used a network of the major characters in the play to indicate the relative importance of their interactions. In this visualization, each character is a node in the network and each interaction is an edge. To be precise, the thickness of the gray bond between two characters represents how often they speak lines in the presence of one another. The size of each portrait indicates the total number of lines each player has. So main characters should appear large, and have strong connections to their frequent stagefellows.
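
A rough sketch of how those two quantities could be computed follows. The scene data is made up, sumByKey and pairLines are hypothetical helper names, and "in the presence of one another" is approximated by "both speak in the same scene", which is an assumption rather than the method actually used.

```mathematica
(* Made-up toy data: each scene is a list of {speaker, lineCount} speeches. *)
scenes = {
   {{"Prospero", 12}, {"Miranda", 5}, {"Prospero", 8}},
   {{"Prospero", 4}, {"Ariel", 9}},
   {{"Miranda", 3}, {"Ferdinand", 6}, {"Miranda", 2}}
   };

(* Sum the counts attached to each distinct key in a list of {key, count} pairs. *)
sumByKey[pairs_List] := {#[[1, 1]], Total[#[[All, 2]]]} & /@ GatherBy[pairs, First]

(* Portrait size: total lines spoken by each character. *)
lineTotals = sumByKey[Join @@ scenes]

(* For one scene, every pair of speakers together with the lines they speak in it. *)
pairLines[scene_List] := Module[{totals = sumByKey[scene]},
  {Sort[#[[All, 1]]], Total[#[[All, 2]]]} & /@ Subsets[totals, {2}]]

(* Bond thickness: shared-scene lines summed over the whole play. *)
bondWeights = sumByKey[Join @@ (pairLines /@ scenes)]
```

Sorting each pair of names gives every bond a single canonical key, so the same two characters meeting in different scenes accumulate into one weight.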

The second technique isn't quite a visualization, it's an "auditization". The idea is that each character is assigned a note, and the play is simply, um, played -- every time a character speaks, that character's note is played. A note is sustained for an amount of time proportional to the number of lines he or she is saying. One last tweak is that I use Mathematica's pattern matching to ensure that long streams of repetitive, boring note patterns are elided somewhat. The general effect is passable music, or at least quite different from most algorithmic music. Take a listen! Just remember, you're listening to the play at about 30x speed!
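
The recipe can be sketched in a few lines. Everything specific here is an assumption made for illustration: the speech data, the note assignment, the tempo, and the particular elision rule (drop the third of three back-to-back exchanges between the same two speakers).

```mathematica
(* Made-up toy data: the play as a sequence of {speaker, lineCount} speeches. *)
speeches = {{"Prospero", 2}, {"Miranda", 1}, {"Prospero", 3}, {"Miranda", 1},
   {"Prospero", 2}, {"Miranda", 2}, {"Prospero", 1}, {"Ariel", 4}};

(* Hypothetical note assignment: one pitch per character. *)
noteOf = {"Prospero" -> "C", "Miranda" -> "E", "Ariel" -> "G"};

(* Elide repetition: when the same two speakers alternate three times in a row,
   keep the first two exchanges and drop the third; repeat until nothing matches. *)
elided = speeches //. {a___, s1 : {x_, _}, s2 : {y_, _}, s3 : {x_, _}, s4 : {y_, _},
     {x_, _}, {y_, _}, b___} :> {a, s1, s2, s3, s4, b};

(* One note per speech, held for a time proportional to its length in lines. *)
secondsPerLine = 0.25;
Sound[SoundNote[#[[1]] /. noteOf, secondsPerLine * #[[2]]] & /@ elided]
```

Because the same pattern variables x and y recur inside one rule, the pattern matcher itself enforces "same two speakers", so no explicit loop or comparison is needed.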

(download)

One of the things that I think makes this idea work is that a play must already conform to a particular grammatical, scene, and dialogue structure -- a structure that leaves traces on the way the players share the stage. The rhythm of back and forth between two sparring characters cannot be too lopsided. Multiple characters have to be orchestrated carefully to move the plot forward. Scene changes must permute the characters if there is to be any extended tension. Characters and motifs recur throughout the play as plans are hatched and executed. All of these forces, and others, make the non-local structure of the dialogue interesting -- simultaneously familiar, and unpredictable -- the very same properties a score must have if it is to be interesting to the ear.

I'm going to try this idea out on other Shakespeare plays, and other plays in general, to see if I can discern major differences in the music they can generate.

Graphing your social graph

Everyone talks about social graphs and their value to large companies and advertisers, but where are the actual pictures for individual users? I got to thinking about how one could visualize the "local" part of an online social network -- just your friends and followers and their relationships -- and after a few weekends of tweaking and fiddling, I've got a nice Mathematica notebook that does all this and more.

For example, here's what my Twitter user account looks like:

You'll notice that I do not appear in the graph. I already have an explicit or implicit relationship to everyone in the graph, and so including me would distort the graph layout without adding any information at all.

Okay, what do all these visual elements communicate?

  1. lines indicate users' relationships to each other: solid lines indicate mutual relationships, whereas dotted lines indicate one-way relationships -- the dotted end is the party who doesn't follow back.
  2. disk size shows tweet frequency: the bigger the disk, the more frequently a user tweets.
  3. color indicates a user's relationship to me: gray for users I follow that follow me back, blue for users who don't follow me back, and pink for losers, I mean users, whose overtures I don't return. Just kidding about the loser part.
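
The rules in items 1 and 3 boil down to a few membership tests on the follow relation. The following is a sketch on made-up data, with "me" standing in for the account owner; followsQ and colorOf are hypothetical names, and nothing here touches the Twitter API.

```mathematica
(* Made-up toy data: {follower, followed} pairs, with "me" as the account owner. *)
follows = {{"me", "ann"}, {"ann", "me"}, {"me", "bob"}, {"cat", "me"},
   {"ann", "bob"}, {"bob", "ann"}, {"cat", "ann"}};

followsQ[a_, b_] := MemberQ[follows, {a, b}]

(* Item 1: solid lines for mutual pairs, dotted for one-way, with "me" left out. *)
others = DeleteCases[follows, {___, "me", ___}];
mutual = Union[Sort /@ Select[others, followsQ[#[[2]], #[[1]]] &]]
oneWay = Select[others, ! followsQ[#[[2]], #[[1]]] &]

(* Item 3: disk color from each user's relationship to "me". *)
colorOf[u_] := Which[
  followsQ["me", u] && followsQ[u, "me"], Gray,
  followsQ["me", u], Blue,
  True, Pink]

{#, colorOf[#]} & /@ {"ann", "bob", "cat"}
```

The final Pink branch assumes every user in the graph is either followed by "me" or follows "me", which holds for a graph built from friends and followers only.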

Or, if you prefer a visual dictionary, try to figure this out (hint: I'm green)!

[Figure: visual dictionary of the graph's elements]

In Mathematica, it's really easy to create interactive visualizations. For instance, it takes very little effort to annotate the graph nodes with tooltips that describe an individual user, showing their avatar, follower information, and latest tweets (loaded on demand). Here's what one of these tooltips looks like:

[Figure: screenshot of a user tooltip]

But this is only the tip of the iceberg. One can easily visualize conversations between users by simply mousing over the edge that connects them. One can click on a user to tweet at them or to go straight to their Twitter page. One can weight edges with the frequency of messages exchanged between two users. And so on. With a powerful functional language like Mathematica and its rich set of dynamic UI elements, it's very easy to take a UX or UI idea and just prototype it, often going from an idea to an implementation in a matter of minutes.

I'll leave you with a gallery of some of my Twitter friends:

I've been dwelling on the kinds of intuitions developed by anyone who studies mathematics. They're often simple rules of thumb and ways of thinking about mathematical objects. In fact, they are so obvious that once you've internalized them, it often doesn't occur to you to articulate them again.

For example, one very basic component of real analysis is function composition, something that is probably taught early on in high school.

What does it mean to compose functions? How does one reason about a compound function? I got to thinking about how one might go about helping a student to develop their intuition about these questions. It occurred to me that there is a simple visualization technique that answers this exact question without any words at all.

How does this work? Let's say you want to visualize the following compound function:

First, let's consider the syntax tree of this expression. This is a tree in which the root is the entire expression and the leaves are linear or constant functions like x and 1/3. Luckily, with Mathematica it's pretty easy to present an arbitrary expression directly in this tree representation by using the formatting construct TreeForm:

[Figure: TreeForm view of the expression]

My idea is to produce actual function plots for each interesting node in this tree. By moving up the tree we can show how these sub-expressions fit together to compose the entire expression. It turns out that it takes about 15 lines of Mathematica to compile the syntax tree, recognize and extract the interesting nodes, and synthesize the corresponding plots into a graphical diagram.
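
To give a flavor of the first two steps, here is a minimal sketch. It uses a stand-in expression chosen for illustration rather than the example from this post, treats "interesting" as "depends on x" (an assumption), and stops short of arranging the plots into a tree.

```mathematica
(* A stand-in expression chosen for illustration; not the example from the post. *)
expr = Sin[x/3]^2 + 1;

(* The syntax tree described above. *)
TreeForm[expr]

(* Every distinct sub-expression that depends on x, from the leaves up to the root. *)
nodes = Select[DeleteDuplicates[Level[expr, {0, Infinity}]], ! FreeQ[#, x] &];

(* One small plot per node, labelled with the sub-expression it shows. *)
Plot[#, {x, -10, 10}, PlotLabel -> #, ImageSize -> Small] & /@ nodes
```

Filtering on dependence on x drops the constant leaves, whose plots would be flat lines, while keeping x itself as the starting point of every chain of compositions.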

Wrapping this all up into a function called FunctionTreePlot, we can now visualize our example like so:

[Figure: FunctionTreePlot output for the example]

This technique seems to work quite well. You can easily chase visual features of the corresponding plots up and down the tree to answer questions like "why does this function have a pole here?" or "what will be the effect of changing this coefficient?" No doubt this functionality would be a great addition to Wolfram|Alpha's already strong support for visualizing mathematical functions.

Here are a few more examples from my experiments: