Getting started with Clojure and Libvirt, Part 1

December 14th, 2011 by kanaka

In this post I will give a gentle introduction to interacting with and controlling libvirt from Clojure.

But first a couple of definitions:

  • Libvirt is an API/toolkit for interacting with Linux virtualization systems (including KVM/QEMU, Xen, OpenVZ, Oracle VirtualBox, VMWare ESX/GSX/Workstation/Player, Microsoft Hyper-V, LXC, UML, etc). It is a very active project that is widely used by many (maybe most) higher level virtualization platforms.
  • Clojure is a relatively new language that brings the dynamic power of LISP to the huge Java ecosystem. At LonoCloud we are using Clojure to create an operating system for cloud applications (more on that in later posts).

It is a truism that any time you have two awesome (or not so awesome) ideas, eventually somebody will try to put them together. Fortunately in this case, Libvirt provides Java API bindings which means it also has Clojure bindings so combining them is quite easy. Let's get started...

Prerequisites

My assumption is that you have access to a Linux system on which you have administrator/sudo permission.

Familiarity with Clojure and libvirt will be helpful but not critical for following this walkthrough. If you are familiar with libvirt and the libvirt APIs but are new to Clojure then this walkthrough will give you a taste for the power and expressiveness of Clojure.

Install a Java JDK if you don't have one already. On an Ubuntu-like system you can do something like this:

sudo apt-get install openjdk-7-jdk  # 6 for older systems
sudo update-alternatives --config java  # select the one just installed

Install the libvirt library and libvirtd service. On an Ubuntu-like system you can do something like this:

sudo apt-get install libvirt-bin libvirt0

Download the leiningen script to somewhere on your PATH. Here is an example that assumes you have $HOME/bin on your PATH.

wget https://raw.github.com/technomancy/leiningen/stable/bin/lein -O $HOME/bin/lein
chmod +x $HOME/bin/lein

Launch the REPL

Use lein to create a new project skeleton.

lein new clj-libvirt
cd clj-libvirt

Add JNA and libvirt Java bindings to the leiningen project.clj file. It should look something like this:

(defproject clj-libvirt "1.0.0-SNAPSHOT"
    :description "Use Libvirt from Clojure"
    :dependencies [[org.clojure/clojure "1.2.1"]
                    [net.java.dev.jna/jna "3.3.0"]
                    [org.libvirt/libvirt "0.4.7"]])

Use lein to pull down the project dependencies.

lein deps

Start a Clojure REPL using lein. This REPL environment will have all the project dependencies available on the Java CLASSPATH.

lein repl

Libvirt from Clojure

You should now have a Clojure REPL (read eval print loop) running with the "user=>" prompt indicating that we are in the default REPL namespace called "user" and that the REPL is waiting for input. In the following examples I will show the REPL prompt followed by the text to type followed by the output and/or return value resulting from running the command.

Start by importing the Libvirt Connect object.

    user=> (import '(org.libvirt Connect))
    org.libvirt.Connect

Now create a connection to the libvirt mock/test driver, which is an ephemeral, memory-only driver for test purposes. The "Connect." syntax is the Clojure way of calling a Java constructor.


    user=> (def mock-conn (Connect. "test:///default" false))
    #'user/mock-conn

    user=> mock-conn
    #<Connect org.libvirt.Connect@50903025>

Make sure things are working by querying the Libvirt library version. Note that getLibVirVersion is a (Java) method of the connection object we defined above.

    user=> (.getLibVirVersion mock-conn)
    7005

Domains

The test driver automatically starts with a faux virtual machine that is defined and "running". List the running domains (virtual machine instances are called domains in libvirt terminology).

    user=> (.listDomains mock-conn)
    #<int[] [I@4a504ec1>

That's less than useful. This ugly output indicates that we have a Java array of ints. Wrap this in a Clojure sequence, which the REPL knows how to print.

    user=> (seq (.listDomains mock-conn))
    (1)

Much better. This indicates we have one running domain that has an ID of 1. To interact with the domain you need to use one of the Connect methods that returns a Domain object (domainLookupBy*). We know the running domain ID, so use the domainLookupByID method to get a domain object. Then get the name of the domain and info about it.

    user=> (def mock-dom (.domainLookupByID mock-conn 1))
    #'user/mock-dom

    user=> mock-dom
    #<Domain org.libvirt.Domain@5a56b93a>

    user=> (.getName mock-dom)
    "test"

    user=> (.getInfo mock-dom)
    #<DomainInfo state:VIR_DOMAIN_RUNNING
    maxMem:8388608
    memory:2097152
    nrVirtCpu:2
    cpuTime:1322765442068879000
    >

Domain Control

Now stop the domain and verify that it is off.

    user=> (.destroy mock-dom)
    nil

    user=> (.getInfo mock-dom)
    #<DomainInfo state:VIR_DOMAIN_SHUTOFF
    maxMem:8388608
    memory:2097152
    nrVirtCpu:2
    cpuTime:1323714453027037000
    >

Now start it back up again.

    user=> (.create mock-dom)
    0

    user=> (.getInfo mock-dom)
    #<DomainInfo state:VIR_DOMAIN_RUNNING
    maxMem:8388608
    memory:2097152
    nrVirtCpu:2
    cpuTime:1323714627116985000
    >

Getting More Info

The Connection and Domain objects have a number of useful methods. List their method names by mapping the getName method onto the list of methods.

    user=> (map #(.getName %) (.getMethods (class mock-conn)))
    ("finalize" "close" "getType" "getHostName" ...)
    user=> (map #(.getName %) (.getMethods (class mock-dom)))
    ("shutdown" "finalize" "getName" "destroy" ...)

Those are mind-numbingly long lists of methods. For a user-readable reference you will probably want to use the Libvirt javadoc page.

Conclusion

That's all for now. I hope you have seen in this post how easy it is to interact with and control Libvirt from Clojure. Of course, we haven't actually done anything useful yet since the only interaction was with the Libvirt mock/test driver. In a follow-up post I will walk through creating and interacting with real virtual domains instead of just virtual virtual domains.

The Most Important Parts of HTML5

August 6th, 2011 by kanaka

or Why <video> and <audio> are Boring

or The New Web Platform

or An Introduction to HTML5

 

A Little Perspective

The Birth of the Web

20 years ago today (Aug 6th, 1991), Tim Berners-Lee released the World Wide Web on the world while working at CERN. Actually what he released was a program called "WorldWideWeb" which was eventually renamed "Nexus" to clarify the distinction between the concept of the World Wide Web and the browser itself.

Processing arrays using CPS

August 4th, 2011 by Chouser

At work, we're generating JavaScript from another language (which ironically has nothing at all to do with ClojureScript). There are places where we are working with sequences of data, which generally originate with an array or more often arrays nested in other arrays (perhaps with intervening objects). As these sequences get built up and passed around, layers of maps and filters get added until the desired result is correctly described. For example:

    var input = [2, 4, 6, 8];
    var a = map(m3, filter(f2, map(m1, input)));

where m3, f2, and m1 are various application-specific functions. Again, just as an example, let's say:

    function m1(x) { return x / 2; }
    function f2(x) { return x % 2 == 0; }
    function m3(x) { return x * 10; }

That is, we want to divide everything in the input array by two, keep only even numbers from that, and multiply each of those by 10. Of course map and filter aren't standard JavaScript functions, so we have to write them. The first dilemma in writing them is: what should they return? Each could return an array, but then each stage will run through the entire sequence before the next stage starts, which seems undesirable if we can find any better way. Another option would be for each to return a lazy sequence. We've done this, including the required mechanisms for creating and examining Clojure-style lazy seqs. But this ends up creating a closure for every stage of every iteration. This causes us various problems such as memory leaks in buggy JS engines and failure to serialize lazy seqs to JSON.
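To make the first option concrete, here is a minimal sketch of array-returning map and filter. These helper definitions are my own assumption (the post does not show the generated versions); they only illustrate why each stage walks the whole sequence before the next stage starts.

```javascript
// Hypothetical array-returning helpers (not the post's actual implementation).
// Each one loops over its entire input and builds a complete intermediate array.
function map(f, arr) {
  var out = [], i;
  for (i = 0; i < arr.length; i++) {
    out.push(f(arr[i]));
  }
  return out;
}

function filter(pred, arr) {
  var out = [], i;
  for (i = 0; i < arr.length; i++) {
    if (pred(arr[i])) {
      out.push(arr[i]);
    }
  }
  return out;
}

// The application-specific functions from the example above.
function m1(x) { return x / 2; }
function f2(x) { return x % 2 == 0; }
function m3(x) { return x * 10; }

var input = [2, 4, 6, 8];
// Three stages means three separate loops and two throwaway intermediate arrays.
var a = map(m3, filter(f2, map(m1, input)));
console.log(a); // [ 20, 40 ]
```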

So what other options exist? I did say we're generating this JavaScript, so it's not actually a requirement that the code look as tidy as the snippet above, which puts continuation-passing style on the table. CPS is a style of API in which we don't use the return value of a function, but instead pass in a function to be called once the result is known. If this reminds you of event-handling systems, then good for you:

    var a = [];
    forEach(input, function(x1) {
      cps_map(m1, x1, function(x2) {
        cps_filter(f2, x2, function(x2) {
          cps_map(m3, x2, function(x3) {
            a.push(x3); })})})});

A few things to note here:

  1. The order in which the steps appear is reversed here compared to the earlier one-liner. The ordering here feels more imperative and is arguably more comfortable. "For each of these, map with m1, then filter with f2, then map with m3, and finally push the results into array a"
  2. The cps_map and cps_filter functions are not identical in responsibility to the earlier map and filter functions. Specifically, these do not have loops of their own but expect to be called repeatedly if necessary by earlier steps. This isn't part of translating them to CPS, but rather part of trying to avoid looping across the same sequence multiple times.
  3. Despite points 1 and 2, we still have abstracted out the loop, map, and filter work so that we can refer to these operations by their names instead of re-implementing them. That is, we didn't have to write our own for loop for this case and instead are still using higher-order functions.
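The CPS snippet above relies on forEach, cps_map and cps_filter without showing them. A minimal sketch of what they might look like follows; the bodies are assumptions made for illustration, chosen to match the responsibilities described in point 2 (no loops of their own, and the continuation is called only when there is a value to pass on).

```javascript
// Hypothetical definitions; only forEach contains a loop.
function forEach(arr, k) {
  for (var i = 0; i < arr.length; i++) {
    k(arr[i]);
  }
}

// Apply f to a single value and hand the result to the continuation k.
function cps_map(f, x, k) {
  k(f(x));
}

// Call the continuation k only if the single value x passes the predicate.
function cps_filter(pred, x, k) {
  if (pred(x)) {
    k(x);
  }
}

function m1(x) { return x / 2; }
function f2(x) { return x % 2 == 0; }
function m3(x) { return x * 10; }

var input = [2, 4, 6, 8];
var a = [];
forEach(input, function (x1) {
  cps_map(m1, x1, function (x2) {
    cps_filter(f2, x2, function (x2) {
      cps_map(m3, x2, function (x3) {
        a.push(x3);
      });
    });
  });
});
console.log(a); // [ 20, 40 ]
```

With these definitions a filtered-out element simply never reaches the later stages, which is how the pipeline avoids building intermediate arrays.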

This solution doesn't walk through the sequence more than once, since only forEach has a loop. But does it still create a closure for each iteration of each step? Well, that might depend. Technically the anonymous functions aren't closures since they don't access anything except their arguments and globals, but I wouldn't be surprised if some JS engine allocated a new instance each time anyway. We could re-write it so that all the functions are named and defined at the outer scope, but that would damage the readability even more. So instead of pursuing that route, let's try out Google Closure compiler's advanced optimizations instead:

    for(var a = [], b = [2, 4, 6, 8], c = 0;c < b.length;++c) {
      var d = b[c] / 2;
      d % 2 == 0 && a.push(d * 10)
    }

No closures, in fact no functions defined at all. Just one tight for loop. That's nice.

If you want to try this yourself, start with the full code for my example, and paste it into GClosure's compiler.

Regarding Cloud Storage

July 26th, 2011 by kanaka

This post is inspired by a blog post by Larry Stewart about "Connected-only devices" (e.g. Google Chromebook) and the problem with cloud/remote storage. Larry argues that for many types of data there are fundamental reasons why local storage is better than storing data in the cloud. His reasons are:

  • Storage is cheap, communications are not
  • Storage is low power, communications are not
  • Local storage always works, communications does not
  • My use of local storage is private, in the cloud there are watchers
  • Local operations have predictable performance, remote does not

Larry is correct about the current problems with "Connected-only devices" and cloud storage and he has identified some real problems/costs of cloud storage vs local storage. I'm going to go further and identify some additional costs involved in the local vs cloud storage equation. These other costs are more dominant (in determining what choices people will make) and by considering them I think they reveal an interesting trend:

There is a cross-over point where cloud data becomes cheaper than local data.

The cross-over point varies for different people and different types of data and in many cases it has already been crossed.

Storage/Network costs:

First let's look at some of the costs that Larry identified. It is well known that transistor density is currently doubling every 24 months or so. This principle is known as Moore's Law. Less well known is Butters' Law which states that the amount of data able to be sent per optical fiber doubles every 9 months and Nielsen's Law which states that the maximum bandwidth available to home users will double every 21 months. On the storage side is Kryder's Law which states that storage density doubles every 12 months.
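To get a feel for what these doubling periods mean, here is a small sketch that turns each one into a growth factor over a fixed horizon. The doubling periods are the ones quoted above; the 60-month horizon and the function name are arbitrary choices for illustration.

```javascript
// Growth factor after `months` for a quantity that doubles every `periodMonths`.
function growthFactor(periodMonths, months) {
  return Math.pow(2, months / periodMonths);
}

// Doubling periods in months, as quoted in the text above.
var laws = {
  "Moore (transistor density)": 24,
  "Butters (data per optical fiber)": 9,
  "Nielsen (home user bandwidth)": 21,
  "Kryder (storage density)": 12
};

var horizon = 60; // illustrative horizon: five years
Object.keys(laws).forEach(function (name) {
  var factor = growthFactor(laws[name], horizon);
  console.log(name + ": about " + factor.toFixed(1) + "x after " + horizon + " months");
});
```

All four curves are exponential; they differ only in how steep they are.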

Network and storage capacities are following exponential curves just like transistor densities. Even if network capacity and user bandwidth were doubling at a much, much slower rate than storage capacity there would still be a cross-over point where cloud data becomes cheaper (in practice and for most people) than local data. This is because the most important factors to consider are not the raw resource costs, but rather the time cost and the cost of convenience lost.

A few thoughts before moving on:

  1. The doubling of storage density affects the cost of cloud storage too, so it's not a cut and dried Moore vs Kryder argument.
  2. The speed/latency of disk storage is not keeping up with capacity. Cloud storage systems often have a better answer for this than is available with local storage. For example, before the advent of Gmail, we used to have to wait for email searches to complete.
  3. The relevant question is not "How much am I paying per bit?" but rather "How much is it costing me to keep my email/pictures/music (insert data/media type) stored locally vs stored in the cloud?"
  4. When I'm talking about cloud storage, I'm not referring to a remote cloud hard drive service like Dropbox, I really mean a cloud service that is designed around a certain type of data such as Gmail for email or Github for source code, etc.

Time Cost:

Both local data and remote/cloud data have a time cost. The time cost for remote data is primarily waiting for access to my data to download/cache and be available/usable. The primary time cost for local data is time spent managing that data: backups, upgrades, organizing, permissions, etc.

The time cost of remote data is decreasing on an exponential curve (the smaller the file/media type, the sooner this becomes essentially a negligible cost). The time cost of managing local data certainly is not decreasing at anywhere near the same rate.

Cost of Convenience Lost:

Both local data and remote/cloud data also have a lost convenience cost. For remote data, this lost convenience is any time I don't have reasonable Internet connectivity (or the service is down). Finding yourself in a location without Internet access is extremely annoying, but imagine how much worse the situation was just three years ago.

The convenience cost for local data is that few people have access to this local data once they leave their homes. If they do have access, then they have probably either turned their home into a personal remote data service (time cost) or they duplicate their data to all their mobile devices (time cost).

There is also a privacy/security cost, but unfortunately, I think for most people it is very hard to quantify and therefore irrelevant for their day-to-day decisions. Also, it could be argued that the average person's mis-managed Windows PC might be more exposed than the average cloud provider.

The Cross-Over Point(s):

The cross-over point has already happened for most Internet users with email, bank records, contact lists, etc. Anyone remember POP3 email?

For most younger people this cross-over has also already happened with pictures (Facebook, Picasa, etc) and music (Pandora, last.fm, Apple Cloud, Google Music, etc). For the younger crowd (and other early adopters) the cross-over will happen soon (if it hasn't already) with documents (Google Docs, Office 365), presentations (Scribd, Google Docs, YouTube), videos/movies (YouTube, Hulu, Netflix, etc), etc, etc.

Large connected-only devices like the Google Chromebook will reach the cross-over point much later because they add a significant additional cost to the equation: weight/reduced portability. It's certainly possible that large connected-only devices may be so premature that they will die and not be resurrected until well beyond when they would have otherwise reached the cross-over point of cost-effectiveness.

The trend towards remote/cloud storage is already well underway. Nearly everyone who uses the Internet uses cloud storage in one form or another and they will be using much more of it in the near future.

"The future is already here — it's just not very evenly distributed."

---

PostScript:

I knew Larry while I worked at SiCortex. I have met few software engineers that are more brilliant than Larry. Whenever I think about Joel Spolsky's famous post about hiring people that are "1. Smart and 2. Get things done", Larry is one of the people that comes to mind for me. Thank you, Larry, for unknowingly spurring me on to write something I've been wanting to write for a while now.

A hack to rescue Super+P from gnome-settings-daemon

May 8th, 2011 by agriffis

I like using tabbed terminals, and I like using Super+N ("next") and Super+P ("previous") to switch between the tabs. These bindings have worked well for years, but recently Super+P stopped working for me. Google helped me find two related bug reports:

The nutshell: Recent gnome-settings-daemon unconditionally binds Super+P to rescan the video outputs and possibly reconfigure them. (If you're wondering, "Super" is the Win key, usually located between Ctrl and Alt.) The reason gnome-settings-daemon creates this binding is that newer PC BIOSes send this keystroke when they detect a monitor being plugged into, or unplugged from, the computer. And why do they do that? Because that's the keystroke in Windows 7 to bring up the display configuration dialog! So gnome-settings-daemon is simultaneously catching the BIOS announcement of changes to the attached displays, and also protecting your applications from a stray keystroke (that would most likely result in a "p" being inserted into the blog post you're writing).

I can't blame the authors of gnome-settings-daemon — this situation is really Microsoft's fault — but in any case, I'd rather handle video configuration changes without the extra help (I can launch the dialog on my own, thanks) and I can tolerate the tiny risk of a "p" turning up somewhere I didn't expect. I just want to be allowed to bind Super+P to change terminal tabs as I have in the past.

So here's a hack that gets the job done. The trick is to create a keybinding before gnome-settings-daemon starts, thereby blocking gnome-settings-daemon from grabbing it. After gnome-settings-daemon starts, release the binding so it's available to other programs, in my case the terminal emulator. To accomplish this I'm using xbindkeys, which is available in Fedora and Ubuntu.

Most of the time the distribution starts GNOME directly; i.e. gdm launches gnome-session for you. In that case you have no opportunity to run xbindkeys before gnome-settings-daemon runs. However, you can customize the startup of your X applications by creating a $HOME/.xsession script and configuring your distro to use it. For Ubuntu instructions, see help.ubuntu.com; on Fedora you need to yum install xorg-x11-xinit-session then choose "User Script" at the login screen.

The minimal .xsession to rescue Super+P from gnome-settings-daemon is:

#!/bin/sh
xbindkeys -f $HOME/.xbindkeysrc-super+p
exec gnome-session

with the accompanying configuration:

# $HOME/.xbindkeysrc-super+p
"pkill xbindkeys"
 mod4 + p

So xbindkeys grabs Super+P before gnome-settings-daemon runs, then goes away the first time you press the combo, leaving the key available to your programs.

Performance of Javascript (Binary) Byte Arrays in Modern Browsers

April 18th, 2011 by kanaka

A little over a year ago I started the noVNC project, an HTML5 VNC client. noVNC does a lot of processing of binary byte array data and so array performance is a large predictor of overall noVNC performance. I had high hopes that one of the new binary byte array data types accessible to Javascript (in modern browsers) would give noVNC a large performance boost. In this post I describe some of my results from testing these binary byte array types.

After reading the title, you may have thought: "Wait ... Javascript doesn't have binary byte arrays." Actually, not only does Javascript have access to binary byte arrays but there are two unique variants available (technically neither is part of ECMAScript yet).


The Options


Typed Arrays:


Those who follow browser development and HTML standardization may already be aware of one of these array types. ArrayBuffers (technically: Typed Arrays) are a required part of the proposed WebGL and File API standards. To use an ArrayBuffer as a byte array you create a Uint8Array view of the ArrayBuffer.

The following Javascript creates an ArrayBuffer view that contains 1000 unsigned byte elements that are initialized to 0:

var arr = new Uint8Array(new ArrayBuffer(1000));


ImageData arrays:


But there is an older and more widely supported form of binary byte arrays available to Javascript programs: ImageData. ImageData is a data type that is defined as part of the 2D context of the Canvas element. ImageData is created whenever the getImageData or createImageData method is invoked on a Canvas 2D context. The "data" attribute of an ImageData object is a byte array that is 4 times larger than the width * height requested (4 bytes of R,G,B,A for each pixel).

The following Javascript creates an ImageData byte array with 1000 unsigned byte elements that are initialized to 0:

var ctx = document.getElementById('canvas').getContext('2d'),
    arr = ctx.createImageData(25, 10).data;


Traditional Solutions:


There are two traditional ways of representing binary byte data in Javascript. The first is with a normal Javascript array where every element of the array is a number in the range 0 through 255. The second method is using a string in which the values 0 through 255 are stored as Unicode characters in the string and read using the charCodeAt method. For this post I'm going to ignore the string method since Javascript strings are immutable and updating a single character in a Javascript string implies reconstructing the whole string, which is both unpleasant and slow.

The following creates a normal Javascript array with 1000 numbers that are initialized to 0:

var arr = [], i;
for (i = 0; i < 1000; arr[i++] = 0) {}


The Bad News


We are left with three methods for representing binary byte data: normal Javascript arrays, ImageData arrays, and ArrayBuffer arrays. One might expect that since ImageData and ArrayBuffer arrays are fixed sized, have elements with a fixed type, and are used for performance sensitive operations (2D canvas and WebGL) that the performance of these native byte arrays would be better than normal Javascript arrays for most operations. Unfortunately, as of today, most operations are slower when using these byte array types.


Testing


Originally I planned to show the performance numbers comparing browsers on Linux and Windows. However, I discovered that there is very little difference (for these array tests) between the same version of a browser running on Windows vs Linux. Since all the Linux browsers also run on Windows (but not vice versa) I have limited the performance results to Windows.

For this post I have hacked together four quick tests to compare normal Javascript arrays with ImageData and ArrayBuffer arrays. All the tests use arrays that contain 10240 (10 * 1024) elements and repeat the operation being tested many times in order to push the test times into a more easily measured and comparable range. Each test is also run 10 times (iterations) and the mean and standard deviation across all 10 iterations are calculated.

You can run tests yourself by cloning the noVNC repository and loading the tests/arrays.html page. These test results are based on revision bbee8098 of noVNC. Running the test in a browser will output JSON data in the results textarea. This JSON data can then be combined with JSON data from other browser results and run through the utils/json2graph.py python script which uses the matplotlib module to generate the graphs.

The test machine has the following specifications:

  • Acer Aspire 5253-BZ893
  • AMD Dual-Core C50 at 1GHz
  • 3GB DDR3 Memory
  • AMD Radeon HD 6250
  • Windows 7

Here are the main browsers that were tested:

In addition, older browser versions were also tested to see if the browsers are making progress:

  • Chrome 9.0.597.98
  • Chrome 10.0.648.204
  • Chrome 11.0.673.0 (build 75038)
  • IE 9.0 Platform Preview 7
  • Firefox 3.6.13
  • Firefox 3.6.16
  • Firefox 4.0 beta 11

Please note that I am not a professional performance tester so I probably haven't made use of optimal testing techniques and there is certainly a possibility that I have made mistakes that invalidate some or all of the numbers. I welcome constructive criticism and dialog so that I can expand and improve on these results in the future.


The Four Tests:

  1. create - For each test iteration, an array is created and then initialized to zero and this is repeated 2000 times.
  2. randomRead - For each test iteration, 5 million reads are issued to pseudo-random locations in an array.
  3. sequentialRead - For each test iteration, 5 million reads are issued sequentially to an array. The reads loop around to the beginning of the array when they reach the end of the array.
  4. sequentialWrite - For each test iteration, 5 million updates are made sequentially to an array. The writes loop around to the beginning of the array when they reach the end of the array.
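As a rough illustration of the shape of these tests, here is a minimal sketch of the sequential read case with the mean and standard deviation calculated across iterations. This is not the code from tests/arrays.html; the function names are hypothetical, and the use of the population standard deviation is an assumption since the post does not say which form was used.

```javascript
var ARRAY_SIZE = 10 * 1024;

// Normal Javascript array initialized to 0.
function makeNormalArray(size) {
  var arr = [], i;
  for (i = 0; i < size; i++) { arr[i] = 0; }
  return arr;
}

// Uint8Array view of an ArrayBuffer (already zero-filled on creation).
function makeTypedArray(size) {
  return new Uint8Array(new ArrayBuffer(size));
}

// Issue `reads` sequential reads, wrapping to index 0 at the end of the array.
function sequentialRead(arr, reads) {
  var sum = 0, i, j = 0;
  for (i = 0; i < reads; i++) {
    sum += arr[j];
    j++;
    if (j >= arr.length) { j = 0; }
  }
  return sum;
}

// Run fn `iterations` times and return the elapsed milliseconds of each run.
function timeIterations(fn, iterations) {
  var times = [], i, start;
  for (i = 0; i < iterations; i++) {
    start = Date.now();
    fn();
    times.push(Date.now() - start);
  }
  return times;
}

function mean(xs) {
  var total = 0, i;
  for (i = 0; i < xs.length; i++) { total += xs[i]; }
  return total / xs.length;
}

// Population standard deviation (an assumption, see the lead-in).
function stddev(xs) {
  var m = mean(xs), total = 0, i;
  for (i = 0; i < xs.length; i++) { total += (xs[i] - m) * (xs[i] - m); }
  return Math.sqrt(total / xs.length);
}

var READS = 5000000, ITERATIONS = 10;
var candidates = {
  "normal array": makeNormalArray(ARRAY_SIZE),
  "ArrayBuffer (Uint8Array)": makeTypedArray(ARRAY_SIZE)
};
Object.keys(candidates).forEach(function (name) {
  var times = timeIterations(function () {
    sequentialRead(candidates[name], READS);
  }, ITERATIONS);
  console.log(name + ": mean " + mean(times).toFixed(1) +
              " ms, stddev " + stddev(times).toFixed(1) + " ms");
});
```

Timings from a sketch like this will vary by engine and machine, so treat any numbers it prints as indicative only.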


Test Results:


First let's take a look at how the different array types perform at the different tests.

Create Test Results

This is the only test where ImageData and ArrayBuffer arrays have a significant performance advantage because they are automatically initialized to 0 when created. IE 9 and Opera do not currently support ArrayBuffer arrays.

Random Read Test Results

Chrome and Opera have the best overall performance although Opera does not support ArrayBuffer arrays yet. Firefox has particularly bad random read performance across the board. The results show that there is little advantage to using ImageData or ArrayBuffer arrays for random reads and their performance in Chrome and Opera is significantly slower.

Sequential Read Test Results

For the sequential read test the situation is quite different. Firefox has consistent and leading performance across all array types. Chrome has an order of magnitude worse performance for ImageData and ArrayBuffer arrays. Opera 11 shows a 3X fall in performance for ImageData arrays compared to normal Javascript arrays. Normal arrays are still the best choice overall.

Sequential Write Test Results

The relative results of the sequential write test are very similar to those for sequential reads, with a slowdown across the board. Firefox again shows comparable performance across all three array types. Opera 11 continues to show a 3X fall in performance with ImageData arrays. Chrome continues to show an order of magnitude speed difference between normal arrays and the binary arrays.


Now let's slice the data differently to see how the different browsers compare across the different array types.

Normal Array Test Results

Chrome is the best overall performer here with Opera pulling a close second. However, the most notable result in this view is the terrible performance of Firefox random reads. Given the huge amount of jitter in the Firefox result compared to the others, my guess is this is a degenerate case and that Mozilla has some low hanging fruit here.

ImageData Test Results

Opera is now the overall performance winner with Chrome pulling a close second. The Firefox problem with random reads continues to show up with ImageData arrays (although this time without the jitter). Excluding the random read result, Firefox would be the clear winner. IE 9 has a good showing here coming in a close third overall.

ArrayBuffer Test Results

The pack thins out significantly since only Chrome and Firefox support ArrayBuffers. Once again Firefox shows pessimal random read performance. With that result excluded (or fixed), Firefox would be the clear winner against Chrome.


Test Result Summary:

  • Chrome has the best overall performance for normal arrays.
  • Opera has the best overall performance for ImageData arrays with Chrome a close second.
  • Firefox has good performance except for random reads where performance drops off a cliff on all array types.


Browser Improvements/Regressions


Now we will compare some older browser versions to see if the browser vendors are making progress over time in improving the performance of the binary byte array types.


Firefox


Normal Array Test Results for Firefox

Firefox mostly shows steady improvement for normal arrays, but once again the terrible random read performance rears its head in the shift from 4.0 beta 11 to the 4.0 release version.

ImageData Test Results for Firefox

Again, Firefox mostly shows steady improvement for ImageData arrays. This time the terrible random read performance was introduced somewhere between the Firefox 3 and Firefox 4 code bases.

ArrayBuffer Test Results for Firefox

Only Firefox 4 supports ArrayBuffer arrays. The awful random read performance still exists.


Chrome


Normal Array Test Results for Chrome

No strong trends appear in the Chrome data for normal arrays. The array create speed shows a significant dropoff in Chrome 12. For sequential writes there was a 2X regression for Chrome 10 and 11.

ImageData Test Results for Chrome

There appears to be a significant regression in Chrome 12 related to ImageData performance. The size of the dropoff (3X to 6X) and the significant jitter indicate to me that there is an obvious problem that should be fixed.

ArrayBuffer Test Results for Chrome

There are no strong trends in Chrome ArrayBuffer array performance although there seems to be a weak trend towards worse performance.


IE 9


Normal Array Test Results for Internet Explorer

ImageData Test Results for Internet Explorer

The final release of IE 9 shows a huge performance decrease compared to Platform Preview 7. If Microsoft is able to recover this performance in a subsequent release, it would significantly change the standing of IE 9 relative to the other modern browsers.


Final Thoughts


ImageData and ArrayBuffer arrays have different performance characteristics within the same browsers. I'm not sure why this should be the case. In fact, I would recommend that the WHATWG/W3C and browser vendors standardize on ArrayBuffers for both purposes. This could be done by adding an additional attribute to the ImageData object, perhaps named 'buffer'. The new 'buffer' attribute would be a generic ArrayBuffer containing the image data memory. The existing 'data' attribute would become a Uint8Array view of the ArrayBuffer (this would maintain backwards compatibility). In addition to code consolidation within the browsers (and one place to focus optimization effort), this change would allow developers to create a Uint32Array view of the buffer, which would allow whole pixel updates (3 colors + alpha) with one operation.
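The "two views over one buffer" idea is easy to try outside the browser. The following is only an analogy in Python, not the proposed browser API: a bytearray stands in for the image memory, one memoryview plays the role of the byte-wise 'data' attribute, and a second view cast to 32-bit elements plays the role of the Uint32Array. The names pixels, byte_view, pixel_view and pack_rgba are made up for this sketch, and it assumes the platform's unsigned int is 4 bytes wide.

```python
import sys

# Four RGBA pixels stored as raw bytes, one byte per channel.
pixels = bytearray(4 * 4)

byte_view = memoryview(pixels)     # one element per color channel
pixel_view = byte_view.cast("I")   # one element per pixel (assumes 4-byte unsigned int)


def pack_rgba(r, g, b, a):
    """Pack four channel bytes into one integer in the machine's byte order."""
    return int.from_bytes(bytes([r, g, b, a]), sys.byteorder)


# A single assignment updates all four channels of pixel 2.
pixel_view[2] = pack_rgba(255, 128, 0, 255)
```

Both views share the same memory, so the write through pixel_view is immediately visible through byte_view. Packing the channels in the machine's native byte order is what keeps them in R, G, B, A order in memory.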

Using ImageData and ArrayBuffer (typed array view) arrays will generally not give better performance for binary byte data than just using normal Javascript arrays. This is unfortunate since these binary array types exist specifically to serve performance sensitive functionality (2D and 3D graphics). It is also surprising since they have a fixed size and a fixed element type which in theory should allow faster read and write access to the elements. I suspect (and hope) that this performance problem is due to the fact that not enough optimization effort has been applied by any of the browser vendors to these binary arrays.


Requests:

  • Mozilla, Google (and Apple), Microsoft and Opera: please spend some effort to optimize your Javascript binary array types!
  • Microsoft and Opera: it would be nice if you would implement WebGL. But if not, please at least implement typed array (ArrayBuffer) support since it stands on its own and will likely be used in the near future in other places where it makes sense, such as FileReader objects and the WebSocket API to support binary data.


Followup Posts:


The browser wars are back and new browser versions are being released every few weeks. My plan is to continue updating these tests to include the most recent browser versions. I would also like to expand the tests to include a random write test and also to test the random and sequential read performance of binary data stored in Javascript strings. Stay tuned.


Weekend hack: multitouch drawing on iPhone

January 10th, 2011 by Chouser

I played around with the multitouch browser API and the HTML5 canvas tag this weekend, and came up with a little drawing page for the iPhone. The slower you drag your finger across the screen, the fatter the line will be. If you use multiple fingers at once, each should produce its own line with its own color. Note that you must use a touch-screen device; there's no mouse support at all. Also, there's no way to control the color yourself, and the only way to clear the screen is to reload the page.

The screenshot on the right was taken after clicking on the little "plus" button in Safari and adding the page to my "home screen", then launching the page from there. This gets rid of the URL bar, so that you can have more empty black space above your carefully drawn portraits.

I only tested it on my iPhone 3G, but it ought to work on any iOS device. Please comment if you notice it fails on an Apple touch device or if it works on anything else.

I count the project as a success because my three-year-old now prefers this to the little drawing app she used to use on my phone, and mine of course doesn't have any ads.

Feel free to use the 110 or so lines of code however you'd like.

Update: I received a report that it works on the Droid.

How to capture httplib2 debug in a threaded app

August 29th, 2010 by agriffis

A couple months ago I blogged about my frustration with httplib logging. Andrew Dalke left a comment suggesting that I should replace sys.stdout, something I hadn't considered as a possibility. His suggestion sent me googling, which turned up this old email from Guido. Add threading.local and we have a solution!

What we need is a duck-typed replacement for sys.stdout that behaves like a writeable file, but also provides the ability to capture to thread-local storage. One of the ways to use threading.local is to subclass it. An instance of this subclass will have per-thread attributes, even if the instance itself is common to multiple threads.
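As a minimal sketch of that per-thread behavior (written for Python 3, with the hypothetical names PerThreadState, shared and worker), each thread below sees its own copy of the label attribute even though all of them use the same object:

```python
import threading


class PerThreadState(threading.local):
    def __init__(self):
        # Runs once in each thread that touches the instance.
        self.label = "unset"


shared = PerThreadState()   # one object, visible to every thread
shared.label = "main"
observed = {}


def worker(name):
    # A new thread starts from a fresh __init__, not from the main thread's value.
    observed[name] = [shared.label]
    shared.label = name
    observed[name].append(shared.label)


threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After the threads finish, the main thread's shared.label is still "main", and neither worker saw the other's value.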

Since StringIO intentionally doesn't implement isatty(), we need to make sure that gets passed through to the underlying file (we do this by catching the exception in __getattr__). And since we like seeing HTTP transactions when we're debugging, we include a writethrough mode that provides simultaneous capture and print.

import cStringIO, threading

class LocalCapturingWriter(threading.local):
    def __init__(self, fp, writethrough=False):
        self.__dict__['_fp'] = fp
        self.__dict__['_stringio'] = None
        self.__dict__['_writethrough'] = writethrough

    def start_capture(self):
        self.__dict__['_stringio'] = cStringIO.StringIO()

    def stop_capture(self):
        v = self._stringio.getvalue()
        self._stringio.close()
        self.__dict__['_stringio'] = None
        return v

    def write(self, s):
        if self._stringio:
            result = self._stringio.write(s)
        if not self._stringio or self._writethrough:
            result = self._fp.write(s)
        return result

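    def __getattr__(self, name):
        # Only called when normal lookup fails. Prefer the capture buffer,
        # but fall back to the real file for anything it lacks (isatty).
        if self._stringio is not None:
            try:
                return getattr(self._stringio, name)
            except AttributeError:
                pass
        return getattr(self._fp, name)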

    def __setattr__(self, name, value):
        if self._stringio:
            setattr(self._stringio, name, value)
        if not self._stringio or self._writethrough:
            setattr(self._fp, name, value)

And here's how to use it. First, the global settings:

import httplib2, sys

httplib2.debuglevel = 1

sys.stdout = LocalCapturingWriter(sys.stdout)

Then the code that runs in a thread to capture the debugging output. This will work as expected even in multiple threads simultaneously.

sys.stdout.start_capture()
try:
    response, content = \
        httplib2.Http().request("http://n01se.net")
finally:
    debug_trace = sys.stdout.stop_capture()

# Note that httplib2 doesn't include the content in its
# debug output.
debug_trace += "content: %r\n" % content
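To check the multi-threaded claim without a network connection, here is a self-contained sketch for Python 3. ThreadLocalCapture is a simplified, hypothetical stand-in for the class above (no writethrough, no attribute forwarding), and print() stands in for httplib2's debug output. Both are assumptions made for illustration.

```python
import io
import sys
import threading


class ThreadLocalCapture(threading.local):
    """Simplified stand-in for LocalCapturingWriter (illustration only)."""

    def __init__(self, fp):
        # Re-run in every thread, so each thread gets its own empty state.
        self.fp = fp
        self.buf = None

    def start_capture(self):
        self.buf = io.StringIO()

    def stop_capture(self):
        text, self.buf = self.buf.getvalue(), None
        return text

    def write(self, s):
        target = self.buf if self.buf is not None else self.fp
        return target.write(s)

    def flush(self):
        self.fp.flush()


def capture_in_threads(names):
    """Run one printing worker per name and return {name: captured text}."""
    results = {}
    real_stdout = sys.stdout
    sys.stdout = ThreadLocalCapture(real_stdout)

    def worker(name):
        sys.stdout.start_capture()
        try:
            for i in range(3):
                print("%s line %d" % (name, i))
        finally:
            results[name] = sys.stdout.stop_capture()

    try:
        threads = [threading.Thread(target=worker, args=(n,)) for n in names]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        sys.stdout = real_stdout   # always restore the real stream
    return results
```

Calling capture_in_threads(["alpha", "beta"]) returns one trace per thread, each containing only that thread's lines.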

We're now using this in our Django app, with a custom Exception class (to hold the captured trace) and middleware that knows to look for it. The end result is that every time an exception occurs due to a problem talking to a backend server, the exception email includes the httplib2 trace. Yeah!
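The exception and middleware pair might look roughly like the sketch below. It is not our production code: BackendError, TraceReportingMiddleware and the send_report callback are hypothetical names, the email step is reduced to a plain callable, and the wiring into Django's middleware settings is omitted. The only Django convention it relies on is the process_exception(request, exception) hook that middleware can define.

```python
class BackendError(Exception):
    """Hypothetical exception that carries the captured httplib2 trace."""

    def __init__(self, message, debug_trace=""):
        super().__init__(message)
        self.debug_trace = debug_trace


class TraceReportingMiddleware:
    """Hypothetical middleware that adds the trace to the error report."""

    def __init__(self, send_report=print):
        # Stand-in for whatever helper actually sends the exception email.
        self.send_report = send_report

    def process_exception(self, request, exception):
        trace = getattr(exception, "debug_trace", None)
        if trace:
            self.send_report("%s\n\nhttplib2 trace:\n%s" % (exception, trace))
        return None   # fall through to the normal error handling
```

Code that talks to a backend would then wrap its failures, for example `raise BackendError("backend request failed", debug_trace)`, and exceptions without a trace pass through untouched.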

P.S. I wrote this entry less than a week after my previous entry, but then went on vacation and never managed to get it posted. Sorry for the delay...

Python’s httplib uses print for debugging. Oh, it hurts…

July 4th, 2010 by agriffis

At work we have a production site that uses httplib (via httplib2) on the server to communicate with internal servers using a RESTful API. When something doesn't work as expected in this process, we like to know about it, so our app sends email with the exception traceback and whatever relevant data we can pull together.

One of the pieces of data I'd like to add to the email is the conversation between our server and the internal servers. On a development server, this is easy: Set httplib2.debuglevel=1 and watch the HTTP conversations scroll past on stdout.

On a staging or production server, one quickly discovers a crippling mistake made by the httplib authors: the library uses Python's "print" for debugging!

If the application were single-threaded, we could capture the trace by temporarily redirecting sys.stdout to an instance of StringIO (maybe using a context manager). Sure, it's more load on the server to capture the debug on every transaction, but I'll gladly pay that price for the hours we'll save when something goes wrong and we have the ability to debug it.

But it doesn't matter, because we don't have this option. Our app is multi-threaded and sys.stdout is global. We would have to serialize our HTTP transactions to prevent traces from being mixed together, or fork to isolate sys.stdout. These aren't realistic approaches.
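For reference, the single-threaded redirect mentioned above can be sketched as follows (Python 3). The names captured_stdout and noisy_request are made up for this example, and noisy_request merely imitates a library that prints its debug trace. Because the swap replaces the one global sys.stdout, output from any other thread would land in the same buffer, which is exactly the problem.

```python
import contextlib
import io
import sys


@contextlib.contextmanager
def captured_stdout():
    """Temporarily replace the global sys.stdout with a StringIO buffer."""
    saved = sys.stdout
    buf = io.StringIO()
    sys.stdout = buf
    try:
        yield buf
    finally:
        sys.stdout = saved   # restore even if the body raises


def noisy_request():
    """Stand-in for a library call that prints its debug trace."""
    print("send: GET / HTTP/1.1")
    print("reply: HTTP/1.1 200 OK")
    return "body"


with captured_stdout() as buf:
    content = noisy_request()
debug_trace = buf.getvalue()
```

Newer Python versions (3.4 and later) ship contextlib.redirect_stdout, which does the same swap and has the same global effect.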

This sort of unfortunate shortcoming is to be expected in add-on libraries. After all, part of the reason they're not included with Python is that they don't necessarily meet the quality requirements of the core distribution. But I'm caught off guard to find such an obvious shortcoming in the Python standard library. One of the things I'd hoped to assume by using the standard library is that I can trust the quality of the implementation, but a discovery like this forces me to question that assumption.

I'm pretty new to Python, so maybe I'm missing something. Is httplib a particularly poor example of the Python standard library? The existence of httplib2 seems to imply that (and also seems to imply that it's hard to get problems fixed in the core distribution). Maybe I need to find an add-on networking library that ignores httplib entirely...?

a common css mistake

May 14th, 2010 by agriffis

I'm playing with febootstrap this evening, part of a continuing migration away from Ubuntu toward Debian and Fedora. I googled my way to Rich's febootstrap page. On my browser, it looks like this:

The problem is that the page's css has the following rules:

h1, h2, h3, h4 {
  color: #333;
}

pre {
  background-color: #fcfcfc;
}

Both these rules assume the usual black-on-white color scheme, but I'm using a gtk theme with a dark background and light foreground, which firefox respects. (Chromium doesn't have the issue because it enforces black-on-white defaults regardless of the gtk theme. I'm not sure how I feel about that; it fixes the problem but at the expense of ignoring my theme.)

This isn't to say anything bad about febootstrap, of course. I'm thrilled that somebody has finally written for Fedora what the excellent debootstrap has been providing for Debian and Ubuntu users for years! But I've come across this mistake enough times that Rich's site drew the unlucky number. ;-)