Wednesday, September 23, 2015

Benchmarks updated


add.py

fannkuch.py

fannkuch old version vs fast setslice

old profile
  • __jsdict_pop replaced by direct call to __array_insert
  • __array_getslice replaced by __array_getslice_lowerstep: 200ms faster
  • __array_setslice 90ms faster

float.py

nbody.py

operator_overloading_functor.py

pystone.py

rusthon_0.9.3_all.deb

Monday, September 21, 2015

Operator Overloading Benchmark


JavaScript has no support for operator overloading, and this is generally a good thing, because in languages that do support it, programmers often abuse it and create code that is harder to read. The only real exception is math libraries, where operator overloading for vector, matrix, and complex number classes can be very helpful and make code more readable.

Emulating operator overloading in JavaScript has a very high performance overhead, so much so that it is simply unusable if used the traditional way, with no extra syntax. Both Rusthon and RapydScript, after careful consideration, have adopted many of the same optimization patterns, and the first is no direct support for operator overloading.

Brython, on the other hand, has attempted to fully support all of the dynamic features of Python, including operator overloading. As you will see in this benchmark, the price is very fucking high: Brython is 50 times slower than regular CPython3, and 295 times slower than Rusthon.

operator_overloading.py

The Brython FAQ states that Brython is "somewhere between 3 to 5 times slower". Nice try guys, that should be updated to say something like: "Brython is a toy, and cannot actually be used for anything".

with oo:

def benchmark(n):
 a = [ Vector(i*0.09,i*0.05, i*0.01) for i in range(n)]
 b = [ Vector(i*0.08,i*0.04, i*0.02) for i in range(n)]
 c = []
 d = []
 for j in range(n):
  with oo:
   u = a[j]
   v = b[j]
   c.append( u+v )
   d.append( u*v )
 return [c,d]

Above is a snippet of code from the benchmark. The special syntax with oo: enables operator overloading for just those statements under it in Rusthon. If you hate this syntax, you can also use with operator_overloading:, or forgo operator overloading and directly use the method names like this:

def benchmark(n):
 a = [ Vector(i*0.09,i*0.05, i*0.01) for i in range(n)]
 b = [ Vector(i*0.08,i*0.04, i*0.02) for i in range(n)]
 c = []
 d = []
 for j in range(n):
  u = a[j]
  v = b[j]
  c.append( u.__add__(v) )
  d.append( u.__mul__(v) )
 return [c,d]
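
Neither snippet shows the Vector class itself. As a rough sketch only, assuming a plain three-component vector with component-wise __add__ and __mul__ (the class body, attribute names x, y, z, and the component-wise multiply are my guesses here, not the benchmark's actual source), it could look like this in regular Python:

```python
class Vector:
    """Hypothetical 3-component vector; the benchmark's real class is not shown."""

    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

    def __add__(self, other):
        # Component-wise addition, reached by `u + v` or by `u.__add__(v)`.
        return Vector(self.x + other.x, self.y + other.y, self.z + other.z)

    def __mul__(self, other):
        # Component-wise multiplication (an assumption; the original may differ).
        return Vector(self.x * other.x, self.y * other.y, self.z * other.z)


u = Vector(1.0, 2.0, 3.0)
v = Vector(4.0, 5.0, 6.0)
s = u.__add__(v)  # direct method call, no operator dispatch needed
p = u.__mul__(v)
```

In CPython both spellings give the same result. The direct method call is the form that needs no operator emulation when transpiled.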

Functor Benchmark


operator_overloading_functor.py

Above is the result of my first implementation of __call__ in the Rusthon JavaScript backend. Why is it so much slower than regular CPython? The Chrome V8 JIT is very sensitive to dynamic behavior, and it will not be able to optimize a function if it is too dynamic. What is happening in this implementation is the method __call__ becomes a nested function in the class constructor, which dynamically rebinds its calling context to this, copies all attributes and methods from this to __call__, and then returns __call__. See my commit here that enabled this syntax.

Above is the result of my second implementation. The difference is huge, so what did I change? Instead of running this.__init__(args) in the constructor, the __init__ method (and all other methods) gets attached to the nested function __call__ and is not rebound. Then, when calling __call__.__init__(args), the calling context of this is already __call__, and so we can skip copying values from this. This allows the V8 JIT to know that __call__ is not so dynamic, and to optimize it however it can. The only performance question now is, what is the overhead of returning a function object, rather than a normal object, from the class constructor? Let's compare to another version of the benchmark where __call__ gets replaced by a normal method named call.

operator_overloading_nonfunctor.py

Above is the result of the same benchmark but without using the special method __call__; here it is replaced by a normal method call. The class constructor returns a regular object, not a function-object like above. This is less dynamic, and the V8 JIT is able to optimize it even more. In conclusion, there is indeed a price to pay for having callable objects: they are about six times slower.
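
To make the two shapes being compared concrete, here is a small sketch in regular Python. The class names Functor and NonFunctor and the timing loop are made up for illustration and are not the contents of the benchmark files. Timings from this under CPython say nothing about the V8 results above; it only shows the difference between a callable object and an object with a normal method.

```python
import timeit


class Functor:
    """Callable object: instances are invoked like functions via __call__."""

    def __init__(self, scale):
        self.scale = scale

    def __call__(self, x):
        return x * self.scale


class NonFunctor:
    """Same behavior, exposed through an ordinary method named call."""

    def __init__(self, scale):
        self.scale = scale

    def call(self, x):
        return x * self.scale


f = Functor(3)
g = NonFunctor(3)


def run_functor(n=10000):
    total = 0
    for i in range(n):
        total += f(i)  # goes through __call__
    return total


def run_nonfunctor(n=10000):
    total = 0
    for i in range(n):
        total += g.call(i)  # plain method lookup and call
    return total


if __name__ == "__main__":
    print("functor    ", timeit.timeit(run_functor, number=100))
    print("non-functor", timeit.timeit(run_nonfunctor, number=100))
```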

What about Brython?

As you have seen above, the implementation details of a transpiler can have a huge impact on performance. If not carefully designed, tuned and tested, performance can quickly become many times slower than CPython. Let's take a look at a project that made all the wrong decisions, Brython. By wrong decisions, I mean trying to emulate all the dynamic features of CPython in JavaScript.

Above are the absolutely terrible results of Brython running the same benchmark. In this test, Brython is 80 times slower than CPython, and 88 times slower than Rusthon.

Sunday, September 20, 2015

Benchmarks2


What is killing RapydScript in the Fannkuch benchmark? I am not sure. Looking over the output from RapydScript, it looks almost the same as Rusthon, except for the way it performs array slicing. Could that be the cause of the slowdown? See the transpiled code here: RapydScript output and Rusthon output


fannkuch.py

add.py

float.py

What about Brython?

Brython is another Python to JavaScript transpiler that claims to be fast. See their FAQ here, where they claim it is "somewhere between 3 to 5 times slower"... this is complete bullshit, nice try guys. I wanted to include Brython in these benchmarks, but Brython is so fucking slow, it is not even practical to run benchmarks on it. I ran the fannkuch benchmark using Brython, and waited, and waited, and after a really long time, it finally finished, with an average time of 170 seconds. Recall that Python3 completes fannkuch in 0.3 seconds. Brython is about 560 times slower than regular CPython3, and 850 times slower than Rusthon.

update: I ran the Fannkuch benchmark again in Brython, this time unchecking the "debug" box, but the results were not much better: Brython still takes 150 seconds. That makes it 500 times slower than CPython3.

First Benchmarks



add.py

fannkuch.py

float.py

Tuesday, September 15, 2015

Rusthon 1.0 Release Coming Soon


After almost two years of development, Rusthon is nearing the stable 1.0 release. The project originally began with my fork of PythonScript by Amirouche, who had written a two-pass transpiler that was easy to understand and modify. PythonScript was then renamed to PythonJS, and optimized for speed. Early benchmark tests showed that many dynamic pythonic features would have to be thrown away, because of the high performance overhead. The minimal Python syntax that evolved over several months ended up having many of the same behaviors and rules that Alexander Tsepkov's RapydScript also had.

Rusthon is now fully regression tested. You can see the source and output of the tests that transpile and execute without errors below. The syntax has been almost fully unified for each of the backends: the syntax developed for the Go and C++ backends has made its way back into the JavaScript backend to provide the same kind of type safety and new features.

JavaScript Regression Test Results

Native Backend Regression Test Results

Sunday, September 6, 2015

js backend: typed maps


There was a lot of hype about Dart not long ago, but already it has faded away into the trash bin of so many other failed languages. With over 33,000 commits and less than 400 github stars, nobody gives a fuck, nice try Google+.

The Dart developers made some really bad choices early on when implementing hash-maps, like keeping the order of items [fuckup #1], and not optimizing for translation to JavaScript [fuckup #2]. And things get even worse: we can also give up on the Dart VM as part of Chrome in the future, see here.

A major drawback in JavaScript is the lack of typed hash maps, and object keys always being coerced into strings. Objects in JavaScript are unfortunately just associative arrays whose keys are always strings. This can lead to bugs that are hard to track down. Dart tried to solve this problem with compile time type checking, but this fails in the real world where your code is interfacing with huge amounts of external JavaScript which cannot be compile time checked, so you still have to deal with runtime errors.

Typed hashmaps in Rusthon are checked at runtime when you transpile your project without the --release command line option. This allows you to debug your code with runtime errors that make sense, and enforce static types even when working with external JavaScript libraries.

map[K]V{}

Above is the syntax for typed hashmaps in Rusthon. It is directly inspired by the hashmap syntax in Golang, where K is the key type, and V is the value type. Note this is the same syntax used for the Rust, C++ and Go backends. See this example: javascript_typed_dict.md
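
To give a feel for what a runtime-checked typed map means, here is a minimal sketch in regular Python. TypedMap is a hypothetical class written for this post; it is not how Rusthon implements map[K]V{}, it only mimics the idea of rejecting a wrong key or value type with a readable error at the moment of assignment.

```python
class TypedMap(dict):
    """Hypothetical runtime-checked map, loosely mirroring map[K]V{}."""

    def __init__(self, key_type, value_type):
        super().__init__()
        self.key_type = key_type
        self.value_type = value_type

    def __setitem__(self, key, value):
        # Check both types on every assignment and fail with a clear message.
        if not isinstance(key, self.key_type):
            raise TypeError("key %r is not %s" % (key, self.key_type.__name__))
        if not isinstance(value, self.value_type):
            raise TypeError("value %r is not %s" % (value, self.value_type.__name__))
        super().__setitem__(key, value)


ages = TypedMap(str, int)  # roughly map[string]int{}
ages["alice"] = 30

try:
    ages[42] = 7  # wrong key type, rejected at runtime
except TypeError as err:
    rejected = str(err)
```

This sketch only guards item assignment; dict methods such as update() bypass __setitem__ and would need the same check in a real implementation.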

FullStack NoSQL on SQL


There are several full-stack frameworks for Python, like Django, Web2Py, and CherryPy. This PDF from the Web2Py docs has a pretty good overview of these frameworks, see web2py_vs_others.pdf. What you will notice with these frameworks is they each have their own fucked up way of embedding code and logic into HTML templates. These huge frameworks try to hide the details of the object relational mapping and database, but in doing so force you to use their HTML template language. Andy Shora has some interesting insights on full-stack dev, check out http://andyshora.com/full-stack-developers.html

Writing a full-stack system in Rusthon is pretty simple. This demo is just 300 lines and includes: backend, database, and frontend. See source code here. The HTML is raw and can be easily integrated with modern solutions like Angular.js. By not having some huge framework in your way, you can directly code the server to send data in the way you want, with whatever ad-hoc protocol you choose. In this demo all data is moved around over a websocket, and both clients are kept in sync with the server and database in realtime.

The Python library Dataset is used to hide all the SQL, and provides a NoSQL-like solution. In combination with property getter/setters on the client side, you never have to worry about directly writing SQL code.
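
The idea of hiding SQL behind property getter/setters can be sketched in a few lines of regular Python. This is an illustration only, not the demo's code: it uses the standard library sqlite3 module in place of Dataset, it puts the properties on the Python side rather than in the browser, and the Settings class, the kv table, and the title attribute are all made-up names.

```python
import sqlite3


class Settings:
    """Hypothetical record whose attribute reads and writes go straight to SQLite."""

    def __init__(self, conn):
        self._conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT)"
        )

    def _get(self, key):
        row = self._conn.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None

    def _set(self, key, value):
        # INSERT OR REPLACE keeps a single row per key.
        self._conn.execute(
            "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)", (key, value)
        )
        self._conn.commit()

    @property
    def title(self):
        return self._get("title")

    @title.setter
    def title(self, value):
        self._set("title", value)


conn = sqlite3.connect(":memory:")
settings = Settings(conn)
settings.title = "hello"  # no SQL visible at the call site
```

Code that uses settings.title never sees a query; the SQL lives in one place behind the property.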