operator_overloading_functor.py
Above is the result of my first implementation of __call__ in the Rusthon JavaScript backend. Why is it so much slower than regular CPython? The Chrome V8 JIT is very sensitive to dynamic behavior, and it cannot optimize a function that is too dynamic. In this implementation, the method __call__ becomes a nested function in the class constructor, which dynamically rebinds its calling context to this, copies all attributes and methods from this to __call__, and then returns __call__. See my commit here that enabled this syntax.
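To make the steps above concrete, here is a minimal hand-written sketch of that first strategy. It is not the code Rusthon actually emits: the class name `SlowFunctor`, the `offset` attribute, the `describe` method, and the use of `bind` for the rebinding are all assumptions made for illustration.

```javascript
// Hypothetical sketch of the first strategy; not Rusthon's real output.
function SlowFunctor(offset) {
  // Initialize the ordinary instance first.
  this.__init__(offset);

  const self = this;
  // Nested function whose calling context is rebound to the instance.
  const __call__ = function (x) {
    return x + this.offset;
  }.bind(self);

  // Copy every enumerable attribute and method (own and inherited)
  // from the instance onto the function object.
  for (const key in self) {
    __call__[key] = self[key];
  }

  // Returning an object from a constructor replaces `this` as the result.
  return __call__;
}

SlowFunctor.prototype.__init__ = function (offset) {
  this.offset = offset;
};

SlowFunctor.prototype.describe = function () {
  return "offset=" + this.offset;
};

const slow = new SlowFunctor(3);
console.log(slow(2), slow.describe());
```

Every construction rebinds a fresh closure and then grows the function object property by property, which is the kind of dynamic behavior described above.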
Above is the result of my second implementation. The difference is huge. What did I change? Instead of running this.__init__(args) in the constructor, the __init__ method (and all other methods) gets attached to the nested function __call__ without being rebound. Then, when __call__.__init__(args) is called, the calling context this is already __call__, so we can skip copying values from this. This lets the V8 JIT see that __call__ is not so dynamic, and optimize it however it can. The only performance question now is: what is the overhead of returning a function object, rather than a normal object, from the class constructor? Let's compare to another version of the benchmark where __call__ is replaced by a normal method named call.
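The second strategy can be sketched the same way. Again this is an illustration under assumptions, not Rusthon's generated code: `FastFunctor`, `offset`, and `describe` are hypothetical names, and having the body of `__call__` read its state through the closure variable is my own guess at one way to avoid rebinding.

```javascript
// Hypothetical sketch of the second strategy; not Rusthon's real output.
function FastFunctor(offset) {
  // The nested function reads its state from itself; no rebinding needed.
  const __call__ = function (x) {
    return x + __call__.offset;
  };

  // Attach the methods directly to the function object, unbound.
  __call__.__init__ = FastFunctor.prototype.__init__;
  __call__.describe = FastFunctor.prototype.describe;

  // Invoked as a method of __call__, so `this` inside __init__ is already
  // __call__ and nothing has to be copied afterwards.
  __call__.__init__(offset);
  return __call__;
}

FastFunctor.prototype.__init__ = function (offset) {
  this.offset = offset;
};

FastFunctor.prototype.describe = function () {
  return "offset=" + this.offset;
};

const fast = new FastFunctor(3);
console.log(fast(2), fast.describe());
```

The function object receives the same properties in the same order on every construction, and there is no copy loop over an intermediate instance.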
operator_overloading_nonfunctor.py
Above is the result of the same benchmark without the special method __call__; here it is replaced by a normal method, call. The class constructor returns a regular object, not a function object like above. This is less dynamic, and the V8 JIT is able to optimize it even more. In conclusion, there is indeed a price to pay for having callable objects: in this benchmark they are about six times slower.
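For readers who want to try the comparison themselves, here is a rough sketch that times a callable function object against a plain object with a call method. It is not the benchmark used for the results above: `makeFunctor`, `PlainAdder`, and `timeLoop` are hypothetical names, and the timings will vary with the engine and machine.

```javascript
// Hypothetical stand-ins for the functor and non-functor variants.
function makeFunctor(offset) {
  const __call__ = function (x) {
    return x + __call__.offset;
  };
  __call__.offset = offset;
  return __call__; // a function object carrying its own state
}

class PlainAdder {
  constructor(offset) {
    this.offset = offset;
  }
  call(x) {
    return x + this.offset; // ordinary method on a regular object
  }
}

// Feed the running total back through `step` and record elapsed time.
function timeLoop(step, iterations) {
  const start = Date.now();
  let total = 0;
  for (let i = 0; i < iterations; i++) {
    total = step(total);
  }
  return { total: total, ms: Date.now() - start };
}

const functor = makeFunctor(1);
const plain = new PlainAdder(1);
const viaFunctor = timeLoop((x) => functor(x), 1000000);
const viaMethod = timeLoop((x) => plain.call(x), 1000000);
console.log("functor:", viaFunctor.ms, "ms; method:", viaMethod.ms, "ms");
```

A loop this small mostly measures call overhead, so treat it as a starting point rather than a reproduction of the numbers in this post.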
What about Brython?
As you have seen above, the implementation details of a transpiler can have a huge impact on performance. If it is not carefully designed, tuned and tested, performance can quickly become many times slower than CPython. Let's take a look at a project that made all the wrong decisions: Brython. By wrong decisions, I mean trying to emulate all the dynamic features of CPython in JavaScript.
Above are the absolutely terrible results of Brython running the same benchmark. In this test, Brython is 80 times slower than CPython, and 88 times slower than Rusthon.