This blog post summarizes a somewhat random excursion into dynamic dispatch implementations in Go, plus a bit of Java. It assumes the audience knows a bit about dynamic dispatch and its compiler implementations (vtable and all that).
Here's what happened -- I stumbled onto an article about Golang (the content is irrelevant); somehow it made me recall that, when I was using Golang for a distributed systems course project, there was something interesting about Golang's interface design. I couldn't remember the details, so I Googled a bit and ran into this StackOverflow answer.
Languages with methods typically fall into one of two camps: prepare tables for all the method calls statically (as in C++ and Java), or do a method lookup at each call (as in Smalltalk and its many imitators, JavaScript and Python included) and add fancy caching to make that call efficient. Go sits halfway between the two: it has method tables but computes them at run time.
Cool, so how does Go compute these method tables at runtime?
Interface values are represented as a two-word pair giving a pointer to information about the type stored in the interface and a pointer to the associated data. Assigning b to an interface value of type Stringer sets both words of the interface value.
This sounds just like the fat pointer used for Rust trait objects. It trades storage and conversion cost for faster method lookup at call sites. I think Haskell typeclasses might fit into this camp as well.
Go's dynamic type conversions mean that it isn't reasonable for the compiler or linker to precompute all possible itables: there are too many (interface type, concrete type) pairs, and most won't be needed. Instead, the compiler generates a type description structure for each concrete type like Binary or int or func(map[int]string). Among other metadata, the type description structure contains a list of the methods implemented by that type. Similarly, the compiler generates a (different) type description structure for each interface type like Stringer; it too contains a method list. The interface runtime computes the itable by looking for each method listed in the interface type's method table in the concrete type's method table. The runtime caches the itable after generating it, so that this correspondence need only be computed once.
Hmm...? I understand that there are dynamic cases where we can't precompute the itable (e.g., casting one interface to another), but always computing these tables at runtime sounds expensive? Yeah, premature optimization is a devil, but this really got me interested.
Then I looked into the second link from the StackOverflow post, and not surprisingly, Mr. Anonymous had the same question, from 14 years ago.
The relatively expensive operation is assignment to an interface value - but even it is pretty cheap. The first time a concrete type hits an interface type, it builds a hash table entry that points to a vtable. Second and subsequent assignments of the same type will do a much cheaper hash lookup to find the vtable.
Why hash-table? It seems unnecessarily more expensive than an alternative implementation.
And yes, exactly:
The compiler knows all the information necessary to build the interface-object involved in the conversion. Because of this, the fact that Go currently performs some runtime computation and caching to create the interface-object seems a little odd.
There's more discussion, and eventually it went off topic, but before that, I got what I needed -- merci, Ian.
gccgo does precompute the tables that it knows that it needs. However, there is also a dynamic component: an arbitrary type may be converted to an arbitrary interface, and in that case the method table must be built at runtime. gccgo does this as well.
These are good sanity checks, but now I want to see this in the compiler source code, and luckily there was a clue from the first article linked in the StackOverflow post. It talks a bit about the algorithmic complexity of building up the itable, and left a no-longer-valid code pointer:
we can build the mapping in O(ni + nt) time instead.
If you click on that link, it takes you to the landing page of Golang's GitHub repo. But if you hover your cursor over that link, you see the hidden clue -- `code.google.com/p/go/source/browse/src/pkg/runtime/iface.c#98` -- and thankfully things didn't move around that much, and I found this `iface.go` in the current repo. I think it became a `.go` file because the compiler bootstraps itself now.
Now things will get a bit more technical.
So I cloned the repo at commit `7f76c00f`, started exploring from `iface.go`, and found this `getitab` function:
Let's start with the function signature:

- `getitab` says "here's an object, get me the itable for this interface"
- `inter *interfacetype` is the interface type
- `typ *_type` is the object type, which makes sense since fetching an itable shouldn't require the object's instance data.

Then notice the super performant and not-confusing double rounds of lookup for the itable; it basically revolves around a `find` method, and a global hashmap from `(interface_type, concrete_type)` to `itable`:
Then comes the itable allocation and initialization; the worth-looking-at bit is this `itabInit` function. It also implements the aforementioned "build the mapping in `O(ni + nt)` time". Lastly, it explains why we have this `if m.Fun[0] != 0 {` check at the call site. These intuitive error-handling mechanisms certainly make the code a pleasure to read:
Finally it adds the new itable to the global cache and releases the lock.
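Since the actual runtime code isn't reproduced here, a simplified model of the whole dance -- hash lookup first, linear-merge construction only on a miss -- might look like the sketch below. All names are made up for illustration; the real runtime uses a lock-free, growable hash table and marks a failed match by setting `Fun[0] = 0` rather than returning nil.

```go
package main

import (
	"fmt"
	"sync"
)

// typePair keys the global cache, standing in for the runtime's
// (interface type, concrete type) pair.
type typePair struct{ iface, concrete string }

// itab stands in for the runtime's itab: the matched method list.
type itab struct{ methods []string }

var (
	mu    sync.Mutex
	cache = map[typePair]*itab{}
)

// getItab models getitab: a cheap cache hit on second and subsequent
// conversions, and an expensive build only on the first one.
func getItab(iface, concrete string, build func() *itab) *itab {
	mu.Lock()
	defer mu.Unlock()
	key := typePair{iface, concrete}
	if t, ok := cache[key]; ok {
		return t
	}
	t := build()
	cache[key] = t
	return t
}

// buildMethodTable models itabInit's O(ni + nt) merge: both method
// lists are sorted by name, so one linear pass over the concrete
// type's methods matches every interface method (or detects a miss).
func buildMethodTable(ifaceMethods, typeMethods []string) []string {
	table := make([]string, 0, len(ifaceMethods))
	j := 0
	for _, m := range ifaceMethods {
		for j < len(typeMethods) && typeMethods[j] < m {
			j++
		}
		if j == len(typeMethods) || typeMethods[j] != m {
			return nil // concrete type is missing an interface method
		}
		table = append(table, typeMethods[j])
	}
	return table
}

func main() {
	tab := getItab("Stringer", "Binary", func() *itab {
		return &itab{methods: buildMethodTable(
			[]string{"String"},
			[]string{"Format", "String"},
		)}
	})
	fmt.Println(tab.methods) // [String]
}
```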
Overall there's nothing surprising; time to look at where `getitab` gets used.

Luckily, there are only 4 places where `getitab` is called directly, and they basically serve 2 language constructs in Go -- type assertion and type switch.
In Go, a type assertion basically does a cast.
These 3 functions in `iface.go` use `getitab`:

- `typeAssert`
- `assertE2I`
- `assertE2I2`
If you do `git grep "typeAssert("`, you will only find 1 declaration and 1 definition. So where is this function invoked? Well, if you do `git grep "\<typeAssert\>"`, that gives you a few places, and the ones that matter are:
cmd/compile/internal/ssagen/ssa.go:144: ir.Syms.TypeAssert = typecheck.LookupRuntimeFunc("typeAssert")
cmd/compile/internal/typecheck/builtin.go:101: {"typeAssert", funcTag, 67},
If you go to `builtin.go`, you'll see that we have a `runtimeDecls` for a bunch of Go functions including `typeAssert`; the name suggests that it enables accessing these functions at runtime rather than compile time. `runtimeDecls` is used in this `initRuntime`, which is pretty self-explanatory:
If you go to `ssa.go` and search for `Syms.TypeAssert`, you should arrive in the midst of the `dottype1` function, where the compiler generates one of 3 calls (each to one of the aforementioned functions from `iface.go`):
Let's dive into these 3 code paths. First, what's this `commaok`? Recall that `assertE2I` panics on failure, whereas `assertE2I2` returns nil. So if `commaok == false`, the type assert ends up panicking at runtime? Aha, `commaok` must mean the `str, ok := obj.(string)` pattern in Go.
If you trace further to see where the value of `commaok` was first set, you'll see that it comes from either `dottype` or `dynamicDottype` -- the name should make sense now: it refers to `x.(T)` in Go, and the "dynamic" part is for the generic case where `T` is a type parameter. If you check the git blame for `dynamicDottype`, you'll see the commit in which Keith Randall implemented the generic `x.(T)` operation. If you are like me and think "hmm, this suggests Go's generics are not like C++ or Rust generics, where the compiler emits new, specialized code upon generics instantiation", you are like 50% right. I'll write another blog on generics implementations.
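For a feel of what the dynamic path handles, here's a toy example of mine where the asserted type is a type parameter, so the target type isn't known until instantiation:

```go
package main

import "fmt"

// first returns the first element of xs whose dynamic type is T.
// Because T is a type parameter, the x.(T) below is the generic
// assertion handled by the dynamicDottype path.
func first[T any](xs []any) (T, bool) {
	for _, x := range xs {
		if v, ok := x.(T); ok {
			return v, true
		}
	}
	var zero T
	return zero, false
}

func main() {
	v, ok := first[int]([]any{"a", true, 3, 4})
	fmt.Println(v, ok) // 3 true
}
```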
Let's look at the places `dottype` and `dynamicDottype` are used:

So what are these `OAS2DOTTYPE` IR constructs? Thankfully they are somewhat documented:
It should be pretty obvious why only `OAS2DOTTYPE` corresponds to `commaok == true`. The else branch for `OAS2DOTTYPE` might look sketchy, but a bit more digging will lead you to this:
Great, `commaok` makes sense now; what about `descriptor`? It also has some documentation:
Okay, that sounds like some generics-related thing again, i.e., if we are type asserting to a generic type, `descriptor == nil`? Well, only the `dynamicDottype` code path has `descriptor` explicitly set to `nil`, so I think that's a fair conjecture. But why do we call `typeAssert` at runtime in this path? Well, Keith added some low-level caching optimization, and I'm not looking into that because there's been enough derailing.
In Go, a type switch is syntactic + performance sugar for a series of type assertions.
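As a quick refresher, here's what that sugar looks like in user code (my own example), with both concrete and interface cases -- the latter is what needs a method-set check at runtime:

```go
package main

import "fmt"

// classify demonstrates a type switch: the compiler lowers it into a
// sequence of type-assertion-like checks.
func classify(v any) string {
	switch x := v.(type) {
	case nil:
		return "nil"
	case string: // concrete case: a direct type comparison
		return "string of length " + fmt.Sprint(len(x))
	case fmt.Stringer: // interface case: requires a method-set check
		return "stringer: " + x.String()
	default:
		return "something else"
	}
}

func main() {
	fmt.Println(classify("abc")) // string of length 3
}
```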
In the compiler source, this is the only other place `getitab` gets called; nothing fancy:

Again, in `ssa.go` we find this `interfaceSwitch` function registered as `ir.Syms.InterfaceSwitch`, and used in a single place; it's pretty intuitive.
A bit more digging through the IR types and you should arrive at the intermediate lowering for type switch. I won't go into all the details, but you'll find interesting things like the separation of concrete vs. interface cases, and special handling for generics. The initial version of the generic type switch was added in this commit.
If you are not lost yet, you might remember why we started this excursion -- how Go constructs the itables, and how it optimizes certain static cases. We've seen the former; now let's look for the optimizations (i.e., when the target type is not an interface).
Intuitively, if the target type is not an interface type (e.g., we are casting from an interface to its underlying concrete type via `interface.(Type)`), we can precompute the itable for `(type, interface)` at compile time, then load the itable from the interface and do a simple pointer-equality check, since each `(type, interface)` pair has a unique itable.
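Modeled in plain Go (all names here are toy stand-ins, not the runtime's), the fast path for a concrete-type assertion amounts to a single pointer comparison:

```go
package main

import "fmt"

// itab and ifaceValue are toy stand-ins for the runtime's itab and the
// two-word interface representation.
type itab struct{ name string }

type ifaceValue struct {
	tab  *itab // first word: pointer to the (interface, concrete) itab
	data any   // second word: the data (simplified)
}

// assertConcrete models i.(T) for a concrete T: the compiler already
// knows the unique itab for (interface, T), so the runtime check is a
// single pointer-equality test against the itab stored in the value.
func assertConcrete(v ifaceValue, target *itab) (any, bool) {
	if v.tab == target {
		return v.data, true
	}
	return nil, false
}

func main() {
	stringerBinary := &itab{name: "(Stringer, Binary)"}
	v := ifaceValue{tab: stringerBinary, data: uint64(200)}
	_, ok := assertConcrete(v, stringerBinary)
	fmt.Println(ok) // true
}
```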
Indeed, this is exactly what happens in `dottype1`. First, note that in the Type Assert section, we've been looking at a specific branch of `dottype1`:

Outside that branch, which corresponds to the target type being a non-interface, we see exactly what we conjectured above; also note the documentation for `targetItab`:
Now the question is, where was `targetItab` first assigned? If you look at the 2 call sites of `dottype1`, you'll see that it comes from `n.ITab`:
If you run `git grep "\.ITab ="`, you'll get a few matches, but I think the most relevant (and decipherable) ones are these, which process the regular and generic dot-type expressions (e.g., `x.(T)`).
Specifically, since we only care about the case where the target type is a compile-time non-interface, I think we can safely ignore `walkDynamicDotType` and instead focus on the `reflectdata.ITabAddrAt` function called in `walkDotType`. Its definition is pretty simple:
If you trace the `lsym := s.Linksym()` call, you'll end up seeing these things. I don't fully understand them, but to me they are saying "emit an itable symbol into the object file, which serves as static data to be read by the optimized type-assert expressions".
And if you trace the `writeITab` call, you'll see more details about the actual itable emission, and a helper that generates symbols for the methods.
I'll stop here. If you are interested in the historical development of these features and optimizations, I recommend playing with the git blames; I ended up finding a lot of interesting conversations and old PRs, back when things were a bit simpler.
After poking around in the Go compiler source code, I thought about Java and JVM, because I vaguely recalled that JVM doesn't have this fat pointer thing like Go, Rust, etc. So how is dynamic dispatch implemented at the JVM level?
I cloned the OpenJDK source at commit 6d5699617ff and did some cursory searching. I'm mainly interested in what happens on an interface method call, because in that case the compiler has no clue what the underlying object type is, so we either need a fat pointer like Go's, or some slower method lookup through the underlying type's interface tables.
I happen to remember the relevant bit of JVM bytecode -- invokeinterface, so I just searched for it in the codebase. There are a lot of matches, but with a bit of searching I found the interpreter implementation:
I kinda lost the energy to look further, but we can still see that `invokeinterface` involves some itable lookup when the method is fully virtual:
On the other hand, for the other `invoke...` bytecodes, even the virtual case is a lot simpler than the above:
This feels a bit slow. Maybe there are plenty of optimizations, like JIT compilation or devirtualization, that help performance. Not using fat pointers means that upcasting to an interface should be a no-op in Java, but shouldn't method calls happen a lot more frequently at runtime than these upcasts?
Here are some additional articles I enjoyed reading, while investigating this dynamic dispatch topic.