Go maps vs switches

March 4, 2021 · go benchmarking

Sometimes, things aren’t faster because you think it is,.. but because you benchmarked them. One of Go’s nice things is that it makes it easy to benchmark things to see if your hunches are correct quickly. And sometimes, they turn out not.

Switch vs. maps

Once in a while, I need to map a string, int, or another type to something else. This usually turns out to become a switch statement where I map each element to something else. It works fine for a small number of elements, but a large list of case statements isn’t very readable as soon as you have a lot. It also increases your gocyclo complexity.

{% highlight go %} var message switch (status) { case 200: message = “ok” case 404: message = “not found” // insert a lot of extra elements } {% endhighlight %}

So I tend to use maps instead. That way, I have a small amount of code plus a readable map that doesn’t suffer for readability:

{% highlight go %} var httpCodes map[int]string{ 200: “ok”, 404: “not found”, // much cleaner }

message, ok := httpCodes[status]
if !ok {
   // maybe set something default or return error
}

{% endhighlight %}

Since both are lookup tables, you would expect both to be pretty similar in performance. Or at least, so I thought. Let’s see what happens when we do a benchmark on both:

{% highlight go %} package main

import ( “testing” )

var httpStatus = map[int]string{ 200: “ok”, 404: “not found”, }

var result = make([]string, 3)

func doMap(code int) string { status, ok := httpStatus[code] if !ok { status = "" }

return status

}

func doSwitch(code int) string { var message = ""

switch (code) {
case 200:
	message = "ok"
case 404:
	message = "not found"
}

return message

}

func BenchmarkSwitch(b *testing.B) { r := make([]string, 3)

for n:=0; n<b.N; n++ {
	r[0] = doSwitch(200)
	r[1] = doSwitch(404)
	r[2] = doSwitch(999)
}

result = r

}

func BenchmarkMap(b *testing.B) { r := make([]string, 3)

for n:=0; n<b.N; n++ {
	r[0] = doMap(200)
	r[1] = doMap(404)
	r[2] = doMap(999)
}

result = r

} {% endhighlight %}

Running it with go test --bench=. gives us the following result:

{% highlight text %} $ go test –bench=. goos: linux goarch: amd64 BenchmarkSwitch-12 808913143 1.47 ns/op BenchmarkMap-12 80380466 14.5 ns/op PASS {% endhighlight %}

There was an issue that it seems that Go had optimized the whole testing code out, resulting in a way too low ns/op. I’ve added some additional code that should not optimize the functionality we want to test out.

It seems that the map lookup is almost 10 times slower (still fast, though). I’ve compiled the code to assembly to figure out what’s going on. It seems that the switch is fast because it is not directly a jump-table (as I expected), but a linked list (is the code 200, no goto next check etc).

{% highlight text %} 0x0000 00000 (map_test.go:26) MOVQ “".code+8(SP), AX 0x0005 00005 (map_test.go:26) CMPQ AX, $200 0x000b 00011 (map_test.go:26) JNE 36 0x000d 00013 (map_test.go:26) PCDATA $0, $1 0x000d 00013 (map_test.go:26) LEAQ go.string.“ok”(SB), AX 0x0014 00020 (map_test.go:26) MOVL $2, CX 0x0019 00025 (map_test.go:32) PCDATA $0, $0 0x0019 00025 (map_test.go:32) PCDATA $1, $1 0x0019 00025 (map_test.go:32) MOVQ AX, “”.~r1+16(SP) 0x001e 00030 (map_test.go:32) MOVQ CX, “”.~r1+24(SP) 0x0023 00035 (map_test.go:32) RET 0x0024 00036 (map_test.go:28) PCDATA $1, $0 0x0024 00036 (map_test.go:28) CMPQ AX, $404 0x002a 00042 (map_test.go:28) JNE 58 0x002c 00044 (map_test.go:28) PCDATA $0, $1 0x002c 00044 (map_test.go:28) LEAQ go.string.“not found”(SB), AX 0x0033 00051 (map_test.go:28) MOVL $9, CX 0x0038 00056 (map_test.go:25) JMP 25 0x003a 00058 (map_test.go:25) XORL AX, AX 0x003c 00060 (map_test.go:25) XORL CX, CX 0x003e 00062 (map_test.go:25) JMP 25 {% endhighlight %}

However, the map takes a lot more time. This is probably because it calls runtime.mapaccess2_fast64(SB), which might still be a fast access call, but it is still a call and compared to a simple compare in the switch code quite slow.

Mapping from string to int

So let’s turn things around: instead of mapping ints to strings. Let’s map the string to an int instead.

{% highlight go %} package main

import ( “testing” )

var httpStatus = map[string]int{ “ok”: 200, “not found”: 404, }

func doMap(message string) int { code, ok := httpStatus[message] if !ok { code = 0 }

return code

}

var result = make([]int, 3)

func doSwitch(message string) int { var code int

switch (message) {
case "ok":
	code = 200
case "not found":
	code = 404
}

return code

}

func BenchmarkSwitch(b *testing.B) { r := make([]int, 3)

for n:=0; n<b.N; n++ {
	r[0] = doSwitch("ok")
	r[1] = doSwitch("not found")
	r[2] = doSwitch("not implemented")
}

result = r

}

func BenchmarkMap(b *testing.B) { r := make([]int, 3)

for n:=0; n<b.N; n++ {
	r[0] = doMap("ok")
	r[1] = doMap("not found")
	r[2] = doMap("not implemented")
}

result = r

} {% endhighlight %}

And the result:

{% highlight text %} goos: linux goarch: amd64 BenchmarkSwitch-12 697366684 1.67 ns/op BenchmarkMap-12 100000000 11.4 ns/op PASS ok command-line-arguments 2.498s {% endhighlight %}

It seems that both functions perform pretty much the same.

When compiled to assembler, we see that the map functionality compiles into pretty much identical code, except now we have a call to runtime.mapaccess2_faststr. The switch statement has changed too. Instead of comparing simple ints, it first checks the length of a given string against the length of the string we match (so in the first case, it checks if the length of message is 2). If this isn’t equal, there is no point in checking the rest of the string, but if it does match, it will continuously load 8 bytes of string and compare it against the message (8 bytes, since that what can fit in the 64-bit register). Still, it’s a fast process because of the fast-failures in the system.

So which one should we be using?

So it looks like the switch wins, but it seems to have aO(N) complexity because of the iteration overall cases, while the map has an O(1) complexity. It doesn’t necessarily mean that the map is better than the switch, since we need to consider the number of elements (it takes a while before the O(1) is faster), and of course, readability is an issue. It depends on the situation what you would use, and in some cases, I will go for the maps, and other cases for the switch. But if speed is your concern, benchmark your stuff!