Go随笔 | map泛滥的世界 | 闪电侠的小笔记

开发中经常需要用到map数据类型，主流编程语言也都实现了类似的功能，比如叫哈希、散列、map等等，实现数据结构都类似，性能不相上下。网上有很多关于map实现解析的文章。参考：

https://www.jianshu.com/p/0a777dc7f7ae

https://draveness.me/golang/docs/part2-foundation/ch03-datastructure/golang-hashmap

大家可能注意到，map使用过程中可能会根据键值对的增减而发生动态伸缩，而伸缩的过程是重新申请内存重建map的过程，旧的对象就需要进入GC回收。明显频繁使用map会造成性能的下降。究竟影响有多大呢？

map和数组存取比较

map主要是为快速检索对象设计的。在Web开发中几乎都是用map来收集提交的参数；在这种场景下使用map检索值和直接使用字符串比较检索值有差别吗，下面做了个测试。

模拟表单提交

假设某个Web页面请求提交的参数字段是确定的；我们分别用map和数组来实现对Web表单值的存储，最后再循环遍历所有值，尽量模拟业务开发中的场景。

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49


package hashx

var webFields []string
// 初始化Web请求的Fields
func InitData(fields []string) {
	webFields = fields
}

// 模拟用map对象装填Web请求的所有Fields，假设K和V都是同样的字符串
func SearchFromHash() int {
	pms := make(map[string]string, 8)
	for i := range webFields {
		pms[webFields[i]] = webFields[i]
	}

    // 模拟遍历所有字段，简单求每个值第一个字符的ascii码之和
	total := 0
	for k := range pms {
		total += int(pms[k][0])
	}
	return total
}

func SearchFromArray() int {
	arr := make([]string, len(webFields))
	pms := &arr

    // 模拟提交的参数值，放入对应的数组中，我们不用hash函数，直接字符串比较检索字段
	for i := range webFields {
		for j := range webFields {
			if webFields[i] == webFields[j] {
				(*pms)[i] = webFields[i]
				break
			}
		}
	}

	total := 0
    // 字符串比较检索值，并求第一个字符ascii码之和
	for i := range webFields {
		for j := range webFields {
			if webFields[i] == webFields[j] {
				total += int((*pms)[i][0])
				break
			}
		}
	}
	return total
}

对上面的两种写法做性能测试：

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33


package hashx

import (
	"math/rand"
	"runtime"
	"testing"
	"time"
)

var GoroutinesNum = 5000 // 测试时候的并发数
var keys []string

func init() {
	runtime.GOMAXPROCS(8)
    // 假设Web表单请求，提交了下面5个字段
	keys = []string{"user_name", "user_age", "created_at", "updated_at", "open_time"}
	InitData(keys)
}

// ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
func BenchmarkHashString(b *testing.B) {
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		SearchFromHash()
	}
}

func BenchmarkArrayString(b *testing.B) {
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		SearchFromArray()
	}
}

基准测试：go test -bench=String$ -benchmem

如果字段少于8个，keys = []string{“user_name”, “user_age”, “created_at”, “updated_at”, “open_time”} 测试结果如下：

cpu: Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz
BenchmarkHashString-8            8109505               148.0 ns/op             0 B/op          0 allocs/op
BenchmarkArrayString-8           9125210               128.3 ns/op            80 B/op          1 allocs/op

如果字段刚好超过8个，9个的时候：keys = []string{“user_name”, “user_age”, “created_at”, “updated_at”, “open_time”, “address”, “toys”, “status”, “mobile”} 测试结果如下：

cpu: Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz
BenchmarkHashString-8            1648086               736.5 ns/op           577 B/op          1 allocs/op
BenchmarkArrayString-8           4992355               223.1 ns/op           144 B/op          1 allocs/op

如果字段不止8个，24个的时候呢：keys = []string{“user_name”, “user_age”, “created_at”, “updated_at”, “open_time”, “address”, “toys”, “status”, “mobile”, “phone”, “age”, “date”, “user_name2”, “user_age2”, “created_at2”, “updated_at2”, “open_time2”, “address2”, “toys2”, “status2”, “mobile2”, “phone2”, “age2”, “date2”} 测试结果如下：

cpu: Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz
BenchmarkHashString-8             535888              2234 ns/op            1944 B/op          2 allocs/op
BenchmarkArrayString-8           1287135               923.6 ns/op           384 B/op          1 allocs/op

分析：

map单个桶最多装8个值，而局部变量一般分配在执行栈中，因此参数不大于8个的时候，申请的局部变量pms能装下所有参数，内容全部在栈内存中，因此能看到性能测试没有内存分配和回收；map整体性能还不错。即便如此，map生成和检索的性能还不如直接字符串遍历比较检索值来的快。
map值超过8个时，会出现溢出桶；于是有内存需要动态分配并回收；map性能下降明显。数组模式也因为循环字符串比较变多而出现性能下降。
map值进一步增加时，意味着更多溢出桶，甚至出现map的自动伸缩，性能下降很多。数组模式因为循环比较性能下降不少。
数组模式字符串循环比较其实还是有一定性能优化空间的，比如前缀树快速定位字段。

结论：

map作为函数局部变量，初始化时因为大小固定能放8组KV，因此编译器优化将对象分配在栈中。
map除了能随意放入KV，在KV量不是特别大的时候，其哈希检索性能还不如遍历字符串检索值。
map隐形中带来堆分配，并随着装载量的增减而伸缩，带来GC增多，同时性能严重下降。

sync.Pool提高性能

map对象由于无法重置项目值，只能delete所有key，一般map不方便用sync.Pool缓存并重用。而数组中的值可以遍历重置，可以用sync.Pool提高性能。上面测试代码做如下的小改造：

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39


var arrValuePool sync.Pool

func InitData(fields []string) {
	webFields = fields

    // 数组对象缓存池
	arrValuePool.New = func() any {
		arr := make([]string, len(fields))
		return &arr
	}
}

func SearchFromArray() int {
	pms := arrValuePool.Get().(*[]string) // 内存对象缓存共享
	for i := range *pms {
		(*pms)[i] = ""
	}

	for i := range webFields {
		for j := range webFields {
			if webFields[i] == webFields[j] {
				(*pms)[i] = webFields[i]
				break
			}
		}
	}

	total := 0
	for i := range webFields {
		for j := range webFields {
			if webFields[i] == webFields[j] {
				total += int((*pms)[i][0])
				break
			}
		}
	}
	arrValuePool.Put(pms) // 用完一定要回收
	return total
}

slice放入Pool，测试结果如下：

# 输入5个字段
BenchmarkHashString-8            8003494               151.5 ns/op             0 B/op          0 allocs/op
BenchmarkArrayString-8          16711183                70.54 ns/op            0 B/op          0 allocs/op

# 输入9个字段
BenchmarkHashString-8            1702929               697.7 ns/op           577 B/op          1 allocs/op
BenchmarkArrayString-8           7204788               165.8 ns/op             0 B/op          0 allocs/op

# 输入24个字段
BenchmarkHashString-8             547135              2162 ns/op            1944 B/op          2 allocs/op
BenchmarkArrayString-8           1419124               775.0 ns/op             0 B/op          0 allocs/op

结论：

sync.Pool的确能够有效复用对象，减轻GC压力，提高性能。

高并发测试

在保持数组模式用sync.Pool缓存的情况下，我们给单个核5000个并发压测一下看看效果

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29


func BenchmarkHashStringBF(b *testing.B) {
	rand.Seed(time.Now().UnixNano())
	b.ReportAllocs()
	b.ResetTimer()

	// 并发测试模式
	b.SetParallelism(GoroutinesNum)
	b.StartTimer()
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			SearchFromHash()
		}
	})
}

func BenchmarkArrayPoolStringBF(b *testing.B) {
	rand.Seed(time.Now().UnixNano())
	b.ReportAllocs()
	b.ResetTimer()

	// 并发测试模式
	b.SetParallelism(GoroutinesNum)
	b.StartTimer()
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			SearchFromArray()
		}
	})
}

高并发，测试结果如下：

# 输入5个字段
BenchmarkHashStringBF-8         21547699                52.41 ns/op            0 B/op          0 allocs/op
BenchmarkArrayPoolStringBF-8    44316729                23.05 ns/op            0 B/op          0 allocs/op

# 输入9个字段
BenchmarkHashStringBF-8          3531322               408.3 ns/op           578 B/op          1 allocs/op
BenchmarkArrayPoolStringBF-8     15419502              118.7 ns/op             0 B/op          0 allocs/op

# 输入24个字段
BenchmarkHashStringBF-8          1258896               884.2 ns/op          1947 B/op          2 allocs/op
BenchmarkArrayPoolStringBF-8     4403827               270.5 ns/op             1 B/op          0 allocs/op

结论：

Go性能的确好，高并发场景，能高效利用CPU，提高吞吐效率。
高并发场景，进一步体现sync.Pool的性能优势。

map也能用sync.Pool缓存复用

看看下面的试验，我们用sync.Pool缓存map对象，复用的时候每次循环重置所有值（当然这种办法复用对象有不严谨的地方）：

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35


// 将map对象放入sync.Pool，复用对象看看性能
var hashValuePool sync.Pool

func InitData(fields []string) {
	hashValuePool.New = func() any {
		return make(map[string]string, len(fields))
	}
}

func SearchFromHashPool() int {
	pms := hashValuePool.Get().(map[string]string)
    // 模拟重置每一项值
	for k := range pms {
		pms[k] = ""
	}

	for i := range webFields {
		pms[webFields[i]] = webFields[i]
	}

	total := 0
	for k := range pms {
		total += int(pms[k][0])
	}
	hashValuePool.Put(pms)
	return total
}

// 加一个HashPool的基准测试
func BenchmarkHashPoolString(b *testing.B) {
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		SearchFromHashPool()
	}
}

map放入Pool，测试结果如下：

# 输入5个字段
BenchmarkHashString-8            8109230               147.8 ns/op             0 B/op          0 allocs/op
BenchmarkHashPoolString-8        4668223               252.1 ns/op             0 B/op          0 allocs/op
BenchmarkArrayPoolString-8      16450185                71.01 ns/op            0 B/op          0 allocs/op

# 输入9个字段
BenchmarkHashString-8            1684905               706.9 ns/op           577 B/op          1 allocs/op
BenchmarkHashPoolString-8        2490994               456.8 ns/op             0 B/op          0 allocs/op
BenchmarkArrayPoolString-8       7936622               150.1 ns/op             0 B/op          0 allocs/op

# 输入24个字段
BenchmarkHashString-8             524149              2169 ns/op            1944 B/op          2 allocs/op
BenchmarkHashPoolString-8         999249              1238 ns/op               0 B/op          0 allocs/op
BenchmarkArrayPoolString-8       1522460               780.4 ns/op             0 B/op          0 allocs/op

分析：

map在局部变量，内存分配在栈中。此时用Pool，增加管理的损耗，性能反而下降。
map存在扩容，伸缩等场景下；Pool能起到缓存对象，复用内存提升性能的作用。
KV数量不大时，比如三五十个项以内，map的性能没有循环匹配字符串来的快。

总结

在Go项目中，随处可见map的使用；甚至有滥用的嫌疑。不知道大家对map运行细节是否都了然于胸，这个类型使用场景多，方便好用，但稍不留意，很容易写出性能低下的代码。希望大家引起注意。

大结论：通常，保存检索50个以下键值对的场景；相比用map，用数组实现大概率更省内存，性能更好。

我想直接循环遍历比较每个字符串肯定还有优化的空间，待后面再好好想想办法。

（完）

系列文章导航

Go随笔 | map泛滥的世界

map和数组存取比较

模拟表单提交

sync.Pool提高性能

高并发测试

map也能用sync.Pool缓存复用

总结

See Also

分类

标签

最近文章

其它