Turbo: an alternative to grpc-gateway

zzxx513 posted an article • 0 comments • 354 views • 2017-09-15 01:06 • from related topics

Reposted from: https://zhuanlan.zhihu.com/p/29350695


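grpc-gateway is a handy tool that makes it easy to expose gRPC interfaces over HTTP.

In real-world use, however, I ran into a number of problems with grpc-gateway:

1. It is not flexible enough: if you have unusual requirements, there is little room to extend grpc-gateway.

2. It depends heavily on protocol buffers, and it must be protobuf 3.

3. Even if the gRPC service interface stays the same and only the HTTP interface definition changes, the code must be regenerated, which means redeploying and restarting the service.

4. It only accepts JSON input and does not support traditional key-value (kv) parameters.

5. It only supports gRPC. Well, fine, that is not really a problem, but Thrift is widely used too, isn't it?

6. grpc-gateway is not mature in areas such as error handling, and its developers do not seem very active.

Turbo sets out to solve the problems above. The project lives here:

vaporz/turbo

The documentation is here; it is thorough and available in both Chinese and English:

Turbo Documentation

Besides offering the same basic HTTP proxy functionality as grpc-gateway, Turbo can also:

1. Be highly flexible: it provides a range of components built on aspect-oriented ideas (aspects as in AOP, not the noodle dish that shares the name in Chinese), so each stage of request handling can be customized.

2. Depend only on gRPC, with no requirement on the protocol buffer version, so you can use either protobuf 2 or protobuf 3.

3. Let you modify HTTP interface definitions, and their mapping to backend interfaces, at runtime, with changes taking effect immediately.

4. Accept traditional kv input as well as JSON.

5. Support Thrift as well as gRPC.

6. Ship with a command-line tool that creates a runnable project, or regenerates code, with a single command.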


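Turbo is still in its early days, but the current code has been tested carefully, and a thorough set of test cases brings test coverage to 98%.

Of course, test coverage only says so much; only code that has been proven in production is truly reliable.

So please try it out and tell me what you think. If you have any suggestions or ideas, open an issue on GitHub; I'll be waiting.

If you run into any problem, I will do my best to help and to fix it.

Thanks!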

Binding URL parameters to a struct using reflection

tanghui posted an article • 1 comment • 279 views • 2017-09-14 00:00 • from related topics


Automatically binds URL parameters to a struct via reflection, with support for validating parameter ranges and setting default values.
Example:

package main

import (
	"fmt"
	"net/http"

	"github.com/lintanghui/parse"
)

func main() {
	// The `params` tag names the URL parameter and an optional Range check;
	// the `default` tag supplies the fallback value.
	type v struct {
		Data16   int8     `params:"aaa;Range(1,10)" default:"10"`
		Data32   int32
		Data64   int64    `params:"data64;Range(1,20)" default:"20"`
		Float32  float32  `params:"ccc"`
		String   string   `params:"sss" default:"-"`
		SliceInt []int64  `params:"iii"`
		SliceStr []string `params:"ttt"`
		Bool     bool     `params:"bbb"`
	}
	req, err := http.NewRequest("GET", "http://www.linth.top/x?aaa=11&data64=33&Data32=32&string=aaa&iii=1,2,3&ttt=a,b,c&bbb=true&ccc=1.2", nil)
	// Check the error before touching req.
	if err != nil {
		fmt.Println(err)
		return
	}
	req.ParseForm()
	p := parse.New()
	var data = &v{}
	err = p.Bind(data, req.Form)
	fmt.Printf("%+v", data)
	// OUTPUT:
	// &{Data16:10 Data32:32 Data64:20 Float32:1.2 String: SliceInt:[1 2 3] SliceStr:[a b c] Bool:true}
}

https://github.com/lintanghui/parse

[Open source] golang123: a community system built with vue, nuxt, node.js and golang

shen100 posted an article • 5 comments • 597 views • 2017-09-13 17:42 • from related topics

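golang123 is an open-source community system with an elegant interface, rich features, and a small, fast footprint.
It already powers the golang Chinese community (golang中文社区), and you can use it to build a community of your own.

golang123 uses a decoupled front-end/back-end architecture: the front end is built with vue, iview, node.js, and nuxt, and the back end with go, gorm, and iris. The technology choices are also forward-looking: we boldly use nuxt for isomorphic (server-side and client-side) rendering.

golang123 is under active development, and a beta release is expected in early October.

Community home page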



golang123 production server configuration

CPU        1 core
Memory     512 MB
Bandwidth  1 Mbps

Installation

Required software

Software   Version
node.js    8.4.0 (or later)
golang     1.9 (or later)
mysql      5.6.35 (or later)
redis      4.0.1 (or later)

Clone the code

Clone the golang123 code into the src directory of your GOPATH, i.e. your/gopath/src/golang123

Front-end dependencies

Go to the golang123/website directory and run

npm install

If installation fails or is slow, try the Alibaba (Taobao) mirror

npm install --registry=https://registry.npm.taobao.org

Back-end libraries

# iris web framework
go get -u github.com/kataras/iris

# gorm persistence layer (ORM)
go get -u github.com/jinzhu/gorm

# redis client
go get github.com/garyburd/redigo/redis

# uuid generator
go get github.com/satori/go.uuid

# XSS protection
go get -u github.com/microcosm-cc/bluemonday

# markdown parser
go get github.com/russross/blackfriday

Configuration

hosts

127.0.0.1 dev.golang123.com

nginx

  1. Rename golang123/nginx/dev.golang123.com.example.conf to dev.golang123.com.conf, then copy it into nginx's virtual host directory.

  2. Copy golang123/nginx/server.key and golang123/nginx/server.crt to a directory of your choice.

  3. Open dev.golang123.com.conf in nginx's virtual host directory and change the access log and error log paths, i.e. access_log and error_log.

  4. Change the certificate paths to wherever server.key and server.crt are located, i.e. ssl_certificate and ssl_certificate_key.

Use the configuration below as a reference and change the lines marked "change this":

server {
    listen 80;
    server_name dev.golang123.com;

    access_log /path/logs/golang123.access.log; # change this
    error_log /path/logs/golang123.error.log; # change this

    rewrite ^(.*) https://$server_name$1 permanent;
}

server {
    listen 443;
    server_name dev.golang123.com;

    access_log /path/logs/golang123.access.log; # change this
    error_log /path/logs/golang123.error.log; # change this

    ssl on;
    ssl_certificate /path/cert/golang123/server.crt; # change this
    ssl_certificate_key /path/cert/golang123/server.key; # change this

    ...

}

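Front-end configuration

Rename golang123/website/config/index.example.js to index.js

Back-end configuration

Rename golang123/config.example.json to config.json, then change the following settings:

  1. The mysql host and port

  2. The mysql username and password

  3. The redis host and port

  4. The username and password of the domain mailbox (golang123 uses QQ domain mail)

Running

Run the front-end project

Go to the golang123/website directory and run

npm run dev

Run the back-end project

Go to the golang123 directory and run

go run main.go

Access

Open https://dev.golang123.com/ in your browser

Questions

Questions and suggestions are welcome; please open an issue

Discussion

QQ group: 32550512

Messages in the QQ group are easily buried by later messages before anyone sees them, so opening an issue is still the recommended way to ask questions

Sponsorship

If you think golang123 is a decent project, you can sponsor me by scanning the QR code below. Any amount, no upper limit ^v^

License

GPL

Copyright (c) 2013-present, shen100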

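x86/x64 assembly language practice tool

sheepbao replied to the question • 7 followers • 6 replies • 560 views • 2017-09-11 15:05 • from related topics

RobotGo v0.46.0 released, fixing important bugs

够浪 replied to the question • 3 followers • 3 replies • 542 views • 2017-09-03 23:53 • from related topics

Does filebeat distinguish between Windows and Linux in its source code?

KSpeer replied to the question • 1 follower • 1 reply • 434 views • 2017-09-01 14:43 • from related topics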

jsoniter, a high-performance replacement for encoding/json, releases 1.0.0

taowen posted an article • 6 comments • 399 views • 2017-08-31 00:08 • from related topics

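https://github.com/json-iterator/go fully implements the behavior of encoding/json while being much faster. It also offers more powerful extensibility through its extension mechanism. Since its release it has passed 1,000 stars. At the urging of Kubernetes team members (https://github.com/json-iterator/go/issues/154), version 1.0.0 was released today, in the hope that it can be merged into the Kubernetes codebase.

jsoniter grew out of real business needs in the Platform Technology department at Didi Chuxing. If you would like to join us and work on challenging Go applications, add me on WeChat: nctaowen.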

Printing a slice of structs as a table on the command line in Go

modood posted an article • 3 comments • 373 views • 2017-08-29 15:32 • from related topics

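A small tool I wrote recently that prints a slice of structs as a table on the command line

  • No third-party dependencies

  • Supports Chinese characters

  • Automatically aligns every column

  • Automatically adapts column widths

  • Struct fields can be of any data type (strings, slices, maps, etc.)

For example, it makes it easy to print a list of database query results (a slice of structs) as a clear table on the command line.

Project GitHub page: https://github.com/modood/table

If you find it useful, please give it a star~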

package main

import (
	"fmt"

	"github.com/modood/table"
)

type House struct {
	Name  string
	Sigil string
	Motto string
}

func main() {
	s := []House{
		{"Stark", "direwolf", "Winter is coming"},
		{"Targaryen", "dragon", "Fire and Blood"},
		{"Lannister", "lion", "Hear Me Roar"},
	}

	table.Output(s)
}

Output:

┌───────────┬──────────┬──────────────────┐
│ Name      │ Sigil    │ Motto            │
├───────────┼──────────┼──────────────────┤
│ Stark     │ direwolf │ Winter is coming │
│ Targaryen │ dragon   │ Fire and Blood   │
│ Lannister │ lion     │ Hear Me Roar     │
└───────────┴──────────┴──────────────────┘

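Developing beats components

duanquanyong replied to the question • 2 followers • 1 reply • 487 views • 2017-08-28 12:50 • from related topics

Golang API business monitoring project: looking for advice from experts

haoweishow replied to the question • 3 followers • 2 replies • 1049 views • 2017-08-22 09:20 • from related topics

Are there any good distributed system monitoring projects?

fiisio replied to the question • 10 followers • 4 replies • 1243 views • 2017-08-21 10:41 • from related topics

GrapeNet, a lightweight network library written in Go

koangel posted an article • 0 comments • 525 views • 2017-08-20 11:36 • from related topics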

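Introduction

A lightweight network library written in Go (grapeNet is a lightweight and Easy Use Network Framework)

It can be used for game servers, network-intensive servers, and similar scenarios. Each module is factored out and can be used on its own, and internal coupling is loose.

Go already has plenty of heavyweight frameworks, such as GoWorld, and they are good enough. But I use this library in all sorts of lightweight applications that do not need anything overly complex, so I designed GrapeNet with independent modules as the goal. You can pull out and use just one small module, or combine them into a full server, and fitting it into your architecture is fairly easy. As for hot reloading: script data currently supports hot updates automatically (just run UPDATE); hot updating of the program itself will be released after further testing (Linux only).

The library is more like a collection of lightweight tools for everyday server development. Enjoy using it.

It is being updated gradually and there are still many gaps to fill; for now it is not suitable for commercial projects.

Personal blog: http://grapec.me/

Installation

go get -u github.com/koangel/grapeNet...

Modules (Function)

  • Lua script binding management (can bind functions of any type; thread-safe, with automatic type inference)

  • Logging library (built on Seelog)

  • Function management system (functions can be bound to a parameter of any type and invoked through it)

  • Stream processing

  • TCP networking

  • WebSocket networking (basic version)

  • Codec (register objects of any type and create them dynamically elsewhere)

  • CSV serialization module (using tags, CSV can be deserialized directly into objects, and objects serialized to CSV)

Third-party dependencies

  • Seelog (github.com/cihub/seelog)

  • Gopher-lua (github.com/yuin/gopher-lua)

  • Gopher-luar (layeh.com/gopher-luar)

No CGO dependencies at all; the Lua implementation itself is pure Go.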

Introducing Badger: A fast key-value store written purely in Go

chenxu posted an article • 0 comments • 298 views • 2017-08-18 17:52 • from related topics


We have built an efficient and persistent log structured merge (LSM) tree based key-value store, purely in Go language. It is based upon WiscKey paper included in USENIX FAST 2016. This design is highly SSD-optimized and separates keys from values to minimize I/O amplification; leveraging both the sequential and the random performance of SSDs.


We call it Badger. Based on benchmarks, Badger is at least 3.5x faster than RocksDB when doing random reads. For value sizes between 128B to 16KB, data loading is 0.86x - 14x faster compared to RocksDB, with Badger gaining significant ground as value size increases. On the flip side, Badger is currently slower for range key-value iteration, but that has a lot of room for optimization.


Background and Motivation


Word about RocksDB


RocksDB is the most popular and probably the most efficient key-value store in the market. It originated in Google as SSTable which formed the basis for Bigtable, then got released as LevelDB. Facebook then improved LevelDB to add concurrency and optimizations for SSDs and released that as RocksDB. Work on RocksDB has been continuously going on for many years now, and it’s used in production at Facebook and many other companies.


So naturally, if you need a key-value store, you’d gravitate towards RocksDB. It’s a solid piece of technology, and it works. The biggest issue with using RocksDB is that it is written in C++; requiring the use of Cgo to be called via Go.


Cgo: The necessary evil


At Dgraph, we have been using RocksDB via Cgo since we started. And we’ve faced many issues over time due to this dependency. Cgo is not Go, but when there are better libraries in C++ than Go, Cgo is a necessary evil.


The problem is, Go CPU profiler doesn’t see beyond Cgo calls. Go memory profiler takes it one step further. Forget about giving you memory usage breakdown in Cgo space, Go memory profiler fails to even notice the presence of Cgo code. Any memory used by Cgo would not even make it to the memory profiler. Other tools like Go race detector, don’t work either.


Cgo has caused us pthread_create issues in Go1.4, and then again in Go1.5, due to a bug regression. Lightweight goroutines become expensive pthreads when Cgo is involved, and we had to modify how we were writing data to RocksDB to avoid assigning too many goroutines.


Cgo has caused us memory leaks. Who owns and manages memory when making calls is just not clear. Go and C are at opposite ends of the spectrum. One doesn’t let you free memory, the other requires it. So, you make a Go call, but then forget to Free(), and nothing breaks. Except much later.


Cgo has given us unmaintainable code. Cgo makes code ugly. The Cgo layer between Dgraph and RocksDB was the one piece of code no one in the team wanted to touch.


Surely, we fixed the memory leaks in our API usage over time. In fact, I think we have fixed them all by now, but I can’t be sure. Go memory profiler would never tell you. And every time someone complains about Dgraph taking up more memory or crashing due to OOM, it makes me nervous that this is a memory leak issue.


Huge undertaking


Everyone I told about our woes with Cgo, told me that we should just work on fixing those issues. Writing a key-value store which can provide the same performance as RocksDB is a huge undertaking, not worth our effort. Even my team wasn’t sure. I had my doubts as well.


I have great respect for any piece of technology which has been iterated upon by the smartest engineers on the face of the planet for years. RocksDB is that. And if I was writing Dgraph in C++, I’d happily use it.



But, I just hate ugly code.



And I hate recurring bugs. No amount of effort would have ensured that we would no longer have any more issues with using RocksDB via Cgo. I wanted a clean slate, and my profiler tools back. Building a key-value store in Go from scratch was the only way to achieve it.


I looked around. The existing key-value stores written in Go didn’t even come close to RocksDB’s performance. And that’s a deal breaker. You don’t trade performance for cleanliness. You demand both.


So, I decided we will replace our dependency on RocksDB, but given this isn’t a priority for Dgraph, none of the team members should work on it. This would be a side project that only I will undertake. I started reading up about B+ and LSM trees, recent improvements to their design, and came across WiscKey paper. It had great promising ideas. I decided to spend a month away from core Dgraph, building Badger.


That’s not how it went. I couldn’t spend a month away from Dgraph. Between all the founder duties, I couldn’t fully dedicate time to coding either. Badger developed during my spurts of coding activity, and one of the team members’ part-time contributions. Work started end January, and now I think it’s in a good state to be trialed by the Go community.


LSM trees


Before we delve into Badger, let’s understand key-value store designs. They play an important role in data-intensive applications including databases. Key-value stores allow efficient updates, point lookups and range queries.


There are two popular types of implementations: Log-structured merge (LSM) tree based, and B+ tree based. The main advantage LSM trees have is that all the foreground writes happen in memory, and all background writes maintain sequential access patterns. Thus they achieve a very high write throughput. On the other hand, small updates on B+-trees involve repeated random disk writes, and hence are unable to maintain a high-throughput write workload [1].


To deliver high write performance, LSM-trees batch key-value pairs and write them sequentially. Then, to enable efficient lookups, LSM-trees continuously read, sort and write key-value pairs in the background. This is known as a compaction. LSM-trees do this over many levels, each level holding a factor more data than the previous, typically size of L(i+1) = 10 x size of L(i).


Within a single level, the key-values get written into files of fixed size, in a sorted order. Except level zero, all other levels have zero overlaps between keys stored in files at the same level.


Each level has a maximum capacity. As a level L(i) fills up, its data gets merged with data from the next lower level L(i+1), and the files in L(i) are deleted to make space for more incoming data. As data flows from level zero to level one, two, and so on, the same data is re-written multiple times throughout its lifetime. Each key update causes many writes until data eventually settles. This constitutes write amplification. For a 7 level LSM tree, with 10x size increase factor, this can be 60; 10 for each transition from L1->L2, L2->L3, and so on, ignoring L0 due to special handling.


Conversely, to read a key from LSM tree, all the levels need to be checked. If present in multiple levels, the version of key at level closer to zero is picked (this version is more up to date). Thus, a single key lookup causes many reads over files, this constitutes read amplification. WiscKey paper estimates this to be 336 for a 1-KB key-value pair.
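The read path described above can be sketched as follows. This is a toy model, not Badger or RocksDB code: each level is a plain map rather than a set of sorted files, and the `level` type and `get` function are hypothetical names. It shows why the version nearest level zero wins and why a lookup may touch every level.

```go
package main

import "fmt"

// level is a stand-in for one LSM level. A real level is a set of sorted
// files on disk; a map keeps the sketch short.
type level map[string]string

// get walks the levels from L0 downward and returns the first match, plus
// the number of levels it had to check (a rough proxy for read amplification).
func get(levels []level, key string) (value string, checked int, ok bool) {
	for _, l := range levels {
		checked++
		if v, found := l[key]; found {
			return v, checked, true
		}
	}
	return "", checked, false
}

func main() {
	levels := []level{
		{"b": "b-new"},              // L0: most recent writes
		{"a": "a-v2"},               // L1
		{"a": "a-v1", "b": "b-old"}, // L2: oldest data
	}
	for _, k := range []string{"a", "b", "z"} {
		v, n, ok := get(levels, k)
		fmt.Printf("key=%s value=%q levels checked=%d found=%v\n", k, v, n, ok)
	}
}
```

A missing key is the worst case: every level is consulted before the lookup gives up, which is why real stores add bloom filters per table.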


LSMs were designed around hard drives. In HDDs, random I/Os are over 100x slower than sequential ones. Thus, running compactions to continually sort keys and enable efficient lookups is an excellent trade-off.


NVMe SSD Samsung 960 pro


However, SSDs are fundamentally different from HDDs. The difference between their sequential and random reads are not nearly as large as HDDs. In fact, top of the line SSDs like Samsung 960 Pro can provide 440K random read operations per second, with 4KB block size. Thus, an LSM-tree that performs a large number of sequential writes to reduce later random reads is wasting bandwidth needlessly.


Badger


Badger is a simple, efficient, and persistent key-value store. Inspired by the simplicity of LevelDB, it provides Get, Set, Delete, and Iterate functions. On top of it, it adds CompareAndSet and CompareAndDelete atomic operations (see GoDoc). It does not aim to be a database and hence does not provide transactions, versioning or snapshots. Those things can be easily built on top of Badger.


Badger separates keys from values. The keys are stored in LSM tree, while the values are stored in a write-ahead log called the value log. Keys tend to be smaller than values. Thus this set up produces much smaller LSM trees. When required, the values are directly read from the log stored on SSD, utilizing its vastly superior random read performance.
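As a rough illustration of key-value separation, the sketch below keeps keys and value pointers in a small index and appends values to a log. It is not Badger's implementation or API: the `store` and `valuePointer` types are hypothetical, a map stands in for the LSM tree, and an in-memory buffer stands in for the value log on SSD.

```go
package main

import (
	"bytes"
	"fmt"
)

// valuePointer locates one value inside the value log.
type valuePointer struct {
	offset, length int
}

// store keeps keys in a small index and values in an append-only log.
type store struct {
	index map[string]valuePointer // stand-in for the LSM tree
	vlog  bytes.Buffer            // stand-in for the on-disk value log
}

func newStore() *store { return &store{index: map[string]valuePointer{}} }

// Set appends the value to the log first, then records where it landed.
func (s *store) Set(key string, value []byte) {
	ptr := valuePointer{offset: s.vlog.Len(), length: len(value)}
	s.vlog.Write(value)
	s.index[key] = ptr
}

// Get finds the pointer in the index, then does one "random read" in the log.
func (s *store) Get(key string) ([]byte, bool) {
	ptr, ok := s.index[key]
	if !ok {
		return nil, false
	}
	return s.vlog.Bytes()[ptr.offset : ptr.offset+ptr.length], true
}

func main() {
	s := newStore()
	s.Set("city", []byte("Sydney"))
	s.Set("city", []byte("Bengaluru")) // an update appends a new log entry
	v, _ := s.Get("city")
	fmt.Printf("%s (index entries: %d, log bytes: %d)\n", v, len(s.index), s.vlog.Len())
}
```

Note how the index stays at one entry per key while the log keeps growing with each update, which is the space cost the garbage collection discussed later has to reclaim.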


Guiding principles


These are the guiding principles that decide the design, what goes in and what doesn’t in Badger.



  • Write it purely in Go language.

  • Use the latest research to build the fastest key-value store.

  • Keep it simple, stupid.

  • SSD-centric design.


Key-Value separation


The major performance cost of LSM-trees is the compaction process. During compactions, multiple files are read into memory, sorted, and written back. Sorting is essential for efficient retrieval, for both key lookups and range iterations. With sorting, the key lookups would only require accessing at most one file per level (excluding level zero, where we’d need to check all the files). Iterations would result in sequential access to multiple files.


Each file is of fixed size, to enhance caching. Values tend to be larger than keys. When you store values along with the keys, the amount of data that needs to be compacted grows significantly.


In Badger, only a pointer to the value in the value log is stored alongside the key. Badger employs delta encoding for keys to reduce the effective size even further. Assuming 16 bytes per key and 16 bytes per value pointer, a single 64MB file can store two million key-value pairs.


Write Amplification


Thus, the LSM tree generated by Badger is much smaller than that of RocksDB. This smaller LSM-tree reduces the number of levels, and hence number of compactions required to achieve stability. Also, values are not moved along with keys, because they’re elsewhere in value log. Assuming 1KB value and 16 byte keys, the effective write amplification per level is (10*16 + 1024)/(16 + 1024) ~ 1.14, a much smaller fraction.


You can see the performance gains of this approach compared to RocksDB as the value size increases; where loading data to Badger takes factors less time (see Benchmarks below).


Read Amplification


As mentioned above, the size of LSM tree generated by Badger is much smaller. Each file at each level stores lots more keys than typical LSM trees. Thus, for the same amount of data, fewer levels get filled up. A typical key lookup requires reading all files in level zero, and one file per level from level one and onwards. With Badger, filling fewer levels means, fewer files need to be read to lookup a key. Once key (along with value pointer) is fetched, the value can be fetched by doing random read in value log stored on SSD.


Furthermore, during benchmarking, we found that Badger’s LSM tree is so small, it can easily fit in RAM. For 1KB values and 75 million 22 byte keys, the raw size of the entire dataset is 72 GB. Badger’s LSM tree size for this setup is a mere 1.7GB, which can easily fit into RAM. This is what causes Badger’s random key lookup performance to be at least 3.5x faster, and Badger’s key-only iteration to be far faster than RocksDB’s.


Crash resilience


LSM trees write all the updates in memory first in memtables. Once they fill up, memtables get swapped over to immutable memtables, which eventually get written out to files in level zero on disk.


In the case of a crash, all the recent updates still in memory tables would be lost. Key-value stores deal with this issue, by first writing all the updates in a write-ahead log. Badger has a write-ahead log, it’s called value log.


Just like a typical write-ahead log, before any update is applied to LSM tree, it gets written to value log first. In the case of a crash, Badger would iterate over the recent updates in value log, and apply them back to the LSM tree.


Instead of iterating over the entire value log, Badger puts a pointer to the latest value in each memtable. Effectively, the latest memtable which made its way to disk would have a value pointer, before which all the updates have already made their way to disk. Thus, we can replay from this pointer onwards, and reapply all the updates to LSM tree to get all our updates back.
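The replay-from-pointer idea can be sketched like this. It is a simplified model under stated assumptions, not Badger's recovery code: the value log is a slice of entries, the persisted pointer is an index into that slice, and `replay` is a hypothetical name.

```go
package main

import "fmt"

// entry is one record in the value log.
type entry struct {
	key, value string
}

// replay re-applies every log entry at or after the persisted pointer to a
// fresh memtable and returns how many entries were applied.
func replay(vlog []entry, from int, memtable map[string]string) int {
	if from < 0 || from > len(vlog) {
		return 0
	}
	applied := 0
	for _, e := range vlog[from:] {
		memtable[e.key] = e.value
		applied++
	}
	return applied
}

func main() {
	vlog := []entry{{"a", "1"}, {"b", "2"}, {"a", "3"}, {"c", "4"}}
	// Assume the last memtable flushed to disk covered entries 0 and 1,
	// so its stored pointer says "everything before index 2 is durable".
	persisted := 2
	memtable := map[string]string{}
	n := replay(vlog, persisted, memtable)
	fmt.Println(n, memtable) // 2 map[a:3 c:4]
}
```

Only the tail of the log is re-read, so recovery time depends on how much was written since the last flush rather than on the total log size.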


Overall size


RocksDB applies block compression to reduce the size of LSM tree. Badger’s LSM tree is much smaller in comparison and can be stored in RAM entirely, so it doesn’t need to do any compression on the tree. However, the size of value log can grow quite quickly. Each update is a new entry in the value log, and therefore multiple updates for the same key take up space multiple times.


To deal with this, Badger does two things. It allows compressing values in value log. Instead of compressing multiple key-values together, we only compress each key-value individually. This provides the best possible random read performance. The client can set it so compression is only done if the key-value size is over an adjustable threshold, set by default to 1KB.


Secondly, Badger runs value garbage collection. This runs periodically and samples a 100MB size of a randomly selected value log file. It checks if at least a significant chunk of it should be discarded, due to newer updates in later logs. If so, the valid key-value pairs would be appended to the log, the older file discarded, and the value pointers updated in the LSM tree. The downside is, this adds more work for LSM tree; so shouldn’t be run when loading a huge data set. More work is required to only trigger this garbage collection to run during periods of little client activity.
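The sampling decision at the heart of value garbage collection might look roughly like the sketch below. The names and the 0.5 threshold are assumptions for illustration; the post only says "a significant chunk", and Badger's real criteria are not shown here.

```go
package main

import "fmt"

// logEntry is one record sampled from a value log file.
type logEntry struct {
	key    string
	offset int
}

// staleFraction returns the share of sampled entries that the index no longer
// points to, i.e. entries superseded by a newer update or deleted.
func staleFraction(sample []logEntry, index map[string]int) float64 {
	if len(sample) == 0 {
		return 0
	}
	stale := 0
	for _, e := range sample {
		if cur, ok := index[e.key]; !ok || cur != e.offset {
			stale++
		}
	}
	return float64(stale) / float64(len(sample))
}

func main() {
	index := map[string]int{"a": 30, "b": 10} // key -> offset of the live version
	sample := []logEntry{{"a", 0}, {"b", 10}, {"a", 20}, {"c", 25}}
	frac := staleFraction(sample, index)
	const threshold = 0.5 // hypothetical cut-off for "a significant chunk"
	fmt.Println(frac, frac >= threshold) // 0.75 true
}
```

When the fraction crosses the threshold, the live entries would be appended to the head of the log and their pointers updated in the LSM tree before the old file is dropped.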


Hardware Costs


But, given the fact that SSDs are getting cheaper and cheaper, using extra space in SSD is almost nothing compared to having to store and serve a major chunk of LSM tree from memory. Consider this:


For 1KB values, 75 million 16 byte keys, RocksDB’s LSM tree is 50GB in size. Badger’s value log is 74GB (without value compression), and LSM tree is 1.7GB. Extrapolating it three times, we get 225 million keys, RocksDB size of 150GB and Badger size of 222GB value log, and 5.1GB LSM tree.


Using Amazon AWS US East (Ohio) datacenter:



  • To achieve a random read performance equivalent of Badger (at least 3.5x faster), RocksDB would need to be run on an r3.4xlarge instance, which provides 122 GB of RAM for $1.33 per hour; so most of its LSM tree can fit into memory.

  • Badger can be run on the cheapest storage optimized instance i3.large, which provides 475GB NVMe SSD (fio test: 100K IOPS for 4KB block size), with 15.25GB RAM for $0.156 per hour.

  • The cost of running Badger is thus, 8.5x cheaper than running RocksDB on EC2, on-demand.

  • Going 1-year term all upfront payment, this is $6182 for RocksDB v/s $870 for Badger, still 7.1x cheaper. That’s a whopping 86% saving.


Benchmarks


Setup


We rented a storage optimized i3.large instance from Amazon AWS, which provides 475GB NVMe SSD storage, 2 virtual cores along with 15.25GB RAM. This instance provides local SSD, which we tested via fio to sustain close to 100K random read IOPS for 4KB block sizes.


The data sets were chosen to generate sizes too big to fit entirely in RAM, in either RocksDB or Badger.

Value size   Number of keys (each key = 22B)   Raw data size
128B         250M                              35GB
1024B        75M                               73GB
16KB         5M                                76GB

We then loaded data one by one, first in RocksDB then in Badger, never running the loaders concurrently. This gave us the data loading times and output sizes. For random Get and Iterate, we used Go benchmark tests and ran them for 3 minutes, going down to 1 minute for 16KB values.


All the code for benchmarking is available in this repo. All the commands run and their recorded measurements are available in this log file. The charts and their data are viewable here.


Results


In the following benchmarks, we measured 4 things:



  • Data loading performance

  • Output size

  • Random key lookup performance (Get)

  • Sorted range iteration performance (Iterate)


All 4 measurements are visualized in the following charts. (Image: Badger benchmarks, https://github.com/dgraph-io/badger)


Data loading performance: Badger’s key-value separation design shows huge performance gains as value sizes increase. For value sizes of 1KB and 16KB, Badger achieves 4.5x and 11.7x more throughput than RocksDB. For smaller values, like 16 bytes not shown here, Badger can be 2-3x slower, due to slower compactions (see further work).


Store size: Badger generates a much smaller LSM tree, but a larger value log. The size of Badger’s LSM tree is proportional only to the number of keys, not values. Thus, Badger’s LSM tree decreases in size as we progress from 128B to 16KB, because the number of keys drops. In all three scenarios, Badger produced an LSM tree which could fit entirely in RAM of the target server.


Random read latency: Badger’s Get latency is only 18% to 27% of RocksDB’s Get latency. In our opinion, this is the biggest win of this design. This happens because Badger’s entire LSM tree can fit into RAM, significantly decreasing the amount of time it takes to find the right tables, check their bloom filters, pick the right blocks and retrieve the key. Value retrieval is then a single SSD file.pread away.


In contrast, RocksDB can’t fit the entire tree in memory. Even assuming it can keep the table index and bloom filters in memory, it would need to fetch the entire blocks from disk, decompress them, then do key-value retrieval (Badger’s smaller LSM tree avoids the need for compression). This obviously takes longer, and given lack of data access locality, caching isn’t as effective.


Range iteration latency: Badger’s range iteration is significantly slower than RocksDB’s range iteration, when values are also retrieved from SSD. We didn’t expect this, and still don’t quite understand it. We expected some slowdown due to the need to do IOPS on SSD, while RocksDB does purely serial reads. But, given the 100K IOPS i3.large instance is capable of, we didn’t even come close to using that bandwidth, despite pre-fetching. This needs further work and investigation.


On the other end of the spectrum, Badger’s key-only iteration is blazingly faster than RocksDB or key-value iteration (latency is shown by the almost invisible red bar). This is quite useful in certain use cases we have at Dgraph, where we iterate over the keys, run filters and only retrieve values for a much smaller subset of keys.


Further work


Speed of range iteration


While Badger can do key-only iteration blazingly fast, things slow down when it also needs to do value lookups. Theoretically, this shouldn’t be the case. Amazon’s i3.large disk optimized instance can do 100,000 4KB block random reads per second. Based on this, we should be able to iterate 100K key-value pairs per second, in other terms six million key-value pairs per minute.


However, Badger’s current implementation doesn’t produce SSD random read requests even close to this limit, and the key-value iteration suffers as a result. There’s a lot of room for optimization in this space.


Speed of compactions


Badger is currently slower when it comes to running compactions compared to RocksDB. Due to this, for a dataset purely containing smaller values, it is slower to load data to Badger. This needs more optimization.


LSM tree compression


Again in a dataset purely containing smaller values, the size of LSM tree would be significantly larger than RocksDB because Badger doesn’t run compression on LSM tree. This should be easy to add on if needed, and would make a great first-time contributor project.


B+ tree approach


[1] Recent improvements to SSDs might make B+-trees a viable option. Since WiscKey paper was written, SSDs have made huge gains in random write performance. A new interesting direction would be to combine the value log approach, and keep only keys and value pointers in the B+-tree. This would trade LSM tree read-sort-merge sequential write compactions with many random writes per key update and might achieve the same write throughput as LSM for a much simpler design.


Conclusion


We have built an efficient key-value store, which can compete in performance against top of the line key-value stores in market. It is currently rough around the edges, but provides a solid platform for any industrial application, be it data storage or building another database.


We will be replacing Dgraph’s dependency on RocksDB soon with Badger; making our builds easier, faster, making Dgraph cross-platform and paving the way for embeddable Dgraph. The biggest win of using Badger is a performant Go native key-value store. The nice side-effects are ~4 times faster Get and a potential 86% reduction in AWS bills, due to less reliance on RAM and more reliance on ever faster and cheaper SSDs.


So try out Badger in your project, and let us know your experience.


P.S. Special thanks to Sanjay Ghemawat and Lanyue Lu for responding to my questions about design choices.






**We are building an open source, real time, horizontally scalable and distributed graph database.**

Get started with Dgraph. [https://docs.dgraph.io](https://docs.dgraph.io)
See our live demo. [https://dgraph.io](https://dgraph.io)
Star us on Github. [https://github.com/dgraph-io/dgraph](https://github.com/dgraph-io/dgraph)
Ask us questions. [https://discuss.dgraph.io](https://discuss.dgraph.io)


**We're starting to support enterprises in deploying Dgraph in production. [Talk to us](manish@dgraph.io), if you want us to help you try out Dgraph at your organization.**




*Top image: Juno spacecraft is the [fastest moving human made object](http://www.livescience.com/326 ... r.html), traveling at a speed of 265,000 km/h relative to Earth.*

Golang web starter

dasheng posted an article • 2 comments • 519 views • 2017-07-30 00:13 • from related topics

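Background

Web applications have long been the battleground of languages such as Ruby, Java, and PHP.

  • Ruby enables rapid prototyping, and the "all-in-one" Ruby on Rails framework enables "full-stack" development; its drawbacks are poor performance in large applications and difficult debugging.

  • Java has more than 20 years of history, with a mature ecosystem of third-party libraries and frameworks and high runtime efficiency. But as an application's features balloon, you get bloated getter/setter methods, a JVM that consumes large amounts of machine resources, difficult performance tuning, and poor support for functional programming.

  • PHP: TL;DR

This article implements a minimal web application as a way to explore the Golang web ecosystem, using Docker to isolate the development environment and Postgres to persist data. For the source code, see here.

Why Go?

  • Excellent performance

  • Simple deployment: just copy the compiled binary to the server

  • A rich standard library that makes a programmer's life simple and pleasant

  • Statically typed, with type checking

  • duck typing

  • goroutines free developers from the pain of concurrent programming

  • Functions as first-class citizens

  • ...

Choosing third-party Golang frameworks

  • Web framework: Gin, with excellent performance, a friendly API, and a complete feature set

  • ORM: GORM, which supports several mainstream database dialects and has clear documentation

  • Package manager: Glide, similar to Ruby's bundler or npm in NodeJS

  • Testing tools:

    • GoConvey: BDD-style tests, with test results visualized in the browser

    • Testify: rich assertions and mocking

  • Database migration: migrate

  • Logging: Logrus, with structured log output and full compatibility with the standard library logger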

Dockerizing the development environment

Publishing the application base image

The Dockerfile is as follows:

FROM golang:1.8

# package manager
RUN curl https://glide.sh/get | sh

# live code reloading
RUN go get github.com/codegangsta/gin

# database migration tool
RUN go get -u -d github.com/mattes/migrate/cli github.com/lib/pq
RUN go build -tags 'postgres' -o /usr/local/bin/migrate github.com/mattes/migrate/cli

Publishing the database base image

The Dockerfile is as follows:

FROM postgres:9.6

# initialize the database configuration
COPY ./init-user-db.sh /docker-entrypoint-initdb.d/init-user-db.sh

Starting the services

Run auto/dev to start everything; the configuration is shown below.



  • docker-compose.yml:

version: "3"

services:
  dev:
    links:
      - db
    image: 415148673/golang-web-base-image@sha256:18de5eb058a54b64f32d58b57a1eb3009b9ed49d90bd53056b95c5c8d5894cd6
    environment:
      - PORT=8080
      - DB_USER=docker
      - DB_HOST=db
      - DB_NAME=webstarter
    volumes:
      - .:/go/src/golang-web-starter
    working_dir: /go/src/golang-web-starter
    ports:
      - "3000:3000"
    command: gin

  db:
    image: 415148673/postgres@sha256:6d4800c53e68576e05d3a61f2b62ed573f40692bcc72a3ebef3b04b3986bb70c
    volumes:
      - go-web-starter-db-cache:/var/lib/postgresql/data

volumes:
  go-web-starter-db-cache:


  • The glide configuration file for third-party dependencies, which are installed by running glide install inside the container:

package: golang-web-starter
import:
- package: github.com/gin-gonic/gin
  version: ^1.1.4
- package: github.com/jinzhu/gorm
  version: ^1.0.0
- package: github.com/mattes/migrate
  version: ^3.0.1
- package: github.com/lib/pq
- package: github.com/stretchr/testify
  version: ^1.1.4
- package: github.com/smartystreets/goconvey
  version: ^1.6.2


  • The database migration script:

migrate -source file://migrations -database "postgres://$DB_USER:$DB_PASSWORD@$DB_HOST:5432/$DB_NAME?sslmode=disable" up

Business logic

Router

router := gin.Default()
router.GET("/", handler.ShowIndexPage)        // show the index page
router.GET("/book/:book_id", handler.GetBook) // look up a book by id
router.POST("/book", handler.SaveBook)        // save a book

handler

Taking "save a book" as an example:

func SaveBook(c *gin.Context) {
	var book models.Book
	if err := c.Bind(&book); err == nil {
		// call the model's save function
		book := models.SaveBook(book)

		// bind the data needed by the front-end page
		utility.Render(
			c,
			gin.H{
				"title":   "Save",
				"payload": book,
			},
			"success.html",
		)
	} else {
		// error handling
		c.AbortWithError(http.StatusBadRequest, err)
	}
}

model

func SaveBook(book Book) Book {
	// persist the data
	utility.DB().Create(&book)
	return book
}

Establishing the DB connection

func DB() *gorm.DB {
	dbInfo := fmt.Sprintf(
		"host=%s user=%s dbname=%s sslmode=disable password=%s",
		os.Getenv("DB_HOST"),
		os.Getenv("DB_USER"),
		os.Getenv("DB_NAME"),
		os.Getenv("DB_PASSWORD"),
	)
	db, err := gorm.Open("postgres", dbInfo)
	if err != nil {
		log.Fatal(err)
	}
	return db
}

View


<body class="container">
{{ template "menu.html" . }}
<label>保存成功</label>
<h1>{{.payload.Title}}</h1>
<p>{{.payload.Author}}</p>
{{ template "footer.html" .}}
</body>

Tests

func TestSaveBook(t *testing.T) {
	r := utility.GetRouter(true)
	r.POST("/book", SaveBook)

	Convey("The params can not convert to model book", t, func() {
		req, _ := http.NewRequest("POST", "/book", nil)
		req.Header.Set("Content-Type", "application/x-www-form-urlencoded")

		utility.TestHTTPResponse(r, req, func(w *httptest.ResponseRecorder) {
			So(w.Code, ShouldEqual, http.StatusBadRequest)
		})
	})

	Convey("The params can convert to model book", t, func() {
		req, _ := http.NewRequest("POST", "/book", strings.NewReader("title=Hello world&author=will"))
		req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
		utility.TestHTTPResponse(r, req, func(w *httptest.ResponseRecorder) {
			p, _ := ioutil.ReadAll(w.Body)
			So(w.Code, ShouldEqual, http.StatusOK)
			So(string(p), ShouldContainSubstring, "保存成功")
		})
	})
}

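Summary

The vitality of the Go ecosystem has been an eye-opener for me; well-known applications such as Docker and Ethereum are written in Go. Doing web development in Go feels like building with blocks: each suitable third-party library has to be carefully picked from several candidates, and a multitude of open-source authors have together built a kingdom of "modules". In such an environment, programming becomes a very free activity. Because Go's standard toolchain provides many handy built-in commands such as go fmt and go test, programming becomes remarkably easy; it is practically heaven for perfectionist programmers.
Of course, Go is still evolving and has its rough edges, for example:

  • No standard dependency management tool (dep is under development)

  • Decentralized dependency repositories mean an application can become unusable when one of its dependencies is deleted.

Follow my WeChat official account for more notes to come:
whisperd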

A simple LRU implementation in Go

lys86_1205 posted an article • 0 comments • 337 views • 2017-07-27 14:02 • from related topics

LRU
