GoReporter, version 3

fiisio published an article • 0 comments • 162 views • 3 days ago • from related topics

The third release of GoReporter reworks the report pages: categorization is clearer and more display models are included. It can be used as a white-box testing aid, a code-review assistant, or a best-practice evaluation tool.


Everyone is welcome to use it, suggest improvements, or help complete its features!
https://github.com/360EntSecGroup-Skylar/goreporter

go-excel: a home-grown library for reading Excel like a relational DB

yhf_szb published an article • 0 comments • 233 views • 2017-09-18 17:10 • from related topics

In complex systems (games, for example), Excel is sometimes used as a lightweight relational database or configuration file so that non-developers (such as game designers) can maintain settings; for many of them, filling in an Excel sheet is much easier than writing JSON or YAML.


Excel also supports rich formatting, fonts, and highlighted cells, which greatly lowers maintenance cost.


In this scenario, reading data in a specific format (tables that follow relational-database conventions) matters more than fancy Excel-writing features; for editing, Microsoft Excel itself is already powerful enough. The existing Excel libraries I found are all far more capable than I need, which feels wasteful, so I wrote this simplified library.


This library borrows part of the implementation and reading logic of tealeg/xlsx.


Suppose there is an xlsx file containing a sheet named "Standard" with the following data:

ID  NameOf  Age  Slice    UnmarshalString
1   Andy    1    1|2      {"Foo":"Andy"}
2   Leo     2    2|3|4    {"Foo":"Leo"}
3   Ben     3    3|4|5|6  {"Foo":"Ben"}
4   Ming    4    1        {"Foo":"Ming"}


  • Row 0 is the header row.

  • Data rows start from row 1.


The simplest usage looks like this:


package main

import (
    "encoding/json"

    excel "github.com/szyhf/go-excel"
)

// Define a struct that maps onto the sheet.
type Standard struct {
    // The field name is used as the column name by default.
    ID int
    // `column` maps the field to a column with a different name.
    Name string `xlsx:"column(NameOf)"`
    // A column can be mapped to more than one field.
    NamePtr *string `xlsx:"column(NameOf)"`
    // `column` can be omitted if only the column name is needed; this is equal to `column(AgeOf)`.
    Age int `xlsx:"AgeOf"`
    // `split` splits the cell content into a slice on the given separator `|`.
    Slice []int `xlsx:"split(|)"`
    // *Temp implements `encoding.BinaryUnmarshaler`.
    Temp *Temp `xlsx:"column(UnmarshalString)"`
    // Use '-' to ignore a field.
    Ignored string `xlsx:"-"`
}

// func (this Standard) GetXLSXSheetName() string {
//     return "Some other sheet name if needed"
// }

type Temp struct {
    Foo string
}

// UnmarshalBinary is a custom unmarshaler for the cell string.
func (this *Temp) UnmarshalBinary(d []byte) error {
    return json.Unmarshal(d, this)
}

func main() {
    // The sheet name is assumed to be "Standard", taken from the struct name.
    var stdList []Standard
    err := excel.UnmarshalXLSX("./testdata/simple.xlsx", &stdList)
    if err != nil {
        panic(err)
    }
}

More complex reading logic is also provided; see the documentation for details: https://github.com/szyhf/go-excel


Given limited time, the code under the test directory is probably the easiest way to understand the details...



Bug reports are welcome.


A curl-style HTTP request library for Go

mikemintang published an article • 0 comments • 340 views • 2017-09-15 07:55 • from related topics

GitHub repository


https://github.com/mikemintang/go-curl


Installation


go get github.com/mikemintang/go-curl

Usage


package main

import (
    "fmt"

    "github.com/mikemintang/go-curl"
)

func main() {

    url := "http://php.dev/api.php"

    headers := map[string]string{
        "User-Agent":    "Sublime",
        "Authorization": "Bearer access_token",
        "Content-Type":  "application/json",
    }

    cookies := map[string]string{
        "userId":    "12",
        "loginTime": "15045682199",
    }

    queries := map[string]string{
        "page": "2",
        "act":  "update",
    }

    postData := map[string]interface{}{
        "name":      "mike",
        "age":       24,
        "interests": []string{"basketball", "reading", "coding"},
        "isAdmin":   true,
    }

    // chained calls
    req := curl.NewRequest()
    resp, err := req.
        SetUrl(url).
        SetHeaders(headers).
        SetCookies(cookies).
        SetQueries(queries).
        SetPostData(postData).
        Post()

    if err != nil {
        fmt.Println(err)
    } else {
        if resp.IsOk() {
            fmt.Println(resp.Body)
        } else {
            fmt.Println(resp.Raw)
        }
    }

}
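For comparison, here is roughly the same request expressed with only the standard net/http package; a sketch of what such a chained client does under the hood (it reuses the URL, headers, cookies, query string, and JSON body from the example above and does not use any go-curl API):

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io/ioutil"
    "net/http"
)

func main() {
    // Same JSON payload as the chained example above.
    postData := map[string]interface{}{
        "name":      "mike",
        "age":       24,
        "interests": []string{"basketball", "reading", "coding"},
        "isAdmin":   true,
    }
    body, _ := json.Marshal(postData)

    // Query parameters are appended to the URL by hand here.
    req, err := http.NewRequest("POST", "http://php.dev/api.php?page=2&act=update", bytes.NewReader(body))
    if err != nil {
        panic(err)
    }
    req.Header.Set("User-Agent", "Sublime")
    req.Header.Set("Authorization", "Bearer access_token")
    req.Header.Set("Content-Type", "application/json")
    req.AddCookie(&http.Cookie{Name: "userId", Value: "12"})
    req.AddCookie(&http.Cookie{Name: "loginTime", Value: "15045682199"})

    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        fmt.Println(err)
        return
    }
    defer resp.Body.Close()

    respBody, _ := ioutil.ReadAll(resp.Body)
    fmt.Println(resp.StatusCode, string(respBody))
}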

api.php on the receiving end


<?php

//echo json_encode($_GET);            // dump the query parameters from the URL
//echo json_encode(getallheaders());  // dump the request headers
//echo json_encode($_COOKIE);         // dump the cookies
echo file_get_contents("php://input"); // dump the raw POST body

// Fallback for environments where getallheaders() is not built in.
if (!function_exists('getallheaders')) {
    function getallheaders() {
        $headers = [];
        foreach ($_SERVER as $name => $value) {
            if (substr($name, 0, 5) == 'HTTP_') {
                $headers[str_replace(' ', '-', ucwords(strtolower(str_replace('_', ' ', substr($name, 5)))))] = $value;
            }
        }
        return $headers;
    }
}

Exported fields and methods



TodoList



  • [x] Send requests with chained calls

  • [ ] Send requests with a callback function

  • [ ] Send requests in a jQuery-Ajax-like style

  • [x] Send GET/POST requests

  • [ ] Send PUT/PATCH/DELETE/OPTIONS requests

  • [x] Submit POST data as application/x-www-form-urlencoded

  • [x] Submit POST data as application/json

  • [ ] Submit POST data as multipart/form-data

  • [ ] Proxy support

Turbo: an alternative to grpc-gateway

zzxx513 published an article • 0 comments • 285 views • 2017-09-15 01:06 • from related topics

Reposted from: https://zhuanlan.zhihu.com/p/29350695


grpc-gateway is a very convenient tool: it makes it easy to expose gRPC interfaces over HTTP.


In practice, though, we ran into a number of problems with grpc-gateway, for example:


1. Limited flexibility: when you have special requirements, there is little room to extend grpc-gateway;


2. It depends heavily on protocol buffers, and requires protobuf 3;


3. Even if the gRPC interface stays the same, merely changing the HTTP interface definition forces you to regenerate code, redeploy, and restart the service;


4. It only accepts JSON input, not traditional key-value parameters;


5. It only supports gRPC. Well, that isn't really a flaw, but Thrift is widely used too, isn't it?


6. grpc-gateway is not very mature in areas such as error handling, and its developers don't seem very active...


Turbo tries hard to solve the problems listed above. Here is the project:


vaporz/turbo


And here is the documentation, detailed and considerate, in both English and Chinese:


Turbo Documentation


Besides basic HTTP proxying similar to grpc-gateway, Turbo can also:


1. Be highly flexible: it provides aspect-oriented components ("aspect" as in AOP) that let you customize every stage;


2. Depend only on gRPC, with no requirement on protocol buffers, so you can use either protobuf 2 or protobuf 3;


3. Let HTTP interface definitions, and their mappings to backend interfaces, be modified at runtime and take effect immediately;


4. Accept not only JSON input but also traditional key-value input;


5. Support not only gRPC but also Thrift;


6. Ship with a command-line tool: create a runnable project or regenerate code with a single command.


Turbo is still in its infancy, but the current code has been tested carefully; thorough test cases bring coverage to 98%.


Of course, test coverage by itself doesn't prove much; only battle-tested code is truly reliable!


So please try it out and share your complaints. For any suggestion or idea, open an issue on GitHub; I'll be waiting.


If you run into any problem, I will do my best to help and to fix it!


Thanks!

GrapeNet, a lightweight network library written in Go

koangel published an article • 0 comments • 468 views • 2017-08-20 11:36 • from related topics


Introduction


grapeNet is a lightweight and easy-to-use network framework written in Go.


It can be used for game servers, network-intensive servers, and similar applications. Each module can be taken out and used on its own, with little internal coupling.


Go already has several full-blown frameworks, such as GoWorld, and they are quite capable. But I use this library for lightweight applications that don't need that much complexity, so I designed GrapeNet with module independence as the goal: you can pull out and use just one small module, or combine the modules into a complete server, and it is easy to fit into your own architecture. As for hot reloading, script data currently reloads automatically (just run UPDATE); hot reload of the program itself will be released after further testing (Linux only).


This library is more like a collection of lightweight tools for day-to-day server development. Enjoy.


It is still being filled out and has many rough spots, so it is not yet suitable for commercial projects.


Personal blog: http://grapec.me/



Installation


go get -u github.com/koangel/grapeNet...


Modules



  • Lua script binding manager (binds functions of any type; thread-safe, with automatic type inference)

  • Logging library (built on Seelog)

  • Function manager (bind functions to keys of any type and dispatch calls between them)

  • Stream processing

  • TCP networking

  • WebSocket networking (basic)

  • Codec (register objects of any type and create them dynamically elsewhere)

  • CSV serialization module (serialize CSV into objects, or objects into CSV, via struct tags)


Third-party dependencies



  • Seelog (github.com/cihub/seelog)

  • Gopher-lua(github.com/yuin/gopher-lua)

  • Gopher-luar(layeh.com/gopher-luar)


No CGO is involved anywhere; the Lua runtime itself is also pure Go.

Introducing Badger: A fast key-value store written purely in Go

chenxu published an article • 0 comments • 249 views • 2017-08-18 17:52 • from related topics


We have built an efficient and persistent log structured merge (LSM) tree based key-value store, purely in Go language. It is based upon WiscKey paper included in USENIX FAST 2016. This design is highly SSD-optimized and separates keys from values to minimize I/O amplification; leveraging both the sequential and the random performance of SSDs.


We call it Badger. Based on benchmarks, Badger is at least 3.5x faster than RocksDB when doing random reads. For value sizes between 128B to 16KB, data loading is 0.86x - 14x faster compared to RocksDB, with Badger gaining significant ground as value size increases. On the flip side, Badger is currently slower for range key-value iteration, but that has a lot of room for optimization.


Background and Motivation


Word about RocksDB


RocksDB is the most popular and probably the most efficient key-value store in the market. It originated in Google as SSTable which formed the basis for Bigtable, then got released as LevelDB. Facebook then improved LevelDB to add concurrency and optimizations for SSDs and released that as RocksDB. Work on RocksDB has been continuously going on for many years now, and it’s used in production at Facebook and many other companies.


So naturally, if you need a key-value store, you’d gravitate towards RocksDB. It’s a solid piece of technology, and it works. The biggest issue with using RocksDB is that it is written in C++, which requires Cgo to call it from Go.


Cgo: The necessary evil


At Dgraph, we have been using RocksDB via Cgo since we started. And we’ve faced many issues over time due to this dependency. Cgo is not Go, but when there are better libraries in C++ than Go, Cgo is a necessary evil.


The problem is, Go CPU profiler doesn’t see beyond Cgo calls. Go memory profiler takes it one step further. Forget about giving you memory usage breakdown in Cgo space, Go memory profiler fails to even notice the presence of Cgo code. Any memory used by Cgo would not even make it to the memory profiler. Other tools like Go race detector, don’t work either.


Cgo has caused us pthread_create issues in Go1.4, and then again in Go1.5, due to a bug regression. Lightweight goroutines become expensive pthreads when Cgo is involved, and we had to modify how we were writing data to RocksDB to avoid assigning too many goroutines.


Cgo has caused us memory leaks. Who owns and manages memory when making calls is just not clear. Go and C are at opposite ends of the spectrum: one doesn’t let you free memory, the other requires it. So, you make a Go call, but then forget to Free(), and nothing breaks. Except much later.


Cgo has given us unmaintainable code. Cgo makes code ugly. The Cgo layer around RocksDB was the one piece of code no one in the team wanted to touch.


Surely, we fixed the memory leaks in our API usage over time. In fact, I think we have fixed them all by now, but I can’t be sure. Go memory profiler would never tell you. And every time someone complains about Dgraph taking up more memory or crashing due to OOM, it makes me nervous that this is a memory leak issue.


Huge undertaking


Everyone I told about our woes with Cgo, told me that we should just work on fixing those issues. Writing a key-value store which can provide the same performance as RocksDB is a huge undertaking, not worth our effort. Even my team wasn’t sure. I had my doubts as well.


I have great respect for any piece of technology which has been iterated upon by the smartest engineers on the face of the planet for years. RocksDB is that. And if I was writing Dgraph in C++, I’d happily use it.



But, I just hate ugly code.



And I hate recurring bugs. No amount of effort would have ensured that we would no longer have any more issues with using RocksDB via Cgo. I wanted a clean slate, and my profiler tools back. Building a key-value store in Go from scratch was the only way to achieve it.


I looked around. The existing key-value stores written in Go didn’t even come close to RocksDB’s performance. And that’s a deal breaker. You don’t trade performance for cleanliness. You demand both.


So, I decided we will replace our dependency on RocksDB, but given this isn’t a priority for Dgraph, none of the team members should work on it. This would be a side project that only I will undertake. I started reading up about B+ and LSM trees, recent improvements to their design, and came across WiscKey paper. It had great promising ideas. I decided to spend a month away from core Dgraph, building Badger.


That’s not how it went. I couldn’t spend a month away from Dgraph. Between all the founder duties, I couldn’t fully dedicate time to coding either. Badger developed during my spurts of coding activity, and one of the team members’ part-time contributions. Work started end January, and now I think it’s in a good state to be trialed by the Go community.


LSM trees


Before we delve into Badger, let’s understand key-value store designs. They play an important role in data-intensive applications including databases. Key-value stores allow efficient updates, point lookups and range queries.


There are two popular types of implementations: Log-structured merge (LSM) tree based, and B+ tree based. The main advantage LSM trees have is that all the foreground writes happen in memory, and all background writes maintain sequential access patterns. Thus they achieve a very high write throughput. On the other hand, small updates on B+-trees involve repeated random disk writes, and hence are unable to maintain a high-throughput write workload [1].


To deliver high write performance, LSM-trees batch key-value pairs and write them sequentially. Then, to enable efficient lookups, LSM-trees continuously read, sort and write key-value pairs in the background. This is known as a compaction. LSM-trees do this over many levels, each level holding a factor more data than the previous, typically size of Li+1 = 10 x size of Li.


Within a single level, the key-values get written into files of fixed size, in a sorted order. Except level zero, all other levels have zero overlaps between keys stored in files at the same level.


Each level has a maximum capacity. As a level Li fills up, its data gets merged with data from lower level Li+1 and files in Li deleted to make space for more incoming data. As data flows from level zero to level one, two, and so on, the same data is re-written multiple times throughout its lifetime. Each key update causes many writes until data eventually settles. This constitutes write amplification. For a 7 level LSM tree, with 10x size increase factor, this can be 60; 10 for each transition from L1->L2, L2->L3, and so on, ignoring L0 due to special handling.


Conversely, to read a key from LSM tree, all the levels need to be checked. If present in multiple levels, the version of key at level closer to zero is picked (this version is more up to date). Thus, a single key lookup causes many reads over files, this constitutes read amplification. WiscKey paper estimates this to be 336 for a 1-KB key-value pair.


LSMs were designed around hard drives. In HDDs, random I/Os are over 100x slower than sequential ones. Thus, running compactions to continually sort keys and enable efficient lookups is an excellent trade-off.


NVMe SSD Samsung 960 pro


However, SSDs are fundamentally different from HDDs. The difference between their sequential and random reads are not nearly as large as HDDs. In fact, top of the line SSDs like Samsung 960 Pro can provide 440K random read operations per second, with 4KB block size. Thus, an LSM-tree that performs a large number of sequential writes to reduce later random reads is wasting bandwidth needlessly.


Badger


Badger is a simple, efficient, and persistent key-value store. Inspired by the simplicity of LevelDB, it provides Get, Set, Delete, and Iterate functions. On top of it, it adds CompareAndSet and CompareAndDelete atomic operations (see GoDoc). It does not aim to be a database and hence does not provide transactions, versioning or snapshots. Those things can be easily built on top of Badger.


Badger separates keys from values. The keys are stored in LSM tree, while the values are stored in a write-ahead log called the value log. Keys tend to be smaller than values. Thus this set up produces much smaller LSM trees. When required, the values are directly read from the log stored on SSD, utilizing its vastly superior random read performance.


Guiding principles


These are the guiding principles that decide the design, what goes in and what doesn’t in Badger.



  • Write it purely in Go language.

  • Use the latest research to build the fastest key-value store.

  • Keep it simple, stupid.

  • SSD-centric design.


Key-Value separation


The major performance cost of LSM-trees is the compaction process. During compactions, multiple files are read into memory, sorted, and written back. Sorting is essential for efficient retrieval, for both key lookups and range iterations. With sorting, the key lookups would only require accessing at most one file per level (excluding level zero, where we’d need to check all the files). Iterations would result in sequential access to multiple files.


Each file is of fixed size, to enhance caching. Values tend to be larger than keys. When you store values along with the keys, the amount of data that needs to be compacted grows significantly.


In Badger, only a pointer to the value in the value log is stored alongside the key. Badger employs delta encoding for keys to reduce the effective size even further. Assuming 16 bytes per key and 16 bytes per value pointer, a single 64MB file can store two million key-value pairs.


Write Amplification


Thus, the LSM tree generated by Badger is much smaller than that of RocksDB. This smaller LSM-tree reduces the number of levels, and hence number of compactions required to achieve stability. Also, values are not moved along with keys, because they’re elsewhere in value log. Assuming 1KB value and 16 byte keys, the effective write amplification per level is (10*16 + 1024)/(16 + 1024) ~ 1.14, a much smaller fraction.
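As a sanity check, the two write-amplification figures quoted in this post (about 60 for a classic 7-level LSM tree, and roughly 1.14 per level for Badger's key-plus-pointer layout) can be reproduced in a few lines of Go; every constant below is one of the article's own assumptions:

package main

import "fmt"

func main() {
    // Numbers taken from the article: 7 levels, 10x size factor,
    // 16-byte keys, 16-byte value pointers, 1KB values.
    const (
        levels    = 7
        factor    = 10.0
        keySize   = 16.0
        valueSize = 1024.0
    )

    // Classic LSM: the full key+value is rewritten ~10x at each of the
    // six level transitions (L1->L2 ... L6->L7), ignoring L0.
    classic := float64(levels-1) * factor
    fmt.Printf("classic LSM write amplification ~= %.0f\n", classic) // 60

    // Badger: only keys and value pointers live in the LSM tree; the
    // 1KB value is written once to the value log.
    perLevel := (factor*keySize + valueSize) / (keySize + valueSize)
    fmt.Printf("Badger per-level write amplification ~= %.2f\n", perLevel) // ~1.14
}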


You can see the performance gains of this approach compared to RocksDB as the value size increases; where loading data to Badger takes factors less time (see Benchmarks below).


Read Amplification


As mentioned above, the size of LSM tree generated by Badger is much smaller. Each file at each level stores lots more keys than typical LSM trees. Thus, for the same amount of data, fewer levels get filled up. A typical key lookup requires reading all files in level zero, and one file per level from level one and onwards. With Badger, filling fewer levels means, fewer files need to be read to lookup a key. Once key (along with value pointer) is fetched, the value can be fetched by doing random read in value log stored on SSD.


Furthermore, during benchmarking, we found that Badger’s LSM tree is so small, it can easily fit in RAM. For 1KB values and 75 million 22 byte keys, the raw size of the entire dataset is 72 GB. Badger’s LSM tree size for this setup is a mere 1.7GB, which can easily fit into RAM. This is what causes Badger’s random key lookup performance to be at least 3.5x faster, and Badger’s key-only iteration to be blazingly faster than RocksDB.


Crash resilience


LSM trees write all the updates in memory first in memtables. Once they fill up, memtables get swapped over to immutable memtables, which eventually get written out to files in level zero on disk.


In the case of a crash, all the recent updates still in memory tables would be lost. Key-value stores deal with this issue, by first writing all the updates in a write-ahead log. Badger has a write-ahead log, it’s called value log.


Just like a typical write-ahead log, before any update is applied to LSM tree, it gets written to value log first. In the case of a crash, Badger would iterate over the recent updates in value log, and apply them back to the LSM tree.


Instead of iterating over the entire value log, Badger puts a pointer to the latest value in each memtable. Effectively, the latest memtable which made its way to disk would have a value pointer, before which all the updates have already made their way to disk. Thus, we can replay from this pointer onwards, and reapply all the updates to LSM tree to get all our updates back.


Overall size


RocksDB applies block compression to reduce the size of LSM tree. Badger’s LSM tree is much smaller in comparison and can be stored in RAM entirely, so it doesn’t need to do any compression on the tree. However, the size of value log can grow quite quickly. Each update is a new entry in the value log, and therefore multiple updates for the same key take up space multiple times.


To deal with this, Badger does two things. It allows compressing values in value log. Instead of compressing multiple key-values together, we only compress each key-value individually. This provides the best possible random read performance. The client can set it so compression is only done if the key-value size is over an adjustable threshold, set by default to 1KB.


Secondly, Badger runs value garbage collection. This runs periodically and samples 100MB of a randomly selected value log file. It checks whether at least a significant chunk of it should be discarded, due to newer updates in later logs. If so, the valid key-value pairs are appended to the log, the older file is discarded, and the value pointers are updated in the LSM tree. The downside is that this adds more work for the LSM tree, so it shouldn’t be run while loading a huge data set. More work is required so that this garbage collection is only triggered during periods of little client activity.


Hardware Costs


But, given the fact that SSDs are getting cheaper and cheaper, using extra space in SSD is almost nothing compared to having to store and serve a major chunk of LSM tree from memory. Consider this:


For 1KB values, 75 million 16 byte keys, RocksDB’s LSM tree is 50GB in size. Badger’s value log is 74GB (without value compression), and LSM tree is 1.7GB. Extrapolating it three times, we get 225 million keys, RocksDB size of 150GB and Badger size of 222GB value log, and 5.1GB LSM tree.


Using Amazon AWS US East (Ohio) datacenter:



  • To achieve a random read performance equivalent of Badger (at least 3.5x faster), RocksDB would need to be run on an r3.4xlarge instance, which provides 122 GB of RAM for $1.33 per hour; so most of its LSM tree can fit into memory.

  • Badger can be run on the cheapest storage optimized instance i3.large, which provides 475GB NVMe SSD (fio test: 100K IOPS for 4KB block size), with 15.25GB RAM for $0.156 per hour.

  • The cost of running Badger is thus, 8.5x cheaper than running RocksDB on EC2, on-demand.

  • Going with a 1-year all-upfront term, this is $6182 for RocksDB vs. $870 for Badger, still 7.1x cheaper. That’s a whopping 86% saving.


Benchmarks


Setup


We rented a storage optimized i3.large instance from Amazon AWS, which provides 450GB NVMe SSD storage, 2 virtual cores along with 15.25GB RAM. This instance provides local SSD, which we tested via fio to sustain close to 100K random read IOPS for 4KB block sizes.


The data sets were chosen to generate sizes too big to fit entirely in RAM, in either RocksDB or Badger.


Value size   Number of keys (each key = 22B)   Raw data size
128B         250M                              35GB
1024B        75M                               73GB
16KB         5M                                76GB

We then loaded data one by one, first in RocksDB then in Badger, never running the loaders concurrently. This gave us the data loading times and output sizes. For random Get and Iterate, we used Go benchmark tests and ran them for 3 minutes, going down to 1 minute for 16KB values.


All the code for benchmarking is available in this repo. All the commands that were run, and their recorded measurements, are available in this log file. The charts and their data are viewable here.


Results


In the following benchmarks, we measured 4 things:



  • Data loading performance

  • Output size

  • Random key lookup performance (Get)

  • Sorted range iteration performance (Iterate)


All the 4 measurements are visualized in the following charts. (Chart: Badger benchmarks.)


Data loading performance: Badger’s key-value separation design shows huge performance gains as value sizes increase. For value sizes of 1KB and 16KB, Badger achieves 4.5x and 11.7x more throughput than RocksDB. For smaller values, like 16 bytes not shown here, Badger can be 2-3x slower, due to slower compactions (see further work).


Store size: Badger generates much smaller LSM tree, but a larger value size log. The size of Badger’s LSM tree is proportional only to the number of keys, not values. Thus, Badger’s LSM tree decreases in size as we progress from 128B to 16KB. In all three scenarios, Badger produced an LSM tree which could fit entirely in RAM of the target server.


Random read latency: Badger’s Get latency is only 18% to 27% of RocksDB’s Get latency. In our opinion, this is the biggest win of this design. This happens because Badger’s entire LSM tree can fit into RAM, significantly decreasing the amount of time it takes to find the right tables, check their bloom filters, pick the right blocks and retrieve the key. Value retrieval is then a single SSD file.pread away.


In contrast, RocksDB can’t fit the entire tree in memory. Even assuming it can keep the table index and bloom filters in memory, it would need to fetch the entire blocks from disk, decompress them, then do key-value retrieval (Badger’s smaller LSM tree avoids the need for compression). This obviously takes longer, and given lack of data access locality, caching isn’t as effective.


Range iteration latency: Badger’s range iteration is significantly slower than RocksDB’s range iteration, when values are also retrieved from SSD. We didn’t expect this, and still don’t quite understand it. We expected some slowdown due to the need to do IOPS on SSD, while RocksDB does purely serial reads. But, given the 100K IOPS i3.large instance is capable of, we didn’t even come close to using that bandwidth, despite pre-fetching. This needs further work and investigation.


On the other end of the spectrum, Badger’s key-only iteration is blazingly fast compared to both RocksDB and Badger’s own key-value iteration (its latency is shown by the almost invisible red bar). This is quite useful in certain use cases we have at Dgraph, where we iterate over the keys, run filters and only retrieve values for a much smaller subset of keys.


Further work


Speed of range iteration


While Badger can do key-only iteration blazingly fast, things slow down when it also needs to do value lookups. Theoretically, this shouldn’t be the case. Amazon’s i3.large disk optimized instance can do 100,000 4KB block random reads per second. Based on this, we should be able to iterate 100K key-value pairs per second, in other words six million key-value pairs per minute.


However, Badger’s current implementation doesn’t produce SSD random read requests even close to this limit, and the key-value iteration suffers as a result. There’s a lot of room for optimization in this space.


Speed of compactions


Badger is currently slower when it comes to running compactions compared to RocksDB. Due to this, for a dataset purely containing smaller values, it is slower to load data to Badger. This needs more optimization.


LSM tree compression


Again in a dataset purely containing smaller values, the size of LSM tree would be significantly larger than RocksDB because Badger doesn’t run compression on LSM tree. This should be easy to add on if needed, and would make a great first-time contributor project.


B+ tree approach


1 Recent improvements to SSDs might make B+-trees a viable option. Since WiscKey paper was written, SSDs have made huge gains in random write performance. A new interesting direction would be to combine the value log approach, and keep only keys and value pointers in the B+-tree. This would trade LSM tree read-sort-merge sequential write compactions with many random writes per key update and might achieve the same write throughput as LSM for a much simpler design.


Conclusion


We have built an efficient key-value store which can compete in performance against the top key-value stores on the market. It is currently rough around the edges, but it provides a solid platform for any industrial application, be it data storage or building another database.


We will be replacing Dgraph’s dependency on RocksDB soon with Badger; making our builds easier, faster, making Dgraph cross-platform and paving the way for embeddable Dgraph. The biggest win of using Badger is a performant Go native key-value store. The nice side-effects are ~4 times faster Get and a potential 86% reduction in AWS bills, due to less reliance on RAM and more reliance on ever faster and cheaper SSDs.


So try out Badger in your project, and let us know your experience.


P.S. Special thanks to Sanjay Ghemawat and Lanyue Lu for responding to my questions about design choices.






**We are building an open source, real time, horizontally scalable and distributed graph database.**

Get started with Dgraph. [https://docs.dgraph.io](https://docs.dgraph.io)
See our live demo. [https://dgraph.io](https://dgraph.io)
Star us on Github. [https://github.com/dgraph-io/dgraph](https://github.com/dgraph-io/dgraph)
Ask us questions. [https://discuss.dgraph.io](https://discuss.dgraph.io)


**We're starting to support enterprises in deploying Dgraph in production. [Talk to us](manish@dgraph.io), if you want us to help you try out Dgraph at your organization.**




*Top image: Juno spacecraft is the [fastest moving human made object](http://www.livescience.com/326 ... r.html), traveling at a speed of 265,000 kmph relative to Earth.*

A simple LRU implemented in Go

lys86_1205 published an article • 0 comments • 296 views • 2017-07-27 14:02 • from related topics

LRU
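For reference, a simple LRU cache in Go is usually built from container/list plus a map; the sketch below only illustrates that common pattern and is not the code linked in the post:

package main

import (
    "container/list"
    "fmt"
)

// LRU is a minimal least-recently-used cache: a doubly linked list keeps
// recency order, and a map gives O(1) lookup of list elements.
type LRU struct {
    cap   int
    ll    *list.List
    items map[string]*list.Element
}

type entry struct {
    key   string
    value interface{}
}

func New(cap int) *LRU {
    return &LRU{cap: cap, ll: list.New(), items: make(map[string]*list.Element)}
}

func (c *LRU) Get(key string) (interface{}, bool) {
    if el, ok := c.items[key]; ok {
        c.ll.MoveToFront(el) // mark as most recently used
        return el.Value.(*entry).value, true
    }
    return nil, false
}

func (c *LRU) Put(key string, value interface{}) {
    if el, ok := c.items[key]; ok {
        c.ll.MoveToFront(el)
        el.Value.(*entry).value = value
        return
    }
    el := c.ll.PushFront(&entry{key, value})
    c.items[key] = el
    if c.ll.Len() > c.cap {
        oldest := c.ll.Back() // evict the least recently used entry
        c.ll.Remove(oldest)
        delete(c.items, oldest.Value.(*entry).key)
    }
}

func main() {
    c := New(2)
    c.Put("a", 1)
    c.Put("b", 2)
    c.Get("a")
    c.Put("c", 3) // evicts "b"
    _, ok := c.Get("b")
    fmt.Println(ok) // false
}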

A few words about utility functions in Go

taowen published an article • 0 comments • 466 views • 2017-07-09 23:13 • from related topics


Goal: support functions like the following in Go


func Max(collection ...interface{}) interface{}

Problem: if this is implemented with reflection, efficiency becomes an issue.


Solution: json.Unmarshal is itself implemented with reflection, yet jsoniter achieved a 6x speedup by using unsafe.Pointer together with cached decoders. So I tried the same technique and wrote a proof-of-concept prototype: https://github.com/v2pro/wombat


The API looks roughly like this:


import (
    "testing"

    "github.com/stretchr/testify/require"
    "github.com/v2pro/plz"
)

func Test_max_min(t *testing.T) {
    should := require.New(t)
    should.Equal(3, plz.Max(1, 3, 2))
    should.Equal(1, plz.Min(1, 3, 2))

    type User struct {
        Score int
    }
    should.Equal(User{3}, plz.Max(
        User{1}, User{3}, User{2},
        "Score"))
}

The underlying idea is to extract an unsafe.Pointer from the interface{}, then use an Accessor to read the concrete value. An Accessor corresponds to a type rather than to a value; think of it as type.GetIntValue(interface{}). Java's reflection API supports this, but Go does not expose such an API. With an Accessor we can compute the whole task once up front and cache it, so the runtime cost is roughly that of a virtual function call.
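To make the "extract an unsafe.Pointer from interface{}" step concrete, here is a minimal sketch of the eface trick that jsoniter-style code relies on. The struct below mirrors a Go runtime internal (not a stable public API) and is shown only to illustrate the idea; it is not wombat's actual code:

package main

import (
    "fmt"
    "unsafe"
)

// eface mirrors the two-word runtime layout of an empty interface:
// a type word and a data word. Internal detail, illustration only.
type eface struct {
    rtype unsafe.Pointer
    data  unsafe.Pointer
}

// dataPtr returns the data word of an interface{} value.
func dataPtr(v interface{}) unsafe.Pointer {
    return (*eface)(unsafe.Pointer(&v)).data
}

func main() {
    x := 42
    // A non-pointer value boxed into interface{} is copied, and the data
    // word points at that copy, so we can read it back directly.
    p := dataPtr(x)
    fmt.Println(*(*int)(p)) // 42
}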


The Accessor interface definition:


type Accessor interface {
    // === static ===
    fmt.GoStringer
    Kind() Kind
    // map
    Key() Accessor
    // array/map
    Elem() Accessor
    // struct
    NumField() int
    Field(index int) StructField
    // array/struct
    RandomAccessible() bool
    New() (interface{}, Accessor)

    // === runtime ===
    IsNil(ptr unsafe.Pointer) bool
    // variant
    VariantElem(ptr unsafe.Pointer) (elem unsafe.Pointer, elemAccessor Accessor)
    InitVariant(ptr unsafe.Pointer, template Accessor) (elem unsafe.Pointer, elemAccessor Accessor)
    // map
    MapIndex(ptr unsafe.Pointer, key unsafe.Pointer) (elem unsafe.Pointer)   // only when random accessible
    SetMapIndex(ptr unsafe.Pointer, key unsafe.Pointer, elem unsafe.Pointer) // only when random accessible
    IterateMap(ptr unsafe.Pointer, cb func(key unsafe.Pointer, elem unsafe.Pointer) bool)
    FillMap(ptr unsafe.Pointer, cb func(filler MapFiller))
    // array/struct
    ArrayIndex(ptr unsafe.Pointer, index int) (elem unsafe.Pointer) // only when random accessible
    IterateArray(ptr unsafe.Pointer, cb func(index int, elem unsafe.Pointer) bool)
    FillArray(ptr unsafe.Pointer, cb func(filler ArrayFiller))
    // primitives
    Skip(ptr unsafe.Pointer) // when the value is not needed
    String(ptr unsafe.Pointer) string
    SetString(ptr unsafe.Pointer, val string)
    Bool(ptr unsafe.Pointer) bool
    SetBool(ptr unsafe.Pointer, val bool)
    Int(ptr unsafe.Pointer) int
    SetInt(ptr unsafe.Pointer, val int)
    Int8(ptr unsafe.Pointer) int8
    SetInt8(ptr unsafe.Pointer, val int8)
    Int16(ptr unsafe.Pointer) int16
    SetInt16(ptr unsafe.Pointer, val int16)
    Int32(ptr unsafe.Pointer) int32
    SetInt32(ptr unsafe.Pointer, val int32)
    Int64(ptr unsafe.Pointer) int64
    SetInt64(ptr unsafe.Pointer, val int64)
    Uint(ptr unsafe.Pointer) uint
    SetUint(ptr unsafe.Pointer, val uint)
    Uint8(ptr unsafe.Pointer) uint8
    SetUint8(ptr unsafe.Pointer, val uint8)
    Uint16(ptr unsafe.Pointer) uint16
    SetUint16(ptr unsafe.Pointer, val uint16)
    Uint32(ptr unsafe.Pointer) uint32
    SetUint32(ptr unsafe.Pointer, val uint32)
    Uint64(ptr unsafe.Pointer) uint64
    SetUint64(ptr unsafe.Pointer, val uint64)
    Float32(ptr unsafe.Pointer) float32
    SetFloat32(ptr unsafe.Pointer, val float32)
    Float64(ptr unsafe.Pointer) float64
    SetFloat64(ptr unsafe.Pointer, val float64)
}

This Accessor can be used for many things. Besides the usual functional-programming utilities (map/filter/sorted/...), it can also implement a plz.Copy function:


func Copy(dst, src interface{}) error

Copy can be used in all kinds of object-binding scenarios:



  • Value copying between different Go types (struct <-> map conversion, pointer-aware)

  • JSON encoding and decoding

  • Copying an http.Request onto my struct

  • Copying sql rows onto my struct

  • Encoding/decoding of other protocols such as MySQL/Thrift/Redis


It could also be used to implement a plz.Validate function:


func Validate(obj interface{}) error

And, if possible, even bring the LINQ concept from .NET over:


func Query(obj interface{}, query string) (result interface{}, err error)

Of course the amount of work is huge, far more tedious than a JSON parsing library. Only a few proofs of concept have been implemented so far:



If you are interested, feel free to open an issue: https://github.com/v2pro/wombat/issues

A Go SDK for Baidu AI services

chenqinghe published an article • 0 comments • 456 views • 2017-07-07 19:07 • from related topics

A Go SDK built on top of the REST APIs provided by Baidu. It currently supports speech synthesis and speech recognition, with more features on the way. Project address: https://github....

RobotGo v0.45.0 released: adds process management and clipboard support


veni asked a question • 1 follower • 0 replies • 499 views • 2017-07-02 22:40 • from related topics

Go rendering PHP through php-cgi

alalmn published an article • 0 comments • 382 views • 2017-06-10 15:06 • from related topics


Go renders PHP through php-cgi.
Other work has forced me to put this project aside for now; anyone interested is welcome to take a look.
My QQ: 29295842
Git address: https://github.com/webxscan/gophp/tree/master/gophp


golang PHP cgi — GOphp: parse PHP source from Go and implement a mini PHP server


by


golang php cgi GitHub: https://github.com/webxscan/gophp
Blog: http://blog.csdn.net/webxscan/
By: 斗转星移, QQ: 29295842


Purpose


Implement a local PHP interpreter front end without relying on Apache or IIS,
which makes many custom extensions possible.
The software is only four days old, so there are still many imperfections; corrections are welcome.

RobotGo v0.44.0 released: desktop automation in Go


veni asked a question • 1 follower • 0 replies • 607 views • 2017-05-29 00:03 • from related topics

GoScience, an open-source Go project: a proxy server for downloading Sci-Hub academic papers

wwdyy published an article • 0 comments • 381 views • 2017-05-28 19:50 • from related topics

GitHub: https://github.com/GreatDanton/GoScience




  Sci-Hub was created by Alexandra Elbakyan, a 22-year-old from Kazakhstan. To break the paywalls that obstruct academic exchange, she combined her computer skills with a number of pirate sites to build Sci-Hub. Using accounts volunteered by researchers, Elbakyan pulled nearly 50 million documents from university libraries and shared them for free.


  Some Sci-Hub users already have free, legitimate access to papers, yet they still habitually use Sci-Hub, largely because it is so convenient.


  On the Sci-Hub site, a user only needs to enter an article's DOI or title to have a good chance of getting the full text. Most of the collection consists of previously published academic articles, and the database keeps growing: when someone searches for an article that has not been indexed yet, Sci-Hub pirates a copy and adds it to its own library.



  GoScience is a proxy server written in Go for downloading academic papers from Sci-Hub. It is especially useful when your institution or workplace blocks access to Sci-Hub but you still need scientific articles.

Sharing a small tool, Boast: how to track every HTTP request on the server side and replay it easily

dcb9 published an article • 0 comments • 533 views • 2017-03-31 16:17 • from related topics

Original post: http://blog.phpor.me/note/2017/03/31/track-and-replay-http-request.html
A client engineer says, "the xxx endpoint is broken, and I haven't touched my code at all." Backend developers hear this kind of complaint all the time, but how do we reproduce the problem? There are a few scenarios:


1. The backend tests it and finds nothing wrong


"It works fine on my side" — so you ask the client engineer to go through the steps again, and only after seeing the error with your own eyes do you accept that something is wrong. Then you start digging, and at that point that particular phone and the data on it become very important, because they are what reproduces the bug.


2. The backend tests it and the problem does show up


Now the backend goes off to fix the code, but every round of testing has to be driven from the client, and sometimes those steps are quite complicated, so a lot of time is spent there. After pulling out all the stops you finally find the cause: some piece of this user's data was abnormal.


Situations like these are common. The most painful part is having to make each request with exactly the same parameters that triggered the bug. Sometimes you know the parameters and can put them into a tool like Postman, but most of the time you don't know the corresponding parameters (the token, for example).


If we could track every request on the server side (endpoint, headers, body, plus the response headers and body), we could look up the exact request parameters and response and paste them straight into Postman. Even better would be a one-click replay, because we don't want to change the parameters; we just want to send the same request again while debugging the code.


I had used ngrok before, and its debug UI does exactly this, but I don't need its tunneling feature now. So I wrote my own small program that contains only these features:



  • Record every Request and Response that hits the API

  • Re-send any recorded Request with one click


How it works


HTTP client                        Boast                                      Web server
 | GET http://localhost:8080/      | record the request, reverse-proxy it     | Response 200 OK
 | -------------------------------> | ----------------------------------------> | ------┐
 |                                  |                                            |       |
 |                                  | record the response, forward it back      | <-----┘
 | <------------------------------- | <----------------------------------------- |

┌----------------------------------------------------------------------------┐
| url: http://localhost:8081 |
| ---------------------------------------------------------------------------|
| All Transactions ┌ - - - - - - - - - - - - - - - - - - - - - - - ┐ |
| ---------------------- | time: 10 hours ago Client: 127.0.0.1 | |
| |GET / 200 OK 100 ms | | | |
| ---------------------- | Request [ Replay ] | |
| | - - - - - - - - - - - - | |
| | GET http://localhost/ HTTP/1.1 | |
| | User-Agent: curl/7.51.0 | |
| | Accept: */* | |
| | | |
| | Response | |
| | - - - - - - - - - - - - | |
| | HTTP/1.1 200 OK | |
| | X-Server: HTTPLab | |
| | Date: Thu, 02 Mar 2017 02:25:27 GMT | |
| | Content-Length: 13 | |
| | Content-Type: text/plain; charset=utf-8 | |
| | | |
| | Hello, World | |
| └ - - - - - - - - - - - - - - - - - - - - - - - ┘ |
| |
└----------------------------------------------------------------------------┘
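A recording reverse proxy along these lines can be sketched with the standard library alone. This only illustrates the principle in the diagram above, not Boast's actual code; ModifyResponse requires Go 1.8+, and the upstream address is an assumed placeholder:

package main

import (
    "bytes"
    "io/ioutil"
    "log"
    "net/http"
    "net/http/httputil"
    "net/url"
)

func main() {
    // Upstream web server to forward to (assumed address).
    target, err := url.Parse("http://localhost:80")
    if err != nil {
        log.Fatal(err)
    }
    proxy := httputil.NewSingleHostReverseProxy(target)

    // Record the response on its way back to the client (Go 1.8+).
    proxy.ModifyResponse = func(resp *http.Response) error {
        body, err := ioutil.ReadAll(resp.Body)
        if err != nil {
            return err
        }
        resp.Body = ioutil.NopCloser(bytes.NewReader(body)) // put the body back for the client
        log.Printf("response %s, %d bytes", resp.Status, len(body))
        return nil
    }

    // Record the request, then hand it to the reverse proxy.
    handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        reqBody, _ := ioutil.ReadAll(r.Body)
        r.Body = ioutil.NopCloser(bytes.NewReader(reqBody)) // put the body back for the upstream
        log.Printf("request %s %s, %d bytes", r.Method, r.URL, len(reqBody))
        proxy.ServeHTTP(w, r)
    })

    log.Fatal(http.ListenAndServe(":8080", handler))
}

Storing these recorded pairs and exposing a replay button is then a matter of re-sending the saved request with the same method, headers, and body.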

I've been learning Go recently, so it was a good fit for this small program, which I named Boast. Spending a little time building this wheel makes reproducing bugs more proactive and convenient, and gets them fixed faster!


Boast project repository


Both the Go code and the front end were written while learning, so feedback and criticism are welcome.

Hasn't the documentation for beego 1.8 been updated yet?


kaixinmao asked a question • 1 follower • 0 replies • 792 views • 2017-03-23 10:31 • from related topics