go-mysql-elasticsearch: a handy tool for syncing a MySQL database to Elasticsearch

At present our company uses a synchronization component from the National University of Defense Technology, which largely meets our needs; go-mysql-elasticsearch is kept in reserve as a fallback.

强哥的博客 (Qiang Ge's blog) · Published 2019-10-08 14:31:17

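go-mysql-elasticsearch is a synchronization tool written by a developer in China. In our tests its main advantage was that it synchronizes insert, update, and delete operations from MySQL, after which the data can be queried in Elasticsearch. Its shortcomings (areas that still need work) are:
1. The logs are not very detailed, although they are enough for basic needs.
2. On first start it did not automatically sync the data that already existed in MySQL, so the initial import has to be handled separately, for example by rebuilding the index with a bulk import. (Note that the sample configuration below leaves the mysqldump option commented out.)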
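The manual initial import mentioned in point 2 can be done through the Elasticsearch bulk API, which accepts a newline-delimited JSON body sent to the `_bulk` endpoint. The following is a minimal sketch of how such a body could be built from rows already read out of MySQL. The function name, the `id` key column, and the sample rows are assumptions made for illustration; they are not part of go-mysql-elasticsearch.

```python
import json
from typing import Any, Dict, Iterable, List, Optional


def build_bulk_payload(
    index: str,
    rows: Iterable[Dict[str, Any]],
    id_field: str = "id",             # assumed primary-key column name
    doc_type: Optional[str] = None,   # only for older clusters that still use mapping types
) -> str:
    """Render rows as an Elasticsearch _bulk request body (NDJSON)."""
    lines: List[str] = []
    for row in rows:
        # Action line: index the document under the row's primary key.
        meta: Dict[str, Any] = {"_index": index}
        if doc_type is not None:
            meta["_type"] = doc_type
        meta["_id"] = str(row[id_field])
        lines.append(json.dumps({"index": meta}, ensure_ascii=False))
        # Source line: the row itself becomes the document body.
        # default=str turns values such as datetimes into strings.
        lines.append(json.dumps(row, ensure_ascii=False, default=str))
    if not lines:
        return ""
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical rows, as they might come from "SELECT id, name FROM sys_user".
sample_rows = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]
payload = build_bulk_payload("user", sample_rows, doc_type="novel")
```

The resulting string would be sent as the body of a POST request to `/_bulk` with the `Content-Type: application/x-ndjson` header. For large tables the rows should be read and sent in batches rather than in one request.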

Installing go-mysql-elasticsearch
Step 1: Install Go
yum install go
Step 2: Install godep
go get github.com/tools/godep
Step 3: Fetch the go-mysql-elasticsearch source
go get github.com/siddontang/go-mysql-elasticsearch
Step 4: Build go-mysql-elasticsearch
cd $GOPATH/src/github.com/siddontang/go-mysql-elasticsearch
make

Using go-mysql-elasticsearch
1. Edit the configuration file: vi river.toml

# MySQL address, user and password
# The user must have the replication privilege in MySQL.
# MySQL connection settings for the sync source
my_addr = "127.0.0.1:3306"
my_user = "root"
my_pass = "123456"
my_charset = "utf8"

# Set to true when Elasticsearch uses HTTPS
#es_https = false

# Elasticsearch address
es_addr = "192.168.100.90:9200"

# Elasticsearch user and password, which may be set by Shield, nginx, or X-Pack
es_user = ""
es_pass = ""

# Path to store data, like master.info.
# It must be set to support resuming sync from a breakpoint.
# TODO: support other storage, like etcd.
data_dir = "./var"

# Internal HTTP status address
stat_addr = "127.0.0.1:12800"

# Pseudo server ID, like a replication slave
server_id = 1001

# mysql or mariadb
flavor = "mysql"

# mysqldump execution path
# If not set or empty, mysqldump is ignored.
#mysqldump = "mysqldump"

# If we have no privilege to use mysqldump with --master-data,
# we must skip it.
#skip_master_data = false

# Minimal number of items to be inserted in one bulk request
bulk_size = 128

# Force a flush of the pending requests when fewer than bulk_size items have accumulated
flush_bulk_time = "200ms"

# Ignore tables without a primary key
skip_no_pk_table = false

# MySQL data source
[[source]]
schema = "zkbh_nbjd"

# Only the tables below will be synced into Elasticsearch.
# "t_[0-9]{4}" is a wildcard table format; you can use it if you have many sub tables, like table_0000 - table_1023.
# I don't think it is necessary to sync all tables in a database.
# Tables to sync; separate multiple tables with commas
tables = ["sys_user", "sys_log"]

# Below is for special rule mapping

# Very simple example
#
# desc t;
# +-------+--------------+------+-----+---------+-------+
# | Field | Type         | Null | Key | Default | Extra |
# +-------+--------------+------+-----+---------+-------+
# | id    | int(11)      | NO   | PRI | NULL    |       |
# | name  | varchar(256) | YES  |     | NULL    |       |
# +-------+--------------+------+-----+---------+-------+
#
# The table `t` will be synced to ES index `test` and type `t`

# Sync the sys_user table of the zkbh_nbjd database to the index user
[[rule]]
schema = "zkbh_nbjd"
table = "sys_user"
index = "user"
type = "novel"

# Wildcard table rule; the wildcard table must be in the source tables.
# All tables that match the wildcard format will be synced to ES index `test` and type `t`.
# In this example, all tables must have the same schema as the table `t` above.

# Sync the sys_log table of the zkbh_nbjd database to the index log
[[rule]]
schema = "zkbh_nbjd"
table = "sys_log"
index = "log"
type = "log"
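To make the rule mapping concrete, here is a small sketch of how a changed table could be resolved to its target index and type, including a wildcard table such as `t_[0-9]{4}`. It is not the tool's own matching code: the `Rule` class, the `find_rule` function, the use of a full-string regular-expression match, and the third (wildcard) rule are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    schema: str
    table: str      # literal table name or a regular expression
    index: str
    doc_type: str   # corresponds to `type` in river.toml


def find_rule(rules: List[Rule], schema: str, table: str) -> Optional[Rule]:
    """Return the first rule for this schema whose table pattern matches the whole name."""
    for rule in rules:
        # Assumption: the pattern must match the full table name.
        if rule.schema == schema and re.fullmatch(rule.table, table):
            return rule
    return None


RULES = [
    Rule("zkbh_nbjd", "sys_user", "user", "novel"),
    Rule("zkbh_nbjd", "sys_log", "log", "log"),
    # Hypothetical wildcard rule, not part of the configuration above.
    Rule("zkbh_nbjd", "t_[0-9]{4}", "test", "t"),
]

matched = find_rule(RULES, "zkbh_nbjd", "sys_user")
```

With these rules, a change to `sys_user` resolves to the index `user` and type `novel`, while a table such as `t_0042` would be covered by the hypothetical wildcard rule. A table that matches no rule returns `None` in this sketch.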

2. Start go-mysql-elasticsearch

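cd /root/go/src/github.com/siddontang/go-mysql-elasticsearch
nohup ./bin/go-mysql-elasticsearch -config=./etc/river.toml &
The trailing & starts the process in the background, and nohup keeps the service from being shut down when the Linux user who started it logs out. The Screen window manager can also be used to make sure the go-mysql-elasticsearch service keeps running; see its documentation for details.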
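If the service is started from a script rather than an interactive shell, a similar effect can be had by launching it in its own session so that a terminal hangup does not reach it. The sketch below is one way to do this on a POSIX system with the standard library. The helper names and the log file path are assumptions; only the binary path and the `-config` flag come from the command above.

```python
import subprocess
from typing import List


def build_command(binary: str, config_path: str) -> List[str]:
    """Build the argument list used to start go-mysql-elasticsearch."""
    return [binary, f"-config={config_path}"]


def start_detached(binary: str, config_path: str, log_path: str) -> subprocess.Popen:
    """Start the process in a new session and append its output to a log file."""
    with open(log_path, "ab") as log_file:
        return subprocess.Popen(
            build_command(binary, config_path),
            stdin=subprocess.DEVNULL,
            stdout=log_file,
            stderr=subprocess.STDOUT,   # merge stderr into the same log
            start_new_session=True,     # POSIX only: detach from the terminal's session
        )


command = build_command("./bin/go-mysql-elasticsearch", "./etc/river.toml")
```

Calling `start_detached("./bin/go-mysql-elasticsearch", "./etc/river.toml", "river.log")` from the project directory would start the service and return immediately; `river.log` is a hypothetical log file name.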
