Downloads for different versions
https://www.elastic.co/guide/en/elasticsearch/reference/6.5/es-release-notes.html
[Reference used for this installation]
http://blog.51cto.com/moerjinrong/2310817
The analyzer plugin version must match the ES version exactly, so check both versions before downloading.
This installation uses v6.5.0 of ES and IK, plus elasticsearch-head v5.0.0
https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.5.0.tar.gz
https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.5.0/elasticsearch-analysis-ik-6.5.0.zip
https://github.com/mobz/elasticsearch-head/archive/v5.0.0.tar.gz
Official documentation
https://www.elastic.co/guide/index.html
https://www.elastic.co/guide/en/elasticsearch/reference/6.5/release-notes-6.5.0.html
ES7
Downloads: https://elastic.co/downloads/elasticsearch
Release notes: https://www.elastic.co/guide/en/elasticsearch/reference/7.17/release-notes-7.17.20.html
Past releases
https://www.elastic.co/downloads/past-releases#elasticsearch
https://www.elastic.co/downloads/past-releases/elasticsearch-7-17-20
https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.17.20-linux-x86_64.tar.gz
Download
Requires a JDK
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.5.0.tar.gz --no-check-certificate
wget https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.5.0/elasticsearch-analysis-ik-6.5.0.zip --no-check-certificate
wget https://github.com/mobz/elasticsearch-head/archive/v5.0.0.tar.gz --no-check-certificate
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.17.20-linux-x86_64.tar.gz --no-check-certificate
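Because the IK plugin version has to match the ES version exactly, it can help to derive both download URLs from a single version string. The following is a minimal sketch under that assumption; `download_urls` is a hypothetical helper name, and the URL patterns are simply copied from the links listed above.

```python
ARTIFACTS = "https://artifacts.elastic.co/downloads/elasticsearch"
IK_RELEASES = "https://github.com/medcl/elasticsearch-analysis-ik/releases/download"


def download_urls(version: str) -> dict:
    """Return the ES and IK download URLs for one shared version string."""
    major = int(version.split(".")[0])
    # Assumption: 7.x tarballs carry a platform suffix and 6.x tarballs do not,
    # which is what the two ES URLs in these notes look like.
    suffix = "-linux-x86_64" if major >= 7 else ""
    return {
        "es": f"{ARTIFACTS}/elasticsearch-{version}{suffix}.tar.gz",
        "ik": f"{IK_RELEASES}/v{version}/elasticsearch-analysis-ik-{version}.zip",
    }


for name, url in download_urls("6.5.0").items():
    print(name, url)
```

Using one version argument for both URLs makes a version mismatch between ES and IK impossible by construction.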
|
Three-node cluster installation
- With the configuration used here (discovery.zen.minimum_master_nodes: 2), at least two nodes must be running before the cluster forms
- SSH does not need to be configured
- Requires a JDK
Create a Docker subnet that is only used inside this server
docker network rm mydk
docker network create --subnet=192.168.73.0/24 mydk
docker run -itd --privileged --name es1 -h es1 --net mydk --ip 192.168.73.11 -v /opt:/opt -v /tmp:/tmp -v /mnt:/mnt -v /media:/media -p 13301:13301 cent7 /usr/sbin/init
docker exec -it es1 bash
### Install dependencies
yum install -y net-tools libaio numactl
yum -y install gcc gcc-c++ autoconf make
yum install openssl-devel bzip2-devel
# es1 already publishes host port 13301, so it is not mapped again for es2
docker run -itd --privileged --name es2 -h es2 --net mydk --ip 192.168.73.12 -v /opt:/opt -v /tmp:/tmp -v /mnt:/mnt -v /media:/media cent7 /usr/sbin/init
docker exec -it es2 bash
docker run -itd --privileged --name es3 -h es3 --net mydk --ip 192.168.73.13 -v /opt:/opt -v /tmp:/tmp -v /mnt:/mnt -v /media:/media cent7 /usr/sbin/init
docker exec -it es3 bash
JDK
export JAVA_HOME=/opt/app/jdk-11
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar:$CLASSPATH
export PATH=$JAVA_HOME/bin:$PATH
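The three `docker run` lines differ only in the container name and IP, so they can be generated. This is a rough sketch, not part of the original procedure: `docker_run_command` is a hypothetical helper, it leaves out the `-p` port mapping, and it only builds the command strings (it does not run Docker). It also checks that each IP falls inside the 192.168.73.0/24 subnet created above.

```python
import ipaddress

SUBNET = ipaddress.ip_network("192.168.73.0/24")
VOLUMES = ["/opt", "/tmp", "/mnt", "/media"]


def docker_run_command(name: str, ip: str, network: str = "mydk") -> str:
    """Build the `docker run` line for one node, checking the IP first."""
    if ipaddress.ip_address(ip) not in SUBNET:
        raise ValueError(f"{ip} is outside {SUBNET}")
    mounts = " ".join(f"-v {v}:{v}" for v in VOLUMES)
    return (
        f"docker run -itd --privileged --name {name} -h {name} "
        f"--net {network} --ip {ip} {mounts} cent7 /usr/sbin/init"
    )


for i in (1, 2, 3):
    print(docker_run_command(f"es{i}", f"192.168.73.1{i}"))
```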
|
Node 1
mkdir -p /data/es/{app,data,logs}
rsync -rltDv /media/xt/tpf/soft/es/ /data/es/app/
cd /data/es/app/
tar -zxvf elasticsearch-6.5.0.tar.gz
mkdir /data/es/app/elasticsearch-6.5.0/plugins/ik
unzip elasticsearch-analysis-ik-6.5.0.zip -d /data/es/app/elasticsearch-6.5.0/plugins/ik
ls /data/es/app/elasticsearch-6.5.0/plugins/ik
commons-codec-1.9.jar config httpclient-4.5.2.jar plugin-descriptor.properties
commons-logging-1.2.jar elasticsearch-analysis-ik-6.5.0.jar httpcore-4.4.4.jar plugin-security.policy
|
echo "
xt soft nofile 655350
xt hard nofile 655350
xt soft nproc 655350
xt hard nproc 655350
xt soft memlock -1
xt hard memlock -1
" >> /etc/security/limits.conf
cat /etc/security/limits.conf
ll /etc/security/limits.d/20-nproc.conf
echo "
xt soft nproc 655350
" >> /etc/security/limits.d/20-nproc.conf
cat /etc/security/limits.d/20-nproc.conf
echo "
xt soft nproc 655350
" >> /etc/security/limits.d/90-nproc.conf
cat /etc/security/limits.d/90-nproc.conf
echo "
vm.max_map_count=262144
" >> /etc/sysctl.conf
sysctl -p
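After changing limits.conf and sysctl.conf, it can be useful to confirm that the values actually apply to the user that will run ES. The sketch below is one possible check, run as that user on Linux. The 262144 threshold is the vm.max_map_count value set above; the 65535 open-file minimum is an assumption (these notes configure the much higher 655350). `limit_problems` and `current_limits` are hypothetical helper names.

```python
import resource
import sys

MIN_NOFILE = 65535          # assumed minimum for open files
MIN_MAX_MAP_COUNT = 262144  # value written to /etc/sysctl.conf above


def limit_problems(nofile_soft: int, max_map_count: int) -> list:
    """Return a description of every limit that is below its threshold."""
    problems = []
    if nofile_soft < MIN_NOFILE:
        problems.append(f"nofile {nofile_soft} < {MIN_NOFILE}")
    if max_map_count < MIN_MAX_MAP_COUNT:
        problems.append(f"vm.max_map_count {max_map_count} < {MIN_MAX_MAP_COUNT}")
    return problems


def current_limits() -> tuple:
    """Read the soft nofile limit and vm.max_map_count (Linux only)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        soft = sys.maxsize  # treat "unlimited" as large enough
    with open("/proc/sys/vm/max_map_count") as fh:
        return soft, int(fh.read())
```

A typical call would be `print(limit_problems(*current_limits()))`; an empty list means both values meet the thresholds.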
|
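In this Docker-based installation, SSH between the containers failed while ES communication worked; the cause was not determined.
Run on every node
yum install openssh-server
adduser xt
su - xt
ssh-keygen -t rsa
On every node, run the two commands below that target the other nodes (skip the node's own IP)
ssh-copy-id -i ~/.ssh/id_rsa.pub 192.168.73.11
ssh-copy-id -i ~/.ssh/id_rsa.pub 192.168.73.12
ssh-copy-id -i ~/.ssh/id_rsa.pub 192.168.73.13
If all three nodes are installed on one machine, SSH trust does not need to be configured.
Likely explanation
ES nodes talk to each other over the transport port (9300 by default) and serve clients over the HTTP port (9200); neither goes through the SSH service,
so passwordless SSH is not required for the cluster to communicate.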
|
Configure one node first, then copy it to the other nodes
vim /data/es/app/elasticsearch-6.5.0/config/elasticsearch.yml
cluster.name: my-application
node.name: node-1
path.data: /data/es/data/
path.logs: /data/es/logs/
network.host: 192.168.73.11
http.port: 9200
discovery.zen.ping.unicast.hosts: ["192.168.73.11", "192.168.73.12","192.168.73.13"]
discovery.zen.minimum_master_nodes: 2
Copy the files to the other nodes
mkdir -p /data/es/{app,data,logs}
scp -r xt@192.168.73.11:/data/es/app/elasticsearch-6.5.0 /data/es/app
chown -R xt.xt /data/es
On the other nodes, change elasticsearch.yml as follows
vim /data/es/app/elasticsearch-6.5.0/config/elasticsearch.yml
node.name: node-2
network.host: 192.168.73.12
rsync -rltDv /data/es/app/elasticsearch-6.5.0 /tmp/
mkdir -p /data/es/{app,data,logs}
chown -R xt.xt /data/es
rsync -rltDv /tmp/elasticsearch-6.5.0 /data/es/app/
chown -R xt.xt /data/es
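Since only node.name and network.host differ between nodes, the per-node elasticsearch.yml can be rendered from one table. The sketch below assumes the ES 6 settings shown above; the name `node-3` for the third node is an assumption (the notes only name node-1 and node-2), and `quorum` uses the usual majority rule, which gives the value 2 used here for three master-eligible nodes.

```python
import json

# node name -> IP; "node-3" is an assumed name for the third node
NODES = {
    "node-1": "192.168.73.11",
    "node-2": "192.168.73.12",
    "node-3": "192.168.73.13",
}


def quorum(master_eligible: int) -> int:
    """Majority of master-eligible nodes: floor(n / 2) + 1."""
    return master_eligible // 2 + 1


def render_config(node_name: str) -> str:
    """Render the ES 6 elasticsearch.yml content for one node."""
    hosts = json.dumps(list(NODES.values()))  # a JSON list is valid YAML
    lines = [
        "cluster.name: my-application",
        f"node.name: {node_name}",
        "path.data: /data/es/data/",
        "path.logs: /data/es/logs/",
        f"network.host: {NODES[node_name]}",
        "http.port: 9200",
        f"discovery.zen.ping.unicast.hosts: {hosts}",
        f"discovery.zen.minimum_master_nodes: {quorum(len(NODES))}",
    ]
    return "\n".join(lines) + "\n"


print(render_config("node-2"))
```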
|
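http://192.168.73.11:9100
[Start in the background]
cd /data/es/app/elasticsearch-6.5.0
nohup ./bin/elasticsearch > /data/es/logs/start.log 2>&1 &
tailf /data/es/logs/start.log
or
./bin/elasticsearch -d
The first node logs the following message on startup; it goes away once the second node is up
not enough master nodes discovered during
After the second node starts, a message about joining the cluster is logged; the third node does not log it, because minimum_master_nodes is 2 in this configuration file
[node-2] recovered [0] indices into cluster_state
[Shutdown]
Kill the process as the user that started it
ps -ef |grep ela
kill -9 <PID>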
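Instead of watching the log for the join messages, the cluster state can also be checked through the `_cluster/health` endpoint, which reports `status` and `number_of_nodes`. The following sketch is an illustration only: `cluster_ready` and `fetch_health` are hypothetical helpers, the base URL is the node-1 address from these notes, and treating "yellow" as ready is an assumption.

```python
import json
import urllib.request


def cluster_ready(health: dict, expected_nodes: int) -> bool:
    """True once enough nodes have joined and the status is not red."""
    return (
        health.get("number_of_nodes", 0) >= expected_nodes
        and health.get("status") in ("green", "yellow")
    )


def fetch_health(base_url: str = "http://192.168.73.11:9200") -> dict:
    """GET /_cluster/health and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/_cluster/health", timeout=5) as resp:
        return json.load(resp)


# Hypothetical response, trimmed to the two fields used here.
sample = {"status": "green", "number_of_nodes": 3}
print(cluster_ready(sample, expected_nodes=3))
```

Against a live cluster the call would be `cluster_ready(fetch_health(), expected_nodes=3)`.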
Open in a browser
http://192.168.73.11:9200/
Create an index
curl -XPUT http://192.168.73.11:9200/index
{"acknowledged":true,"shards_acknowledged":true,"index":"index"}
Create a mapping
curl -XPOST http://192.168.73.11:9200/index/fulltext/_mapping -H 'Content-Type:application/json' -d'
{
"properties": {
"content": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_max_word"
}
}
}'
{"acknowledged":true}
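To see how `ik_max_word` actually splits a sentence, the `_analyze` endpoint can be called with the analyzer name and some text; it returns a `tokens` list. The sketch below is a hedged illustration using only the standard library. `analyze_request`, `token_list`, and `analyze` are hypothetical helper names, and the host and index are the ones used in these notes.

```python
import json
import urllib.request


def analyze_request(text: str, analyzer: str = "ik_max_word") -> bytes:
    """Encode the JSON body for the _analyze API."""
    return json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")


def token_list(response: dict) -> list:
    """Extract the token strings from an _analyze response."""
    return [t["token"] for t in response.get("tokens", [])]


def analyze(text: str, base_url: str = "http://192.168.73.11:9200",
            index: str = "index") -> list:
    """POST the text to /<index>/_analyze and return the tokens."""
    req = urllib.request.Request(
        f"{base_url}/{index}/_analyze",
        data=analyze_request(text),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return token_list(json.load(resp))
```

Calling `analyze("时间是一切财富中最宝贵的财富")` against the running cluster would list the tokens IK produces for that sentence.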
Index some documents
curl -XPOST http://192.168.73.11:9200/index/fulltext/1 -H 'Content-Type:application/json' -d'
{"content":"时间是一切财富中最宝贵的财富"}
'
curl -XPOST http://192.168.73.11:9200/index/fulltext/2 -H 'Content-Type:application/json' -d'
{"content":"世界上一成不变的东西,只有“任何事物都是在不断变化的”这条真理。"}
'
curl -XPOST http://192.168.73.11:9200/index/fulltext/3 -H 'Content-Type:application/json' -d'
{"content":"要使别人喜欢你,首先你得改变对人的态度,把精神放得轻松一点,表情自然,笑容可掬,这样别人就会对你产生喜爱的感觉了。——卡耐基"}
'
curl -XPOST http://192.168.73.11:9200/index/fulltext/4 -H 'Content-Type:application/json' -d'
{"content":"君子在下位则多谤,在上位则多誉;小人在下位则多誉,在上位则多谤。——柳宗元"}
'
curl -XPOST http://192.168.73.11:9200/index/fulltext/5 -H 'Content-Type:application/json' -d'
{"content":"一个不注意小事情的人,永远不会成功大事业。——卡耐基"}
'
{"_index":"index","_type":"fulltext","_id":"5","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":0,"_primary_term":3}
Search
curl -XPOST http://192.168.73.11:9200/index/fulltext/_search?pretty -H 'Content-Type:application/json' -d'
{
"query" : { "match" : { "content" : "卡耐基" }},
"highlight" : {
"pre_tags" : ["<tag1>", "<tag2>"],
"post_tags" : ["</tag1>", "</tag2>"],
"fields" : {
"content" : {}
}
}
}
'
Query result
{
"took" : 307,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : 0.2876821,
"hits" : [
{
"_index" : "index",
"_type" : "fulltext",
"_id" : "5",
"_score" : 0.2876821,
"_source" : {
"content" : "一个不注意小事情的人,永远不会成功大事业。——卡耐基"
},
"highlight" : {
"content" : [
"——
|
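The highlighted search request used in the ES 6 test, and the way its response is laid out, can be sketched in Python as follows. `highlight_query` builds the same request body as the curl command, and `highlights` collects the highlighted fragments per document id. Both names are hypothetical, and the sample response is made up purely to show the structure.

```python
def highlight_query(field: str, text: str) -> dict:
    """Build a match query with the highlight tags used in the curl example."""
    return {
        "query": {"match": {field: text}},
        "highlight": {
            "pre_tags": ["<tag1>", "<tag2>"],
            "post_tags": ["</tag1>", "</tag2>"],
            "fields": {field: {}},
        },
    }


def highlights(response: dict, field: str) -> dict:
    """Map each hit's _id to its highlighted fragments for one field."""
    return {
        hit["_id"]: hit.get("highlight", {}).get(field, [])
        for hit in response["hits"]["hits"]
    }


# Hypothetical response, reduced to the keys that `highlights` reads.
sample = {"hits": {"hits": [
    {"_id": "5", "highlight": {"content": ["<tag1>x</tag1>"]}},
    {"_id": "3"},
]}}
print(highlights(sample, "content"))
```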
ES7 can be installed as a single node
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.17.4-linux-x86_64.tar.gz --no-check-certificate
wget https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.17.4/elasticsearch-analysis-ik-7.17.4.zip --no-check-certificate
wget https://github.com/medcl/elasticsearch-analysis-pinyin/releases/download/v7.17.4/elasticsearch-analysis-pinyin-7.17.4.zip
|
Install on the es1, es2, and es3 containers created earlier
su - xt
cd /data/es/app
rsync -rltDv /tmp/es7/elasticsearch-7.17.4-linux-x86_64.tar.gz ./
tar -xvf elasticsearch-7.17.4-linux-x86_64.tar.gz
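discovery.seed_hosts: list of hosts in the cluster
cluster.initial_master_nodes: the nodes that take part in the master election when the cluster first starts; required in production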
vim elasticsearch-7.17.4/config/elasticsearch.yml
cluster.name: my-application
node.name: node-1
path.data: /data/es/data/
path.logs: /data/es/logs
network.host: 192.168.73.11
http.port: 9200
discovery.seed_hosts: ["192.168.73.11", "192.168.73.12"]
cluster.initial_master_nodes: ["node-1", "node-2"]
./bin/elasticsearch -d
https://www.cnblogs.com/Likfees/p/16449224.html
Analyzers
cd elasticsearch-7.17.4/plugins/
rsync -rltDv /tmp/es7/elasticsearch-analysis-ik-7.17.4.zip ./
unzip -d ik elasticsearch-analysis-ik-7.17.4.zip
rm elasticsearch-analysis-ik-7.17.4.zip
wget https://github.com/medcl/elasticsearch-analysis-pinyin/releases/download/v7.17.4/elasticsearch-analysis-pinyin-7.17.4.zip
rsync -rltDv /tmp/es7/elasticsearch-analysis-pinyin-7.17.4.zip ./
unzip -d pinyin elasticsearch-analysis-pinyin-7.17.4.zip
rm elasticsearch-analysis-pinyin-7.17.4.zip
|
rsync -rltDv /data/es/app/elasticsearch-7.17.4 /tmp/
rsync -rltDv /tmp/elasticsearch-7.17.4 /data/es/app/
vim elasticsearch-7.17.4/config/elasticsearch.yml
cluster.name: my-application
node.name: node-2
path.data: /data/es/data/
path.logs: /data/es/logs
network.host: 192.168.73.12
http.port: 9200
discovery.seed_hosts: ["192.168.73.11", "192.168.73.12"]
cluster.initial_master_nodes: ["node-1", "node-2"]
./bin/elasticsearch -d
[xt@es2 elasticsearch-7.17.4]$ netstat -tunlp
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 192.168.73.12:9300      0.0.0.0:*               LISTEN      213/java
tcp        0      0 192.168.73.12:9200      0.0.0.0:*               LISTEN      213/java
tcp        0      0 127.0.0.11:43867        0.0.0.0:*               LISTEN      -
udp        0      0 127.0.0.11:47013        0.0.0.0:*                           -
[xt@es2 elasticsearch-7.17.4]$
The configuration below was not used this time, and the cluster still came up. discovery.zen.ping.unicast.hosts is the ES 6 setting; ES 7 replaces it with discovery.seed_hosts, and discovery.zen.minimum_master_nodes is ignored in ES 7.
cluster.name: kkb-es
node.name: node-0
node.master: true
network.host: 0.0.0.0
http.port: 9200
transport.tcp.port: 9300 # TCP port
discovery.zen.ping.unicast.hosts: ["192.168.147.66:9300","192.168.147.67:9300","192.168.147.68:9300"]
discovery.zen.minimum_master_nodes: 2
http.cors.enabled: true
http.cors.allow-origin: "*"
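The difference between the ES 6 and ES 7 discovery settings can be expressed as a small mapping. The sketch below is an illustration under stated assumptions: `migrate_discovery` is a hypothetical helper that works on a settings dict, renames the unicast host list to discovery.seed_hosts, drops discovery.zen.minimum_master_nodes, and adds cluster.initial_master_nodes from a list the caller supplies.

```python
def migrate_discovery(es6: dict, master_nodes: list) -> dict:
    """Return a copy of ES 6 settings with ES 7 discovery keys."""
    es7 = dict(es6)
    hosts = es7.pop("discovery.zen.ping.unicast.hosts", None)
    if hosts is not None:
        es7["discovery.seed_hosts"] = hosts
    # Ignored by ES 7, so it is dropped rather than carried over.
    es7.pop("discovery.zen.minimum_master_nodes", None)
    es7["cluster.initial_master_nodes"] = list(master_nodes)
    return es7


es6_settings = {
    "cluster.name": "my-application",
    "discovery.zen.ping.unicast.hosts": ["192.168.73.11", "192.168.73.12"],
    "discovery.zen.minimum_master_nodes": 2,
}
print(migrate_discovery(es6_settings, ["node-1", "node-2"]))
```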
|
For later installations, just unpack the archive and edit the configuration file
tar -zcvf elasticsearch-7.17.4_ok.tar.gz elasticsearch-7.17.4/
mv elasticsearch-7.17.4_ok.tar.gz /media/xt/tpf/soft/es7/
JDK
export JAVA_HOME=/opt/app/jdk-11
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar:$CLASSPATH
export PATH=$JAVA_HOME/bin:$PATH
System configuration
Unpack and install
mkdir /data/es
cd /data/es
rsync -rltDv /media/xt/tpf/soft/es7/elasticsearch-7.17.4_ok.tar.gz ./
tar -xvf elasticsearch-7.17.4_ok.tar.gz
Configuration
mkdir -p /data/es/data/
mkdir -p /data/es/logs/
Single-node configuration
vim config/elasticsearch.yml
network.host: 127.0.0.1
discovery.seed_hosts: ["127.0.0.1"]
cluster.initial_master_nodes: ["node-1"]
Start
./bin/elasticsearch -d
|
|
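cat config/elasticsearch.yml
cluster.name: my-application
node.name: node-1
path.data: /data/es/data/
path.logs: /data/es/logs
network.host: 127.0.0.1
http.port: 9200
transport.tcp.port: 9300
discovery.seed_hosts: ["127.0.0.1"]
cluster.initial_master_nodes: ["node-1"]
If the IP is set to 127.0.0.1, ES can only be reached from the local machine.
For access from outside, a concrete IP address must be configured.
- For example, with Ubuntu running inside Windows,
- to reach the ES instance in Ubuntu from Windows, the IP configured for ES must be the externally reachable one, e.g. 172.31.150.83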
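The loopback rule above can be checked with the standard `ipaddress` module. This is a minimal sketch: `reachable_from_other_hosts` is a hypothetical helper, it only accepts IP literals (a hostname raises ValueError), and it says nothing about firewalls or routing.

```python
import ipaddress


def reachable_from_other_hosts(network_host: str) -> bool:
    """False when network.host is a loopback address such as 127.0.0.1."""
    return not ipaddress.ip_address(network_host).is_loopback


for host in ("127.0.0.1", "172.31.150.83"):
    print(host, reachable_from_other_hosts(host))
```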
|
pip install elasticsearch6
Create an index
curl -XPUT http://192.168.73.11:9200/index
Create a mapping
curl -XPOST http://192.168.73.11:9200/index/fulltext/_mapping -H 'Content-Type:application/json' -d'
{
"properties": {
"content": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_max_word"
}
}
}'
Index some documents
curl -XPOST http://192.168.73.11:9200/index/fulltext/1 -H 'Content-Type:application/json' -d'
{"content":"时间是一切财富中最宝贵的财富"}
'
curl -XPOST http://192.168.73.11:9200/index/fulltext/2 -H 'Content-Type:application/json' -d'
{"content":"世界上一成不变的东西,只有“任何事物都是在不断变化的”这条真理。"}
'
curl -XPOST http://192.168.73.11:9200/index/fulltext/3 -H 'Content-Type:application/json' -d'
{"content":"要使别人喜欢你,首先你得改变对人的态度,把精神放得轻松一点,表情自然,笑容可掬,这样别人就会对你产生喜爱的感觉了。——卡耐基"}
'
Searching from Python
from elasticsearch6 import Elasticsearch

es = Elasticsearch('http://192.168.73.11:9200')
# Index name
index_name = 'index'
# Run a simple search request
response = es.search(
    index=index_name,
    body={
        "query": {
            "match_all": {}
        }
    }
)
# Print the search results
print(response['hits']['hits'])
# Close the connection to Elasticsearch
# es.close()
Creating an index and inserting a document
from elasticsearch6 import Elasticsearch
import datetime
# Initialize the Elasticsearch client
es = Elasticsearch([{'host': '192.168.73.11', 'port': 9200}])
# Create the index if it does not exist yet
index_name = "index2"
if not es.indices.exists(index=index_name):
    es.indices.create(index=index_name)
# Insert a document
doc_id = "2"
doc_body = {"name": "张三", "age": 30, "email": "aaazhnag@example.com", "created_at": datetime.datetime.utcnow()}
response = es.index(index=index_name, id=doc_id, body=doc_body, doc_type="_doc")
# Print the response
print(response)
|
pip install elasticsearch7
from elasticsearch7 import Elasticsearch, helpers
# 1. Create the Elasticsearch connection
es = Elasticsearch(
    hosts=['http://127.0.0.1:9200'],  # service address and port
    http_auth=("elastic", "aaa"),  # username, password
)
# 2. Define the index name
index_name = "index"
# 3. Delete the index if it already exists (for demonstration only; not needed in real use)
if es.indices.exists(index=index_name):
    es.indices.delete(index=index_name)
# 4. Create the index
es.indices.create(index=index_name)
# 5. Build the bulk actions (to_keywords is defined in the full example further down)
actions = [
    {
        "_index": index_name,
        "_source": {
            "keywords": to_keywords(para),
            "text": para
        }
    }
    for para in [
        "今天天气不错",]
]
# 6. Bulk-load the text
helpers.bulk(es, actions)
from elasticsearch import Elasticsearch
es = Elasticsearch(
    'https://localhost:9200',
    basic_auth=('user', 'passwd'),
    verify_certs=False,  # set to False to skip SSL certificate verification
)
|
from elasticsearch7 import Elasticsearch, helpers
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
import nltk
import re
import warnings
warnings.simplefilter("ignore")  # suppress some ES warnings
# One-time download of the NLTK data used below (newer NLTK releases may ask for 'punkt_tab' instead of 'punkt')
nltk.download('punkt')
nltk.download('stopwords')

def to_keywords(input_string):
    '''Keep only the keywords of an (English) text'''
    # Replace every non-alphanumeric character with a space
    no_symbols = re.sub(r'[^a-zA-Z0-9\s]', ' ', input_string)
    word_tokens = word_tokenize(no_symbols)
    # Load the stop-word list
    stop_words = set(stopwords.words('english'))
    ps = PorterStemmer()
    # Drop stop words and reduce the rest to their stems
    filtered_sentence = [ps.stem(w)
                         for w in word_tokens if w.lower() not in stop_words]
    return ' '.join(filtered_sentence)

# 1. Create the Elasticsearch connection
es = Elasticsearch(
    hosts=['http://192.168.73.11:9200'],  # service address and port
    verify_certs=False
    # http_auth=("elastic", "aaa"),  # username, password
)
# 2. Define the index name
index_name = "index"
# 3. Delete the index if it already exists (for demonstration only; not needed in real use)
if es.indices.exists(index=index_name):
    es.indices.delete(index=index_name)
# 4. Create the index
es.indices.create(index=index_name)
# 5. Build the bulk actions
actions = [
    {
        "_index": index_name,
        "_source": {
            "keywords": to_keywords(para),
            "text": para
        }
    }
    for para in [
        "今天天气不错",]
]
# 6. Bulk-load the text
helpers.bulk(es, actions)
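For larger inputs, the action list passed to `helpers.bulk` can be produced lazily by a generator, and giving each action an explicit `_id` means a repeated load overwrites documents instead of duplicating them. The sketch below shows only the action-building part and uses no ES client; `bulk_actions` is a hypothetical helper, and its default `to_keywords=str.lower` is a stand-in for the NLTK-based function.

```python
def bulk_actions(index_name, paragraphs, to_keywords=str.lower):
    """Yield one bulk action per paragraph, with sequential ids from 1."""
    for doc_id, para in enumerate(paragraphs, start=1):
        yield {
            "_index": index_name,
            "_id": doc_id,  # explicit id: reloading overwrites, not duplicates
            "_source": {"keywords": to_keywords(para), "text": para},
        }


for action in bulk_actions("index", ["Hello World", "Second paragraph"]):
    print(action)
```

With a live connection, the generator would be passed straight to the client, e.g. `helpers.bulk(es, bulk_actions(index_name, paragraphs, to_keywords))`.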