
Hadoop 3.1.1 High Availability (HA) Cluster Installation Notes

bigegpt 2024-09-12 11:18

Environment Preparation

1. Server Overview

hostname | ip             | role
nn01     | 192.168.56.101 | name node
nn02     | 192.168.56.102 | name node
dn01     | 192.168.56.103 | data node
dn02     | 192.168.56.104 | data node
dn03     | 192.168.56.105 | data node

                | nn01 | nn02 | dn01 | dn02 | dn03
NameNode        |  √   |  √   |      |      |
DataNode        |      |      |  √   |  √   |  √
ResourceManager |  √   |  √   |      |      |
NodeManager     |  √   |  √   |  √   |  √   |  √
Zookeeper       |  √   |  √   |  √   |  √   |  √
journalnode     |  √   |  √   |  √   |  √   |  √
zkfc            |  √   |  √   |      |      |

Run the following commands on all five servers:

# Add host entries
[root@nn01 ~]# vim /etc/hosts
192.168.56.101 nn01
192.168.56.102 nn02
192.168.56.103 dn01
192.168.56.104 dn02
192.168.56.105 dn03
# Stop and disable the firewall
[root@nn01 ~]# systemctl stop firewalld && systemctl disable firewalld
[root@nn01 ~]# setenforce 0
# Set SELINUX to disabled
[root@nn01 ~]# vim /etc/selinux/config
SELINUX=disabled
# Reboot the server
[root@nn01 ~]# reboot
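A quick sanity check after the reboot (a minimal sketch) confirms that SELinux and the firewall stayed off:

# Should print "Disabled"
[root@nn01 ~]# getenforce
# Should print "inactive" and "disabled" respectively
[root@nn01 ~]# systemctl is-active firewalld
[root@nn01 ~]# systemctl is-enabled firewalld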

2. JDK Installation

# Configure environment variables (assumes the JDK archive has been unpacked to /opt/java/jdk1.8.0_172)
[root@nn01 ~]# vim /etc/profile
# Append at the end of the file
# Java Environment Path
export JAVA_HOME=/opt/java/jdk1.8.0_172
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
# Reload the profile
source /etc/profile
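To confirm the JDK is picked up (assuming the archive was unpacked to the path above):

# Should report version 1.8.0_172
[root@nn01 ~]# java -version
[root@nn01 ~]# echo $JAVA_HOME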

3. Configure Passwordless Login

# Run on nn01
# Generate a key pair; just press Enter at every prompt. The keys are written to ~/.ssh
[root@nn01 ~]# ssh-keygen -t rsa
# Copy the public key to each node (repeat for nn02, dn01, dn02 and dn03)
[root@nn01 .ssh]# scp /root/.ssh/id_rsa.pub root@nn01:~
# Append the key on nn01 itself
[root@nn01 .ssh]# cat ~/id_rsa.pub >> /root/.ssh/authorized_keys
## Run on nn02, dn01, dn02 and dn03
[root@nn02 ~]# mkdir -p ~/.ssh
[root@nn02 ~]# cd .ssh/
[root@nn02 .ssh]# cat ~/id_rsa.pub >> /root/.ssh/authorized_keys
[root@nn02 .ssh]# vim /etc/ssh/sshd_config
# PermitRootLogin would disable root login if set to "no"; keep it "yes" since we operate as root
PermitRootLogin yes
PubkeyAuthentication yes
# Apply the sshd_config change
[root@nn02 .ssh]# systemctl restart sshd

Passwordless login must work with both IPs and hostnames:

1) Each NameNode can log in to every DataNode without a password.
2) Each NameNode can log in to itself.
3) The NameNodes can log in to each other.
4) Each DataNode can log in to itself.
5) DataNodes do not need passwordless login to the NameNodes or to the other DataNodes.


The requirements above are adapted from toto1297488504's CSDN blog; full text: https://blog.csdn.net/tototuzuoquan/article/details/72983527?utm_source=copy

In the same way, configure nn02 for passwordless login to nn01, dn01, dn02 and dn03.
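A quick way to verify the matrix above (a minimal sketch; assumes /etc/hosts and the keys are already in place) is to loop over hosts and IPs from each NameNode; every ssh must return the remote hostname without asking for a password:

# Run on nn01 and again on nn02
for h in nn01 nn02 dn01 dn02 dn03 192.168.56.101 192.168.56.102 192.168.56.103 192.168.56.104 192.168.56.105; do
  ssh -o StrictHostKeyChecking=no "$h" hostname
done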

Install ZooKeeper

mkdir -p /opt/zookeeper/
cd /opt/zookeeper/
# Assumes zookeeper-3.4.13.tar.gz has already been downloaded to /opt/zookeeper
tar -zxvf zookeeper-3.4.13.tar.gz
cd zookeeper-3.4.13/conf/
cp zoo_sample.cfg zoo.cfg
vim zoo.cfg

zoo.cfg

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/opt/data/zookeeper
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=nn01:2888:3888
server.2=nn02:2888:3888
server.3=dn01:2888:3888
server.4=dn02:2888:3888
server.5=dn03:2888:3888

Key settings:

tickTime

The basic heartbeat unit in milliseconds; almost every other ZooKeeper timeout is expressed as a multiple of this value.

initLimit

A count of ticks: the time allowed for followers to connect and sync with the leader after leader election finishes. If there are many followers, or the leader holds a very large amount of data, syncing takes longer and this value should be raised accordingly. It is also the maximum wait time (socket setSoTimeout) for followers and observers when they begin syncing the leader's data. With tickTime=2000 and initLimit=10, this comes to 20 seconds.

syncLimit

Also a count of ticks, and easy to confuse with the one above: it is the maximum time followers and observers may take when exchanging messages with the leader after the initial sync is complete, i.e. the timeout for normal request forwarding and ping traffic.

dataDir

Where in-memory database snapshots are stored. Unless a separate transaction-log location (dataLogDir) is configured, the transaction logs are written here as well; keeping the two on separate devices is recommended.

clientPort

The port on which ZooKeeper listens for client connections.

server.serverid=host:quorumport:electionport

server: fixed prefix

serverid: the ID assigned to this server (must be between 1 and 255 and unique across machines)

host: hostname

quorumport: the port for follower-to-leader (quorum) communication, 2888 above

electionport: the port for leader election, 3888 above

# Create the directories
mkdir -p /opt/data/zookeeper
mkdir -p /opt/data/logs/zookeeper
touch /opt/data/zookeeper/myid
# Copy to the other hosts (repeat for dn01, dn02 and dn03)
scp -r /opt/zookeeper root@nn02:/opt/
scp -r /opt/data/zookeeper root@nn02:/opt/data/
scp -r /opt/data/logs/zookeeper root@nn02:/opt/data/logs/
# Run on nn01
echo 1 > /opt/data/zookeeper/myid
# Run on nn02
echo 2 > /opt/data/zookeeper/myid
# Run on dn01
echo 3 > /opt/data/zookeeper/myid
# Run on dn02
echo 4 > /opt/data/zookeeper/myid
# Run on dn03
echo 5 > /opt/data/zookeeper/myid
# Add environment variables (append to /etc/profile on every node)
export ZOOKEEPER_HOME=/opt/zookeeper/zookeeper-3.4.13
export PATH=$ZOOKEEPER_HOME/bin:$PATH
source /etc/profile

Install Hadoop

1. Download Hadoop

mkdir -p /opt/hadoop/
cd /opt/hadoop
# Assumes hadoop-3.1.1.tar.gz has already been downloaded to /opt/hadoop
tar -xf hadoop-3.1.1.tar.gz
## Set environment variables (append to /etc/profile on every node)
export HADOOP_HOME=/opt/hadoop/hadoop-3.1.1
# HADOOP_PREFIX is deprecated in Hadoop 3.x in favor of HADOOP_HOME; keeping it only triggers a warning
export HADOOP_PREFIX=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_INSTALL=$HADOOP_HOME
# Create the data and log directories
mkdir -p /opt/data/logs/hadoop
mkdir -p /opt/data/hadoop/hdfs/nn
mkdir -p /opt/data/hadoop/hdfs/dn
mkdir -p /opt/data/hadoop/hdfs/jn

Edit the configuration file: /opt/hadoop/hadoop-3.1.1/etc/hadoop/hadoop-env.sh

## Add at the top of the file; size the JVM heap to your servers
export JAVA_HOME=/opt/java/jdk1.8.0_172
export HADOOP_NAMENODE_OPTS=" -Xms1024m -Xmx1024m -XX:+UseParallelGC"
export HADOOP_DATANODE_OPTS=" -Xms512m -Xmx512m"
export HADOOP_LOG_DIR=/opt/data/logs/hadoop

/opt/hadoop/hadoop-3.1.1/etc/hadoop/core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
 You may obtain a copy of the License at
 http://www.apache.org/licenses/LICENSE-2.0
 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <!-- Set the HDFS nameservice to mycluster -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/data/hadoop/tmp</value>
  </property>
  <!-- ZooKeeper quorum -->
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>nn01:2181,nn02:2181,dn01:2181,dn02:2181,dn03:2181</value>
  </property>
  <!-- Timeout for Hadoop's ZooKeeper sessions -->
  <property>
    <name>ha.zookeeper.session-timeout.ms</name>
    <value>30000</value>
    <description>ms</description>
  </property>
  <property>
    <name>fs.trash.interval</name>
    <value>1440</value>
  </property>
</configuration>
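A quick way to confirm the file is being read (hdfs getconf resolves against the local configuration, so it works even before the daemons are started):

# Should print hdfs://mycluster
[root@nn01 ~]# hdfs getconf -confKey fs.defaultFS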

/opt/hadoop/hadoop-3.1.1/etc/hadoop/hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
 You may obtain a copy of the License at
 http://www.apache.org/licenses/LICENSE-2.0
 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <!-- Timeout for starting a new edit segment across the journalnode cluster -->
  <property>
    <name>dfs.qjournal.start-segment.timeout.ms</name>
    <value>60000</value>
  </property>
  <!-- Set the HDFS nameservice to mycluster; must match core-site.xml.
       dfs.ha.namenodes.[nameservice id] assigns a unique identifier to each
       NameNode in the nameservice: a comma-separated list of NameNode IDs,
       which is how DataNodes discover all the NameNodes. Here the nameservice
       ID is "mycluster" and the NameNode IDs are "nn01" and "nn02". -->
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <!-- mycluster has two NameNodes: nn01 and nn02 -->
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn01,nn02</value>
  </property>
  <!-- RPC address of nn01 -->
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn01</name>
    <value>nn01:8020</value>
  </property>
  <!-- RPC address of nn02 -->
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn02</name>
    <value>nn02:8020</value>
  </property>
  <!-- HTTP address of nn01 -->
  <property>
    <name>dfs.namenode.http-address.mycluster.nn01</name>
    <value>nn01:50070</value>
  </property>
  <!-- HTTP address of nn02 -->
  <property>
    <name>dfs.namenode.http-address.mycluster.nn02</name>
    <value>nn02:50070</value>
  </property>
  <!-- Shared storage for the NameNode edit log, i.e. the JournalNode list.
       URL format: qjournal://host1:port1;host2:port2;host3:port3/journalId
       The nameservice ID is the recommended journalId; the default port is 8485. -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://nn01:8485;nn02:8485;dn01:8485;dn02:8485;dn03:8485/mycluster</value>
  </property>
  <!-- Failover proxy provider used by clients to find the active NameNode -->
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <!-- Fencing methods, one per line; shell(/bin/true) is the fallback if sshfence fails -->
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>
      sshfence
      shell(/bin/true)
    </value>
  </property>
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.support.append</name>
    <value>true</value>
  </property>
  <!-- sshfence requires passwordless SSH using this private key -->
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  <!-- Replication factor -->
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/opt/data/hadoop/hdfs/nn</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/opt/data/hadoop/hdfs/dn</value>
  </property>
  <!-- Local directory where JournalNodes store their data -->
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/opt/data/hadoop/hdfs/jn</value>
  </property>
  <!-- Enable automatic NameNode failover -->
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <!-- Enable webhdfs -->
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <!-- Connect timeout for the sshfence method -->
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>
  <property>
    <name>ha.failover-controller.cli-check.rpc-timeout.ms</name>
    <value>60000</value>
  </property>
</configuration>
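sshfence can only do its job if each NameNode can reach the other one over SSH with the private key configured above. A pre-flight check worth running from both NameNodes (a sketch):

# Run on nn01, and the mirror image on nn02; must succeed without a password prompt
[root@nn01 ~]# ssh -i /root/.ssh/id_rsa -o ConnectTimeout=5 nn02 true && echo "fencing SSH OK"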

/opt/hadoop/hadoop-3.1.1/etc/hadoop/mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
 You may obtain a copy of the License at
 http://www.apache.org/licenses/LICENSE-2.0
 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <!-- Run MapReduce on YARN -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <!-- MapReduce JobHistory server address -->
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>nn01:10020</value>
  </property>
  <!-- JobHistory web UI address -->
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>nn01:19888</value>
  </property>
  <property>
    <name>mapreduce.application.classpath</name>
    <value>
      /opt/hadoop/hadoop-3.1.1/etc/hadoop,
      /opt/hadoop/hadoop-3.1.1/share/hadoop/common/*,
      /opt/hadoop/hadoop-3.1.1/share/hadoop/common/lib/*,
      /opt/hadoop/hadoop-3.1.1/share/hadoop/hdfs/*,
      /opt/hadoop/hadoop-3.1.1/share/hadoop/hdfs/lib/*,
      /opt/hadoop/hadoop-3.1.1/share/hadoop/mapreduce/*,
      /opt/hadoop/hadoop-3.1.1/share/hadoop/mapreduce/lib/*,
      /opt/hadoop/hadoop-3.1.1/share/hadoop/yarn/*,
      /opt/hadoop/hadoop-3.1.1/share/hadoop/yarn/lib/*
    </value>
  </property>
</configuration>
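Rather than typing the list by hand, it can be cross-checked against what the local installation reports; `hadoop classpath` prints the classpath derived from HADOOP_HOME and is a common source for mapreduce.application.classpath:

[root@nn01 ~]# hadoop classpath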

/opt/hadoop/hadoop-3.1.1/etc/hadoop/yarn-site.xml

<?xml version="1.0"?>
<!--
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
 You may obtain a copy of the License at
 http://www.apache.org/licenses/LICENSE-2.0
 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License. See accompanying LICENSE file.
-->
<configuration>
  <!-- Site specific YARN configuration properties -->
  <!-- Enable ResourceManager HA -->
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <!-- Cluster ID of the RM pair -->
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>yrc</value>
  </property>
  <!-- Logical IDs of the ResourceManagers -->
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <!-- Hosts of the two ResourceManagers -->
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>nn01</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>nn02</value>
  </property>
  <!-- ZooKeeper quorum address -->
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>nn01:2181,nn02:2181,dn01:2181,dn02:2181,dn03:2181</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>86400</value>
  </property>
  <!-- Enable automatic recovery -->
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <!-- Store ResourceManager state in the ZooKeeper cluster -->
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
</configuration>

/opt/hadoop/hadoop-3.1.1/etc/hadoop/workers

dn01
dn02
dn03

Add the following at the top of /opt/hadoop/hadoop-3.1.1/sbin/start-dfs.sh and sbin/stop-dfs.sh (Hadoop 3 refuses to start daemons as root unless these *_USER variables are set):

HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_ZKFC_USER=root
HDFS_JOURNALNODE_USER=root
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root

Do the same for /opt/hadoop/hadoop-3.1.1/sbin/start-yarn.sh and sbin/stop-yarn.sh:

YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn 
YARN_NODEMANAGER_USER=root

Copy everything to the other machines:

scp -r /opt/data root@nn02:/opt/
scp -r /opt/data root@dn01:/opt/
scp -r /opt/data root@dn02:/opt/
scp -r /opt/data root@dn03:/opt/
scp -r /opt/hadoop/hadoop-3.1.1 root@nn02:/opt/hadoop/
scp -r /opt/hadoop/hadoop-3.1.1 root@dn01:/opt/hadoop/
scp -r /opt/hadoop/hadoop-3.1.1 root@dn02:/opt/hadoop/
scp -r /opt/hadoop/hadoop-3.1.1 root@dn03:/opt/hadoop/
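The eight scp commands above can be collapsed into a loop (a sketch; assumes /opt/hadoop has been created on every target):

for h in nn02 dn01 dn02 dn03; do
  scp -r /opt/data "root@$h:/opt/"
  scp -r /opt/hadoop/hadoop-3.1.1 "root@$h:/opt/hadoop/"
done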

Startup

Startup order: Zookeeper -> JournalNode -> format NameNode -> create the HA znode (zkfc format) -> NameNode -> DataNode -> ResourceManager -> NodeManager.

1. Start ZooKeeper

On nn01, nn02, dn01, dn02, dn03:

zkServer.sh start
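After all five have been started, one node should report "leader" and the other four "follower". A quick check (a sketch; assumes passwordless SSH from nn01 — if JAVA_HOME is not picked up over a non-interactive ssh, run the command locally on each node):

for h in nn01 nn02 dn01 dn02 dn03; do
  ssh "$h" "/opt/zookeeper/zookeeper-3.4.13/bin/zkServer.sh status"
done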

2. Start the JournalNodes

On nn01, nn02, dn01, dn02, dn03:

# Deprecated in 3.x but still functional; the new form is "hdfs --daemon start journalnode"
hadoop-daemon.sh start journalnode
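Before formatting the NameNode, confirm that a JournalNode JVM is actually running on every node; the format step needs a quorum of JournalNodes to be reachable:

# Run on each node; a "JournalNode" line should appear
jps | grep JournalNode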

3. Format the NameNode

On nn01 (the older "hadoop namenode -format" still works but prints a deprecation warning in 3.x):

hdfs namenode -format

Copy the metadata generated on nn01 to the standby NameNode. Only nn02 runs a NameNode, so the DataNodes do not need it:

scp -r /opt/data/hadoop/hdfs/nn/* root@nn02:/opt/data/hadoop/hdfs/nn/
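Equivalently, the standby can pull the metadata itself instead of receiving it via scp; this is run on nn02 once the NameNode on nn01 is up (e.g. after step 5, or after starting it alone with hadoop-daemon.sh start namenode):

hdfs namenode -bootstrapStandby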

4. Format ZKFC

Important: run this only on a NameNode host, here nn01.

hdfs zkfc -formatZK
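Formatting creates the HA parent znode in ZooKeeper (by default /hadoop-ha/<nameservice>); it can be checked from any ensemble member:

# Should list "mycluster"
zkCli.sh -server nn01:2181 ls /hadoop-ha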

5. Start HDFS

Important: run this only on a NameNode host, here nn01.

start-dfs.sh

6. Start YARN

Start it on either of the two ResourceManager hosts; here nn02:

start-yarn.sh

If the ResourceManager on the other node does not come up, start it manually: yarn-daemon.sh start resourcemanager

7. Start the MapReduce JobHistory server (on nn01, the host configured in mapred-site.xml)

mr-jobhistory-daemon.sh start historyserver

8. Status Check

Check the state of each master node:

hdfs haadmin -getServiceState nn01
hdfs haadmin -getServiceState nn02
[root@nn01 hadoop]# hdfs haadmin -getServiceState nn01
WARNING: HADOOP_PREFIX has been replaced by HADOOP_HOME. Using value of HADOOP_PREFIX.
2018-09-27 11:06:58,892 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
active
[root@nn01 hadoop]# hdfs haadmin -getServiceState nn02
WARNING: HADOOP_PREFIX has been replaced by HADOOP_HOME. Using value of HADOOP_PREFIX.
2018-09-27 11:07:02,217 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
standby
[root@nn01 hadoop]# yarn rmadmin -getServiceState rm1
WARNING: HADOOP_PREFIX has been replaced by HADOOP_HOME. Using value of HADOOP_PREFIX.
2018-09-27 11:07:45,112 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
standby
[root@nn01 hadoop]# yarn rmadmin -getServiceState rm2
WARNING: HADOOP_PREFIX has been replaced by HADOOP_HOME. Using value of HADOOP_PREFIX.
2018-09-27 11:07:48,350 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
active

Check via the Web UI

## HDFS
http://192.168.56.101:50070/
http://192.168.56.102:50070/
## YARN
http://192.168.56.102:8088/cluster
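Finally, automatic failover can be exercised by killing the active NameNode and re-checking the state. A sketch (the pid placeholder below is whatever jps prints on the active node):

# On the currently active NameNode (nn01 here)
[root@nn01 ~]# jps | grep NameNode
[root@nn01 ~]# kill -9 <NameNode-pid>
# Within roughly the ZooKeeper session timeout (30 s in core-site.xml), nn02 should become active
[root@nn01 ~]# hdfs haadmin -getServiceState nn02
# Restart the killed NameNode; it rejoins as standby
[root@nn01 ~]# hadoop-daemon.sh start namenode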

