Building an Ambari Big Data Platform on CentOS 7 (Apache big data project)

nanshan 2024-11-06 11:12

Installing the Ambari big data platform online on CentOS 7

Pre-installation preparation

1) Configure the hostname

#vi /etc/hostname
bigdata-1

Note: on the other machines, change the number accordingly (bigdata-2, and so on).

2) Configure the hosts file

#vi /etc/hosts

127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4

::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.154.132 bigdata-1

192.168.154.133 bigdata-2

192.168.154.134 bigdata-3

Once the two settings above are in place, reboot for them to take effect.
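When the same entries have to be added on several machines, it helps if the edit can be repeated safely. The following is a minimal sketch, not part of the original procedure: the helper name add_host_entry is made up for illustration, and it only appends a mapping when the host name is not already present in the file.

```shell
#!/bin/sh
# Hypothetical helper: append "IP NAME" to a hosts file only if NAME
# is not already mapped on a non-comment line.
# Usage: add_host_entry FILE IP NAME
add_host_entry() {
    file=$1 ip=$2 name=$3
    # Look for NAME as a whole field (fields 2..NF hold the host names).
    if awk -v n="$name" '
        $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$file" 2>/dev/null; then
        return 0    # already present, nothing to do
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Example (run as root on every node):
# add_host_entry /etc/hosts 192.168.154.132 bigdata-1
# add_host_entry /etc/hosts 192.168.154.133 bigdata-2
# add_host_entry /etc/hosts 192.168.154.134 bigdata-3
```

Running the helper twice with the same arguments leaves a single line in the file, so it can be rerun on a node that was already configured.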

3) Disable the firewall

#systemctl stop firewalld.service
#systemctl disable firewalld.service
Check the status:
#firewall-cmd --state

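4) Set up passwordless SSH between the CentOS servers

The Ambari Server connects to the Agent machines over SSH to copy files and run commands, so we need passwordless SSH login from the Ambari Server to each Agent.

First, run the following command on every node, including the Ambari Server node:

#ssh-keygen -t rsa (press Enter, then press Enter three more times to accept the defaults)

On bigdata-1 (the Ambari Server node), run: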

#cd .ssh/

#cat id_rsa.pub >> authorized_keys

Then copy authorized_keys to the .ssh directory on the other two machines, starting with bigdata-2:

#scp authorized_keys bigdata-2:/root/.ssh/

On bigdata-2, run:

#cd .ssh/

#cat id_rsa.pub >> authorized_keys

#scp authorized_keys bigdata-3:/root/.ssh/

With more than three nodes, continue in the same way.

Verify:

On bigdata-1, run:

#ssh bigdata-2 date; ssh bigdata-3 date;
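The cat id_rsa.pub >> authorized_keys step adds a duplicate line every time it is repeated. A small sketch of a repeatable variant follows; append_pubkey is a hypothetical helper name, and the chmod reflects the fact that sshd, with StrictModes enabled, rejects key files with loose permissions.

```shell
#!/bin/sh
# Hypothetical helper: append a public key to an authorized_keys file
# unless that exact line is already there.
# Usage: append_pubkey PUBKEY_FILE AUTHORIZED_KEYS_FILE
append_pubkey() {
    pub=$1 auth=$2
    touch "$auth"
    # -F fixed strings, -x whole-line match, -f read the pattern (the key line) from PUBKEY_FILE
    grep -qxFf "$pub" "$auth" || cat "$pub" >> "$auth"
    # Keep the file private to its owner.
    chmod 600 "$auth"
}

# Example (as root on bigdata-1):
# append_pubkey /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys
```

After this, authorized_keys can be copied to the next node with scp exactly as in the steps above.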

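5) Make sure Yum works

Installing Hadoop and related software from the public repository means that Yum installs the rpm packages hosted there, so every machine needs Internet access.

6) Get the Ambari public repository file

Fetch the Ambari public repository file. Log in to the Linux host and run the commands below (you can also download the file manually).

#cd /etc/yum.repos.d/

#wget http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.7.4.0/ambari.repo

Install Ambari Server

We need to fetch the package list of this public repository. Run the following commands in order.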

#yum clean all

#yum list|grep ambari

1) Install Ambari Server

#yum install ambari-server

2) Configure Ambari Server

#ambari-server setup

3) Start Ambari Server

#ambari-server start

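Configure and install the big data components

Once Ambari Server has started successfully, you can log in from a browser; the default port is 8080. In this environment, enter http://192.168.154.132:8080 in the address bar and log in with the default username/password admin/admin.

Click the "LAUNCH INSTALL WIZARD" button to start creating your own big data platform.

1) Name the cluster

In this environment the cluster is named bigdata.

2) Select a Stack

A Stack is a collection of software from the Hadoop ecosystem. The higher the Stack version, the newer the software versions it contains. Here we choose HDP 3.0. Because the operating system is CentOS 7.4, we select redhat7 and remove all the other OS entries.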

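3) Specify the Agent machines

The hosts in this environment are bigdata-1, bigdata-2, and bigdata-3; Hadoop and the other packages will be installed on them. Provide the private key that was generated on the Ambari Server machine (by ssh-keygen; its public key has already been copied to the Ambari Agent machines). Do not select "Perform manual registration on hosts and do not use SSH", because we want Ambari Server to install the Ambari Agents automatically.

A warning says that the host names are not valid domain names; click the [Continue] button.

Install Ambari Agent

Ambari Server automatically installs Ambari Agent on the machines just listed. After installation, each Agent registers with Ambari Server. Once registration succeeds, click Next to continue.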

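Select the components to install

In this step we choose which services to install. The more you select, the more memory the machines need. Note that some Services depend on others: if you select a Service that depends on another Service, Ambari reminds you to install the dependency as well.

To save resources, select only the components you expect to use soon; others can be added later when needed.

A warning says that Ranger and some other services were not selected; click the button to proceed anyway.

Select the Master machines

The defaults are fine here.

Select Slaves and Clients

Resources are limited, so in this environment every option is set to ALL.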

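Component configuration

To keep things easy to remember, all passwords are set to bigdata.

For the data directories, choose a dedicated disk or, depending on your setup, a partition with plenty of space. If the virtual machine's root partition is small and the /home partition is large, create a /data symlink in the root partition that points to a directory on the larger partition (the directory settings do not accept paths under /home directly).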
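The symlink suggestion can be sketched as follows. This is only an illustration: link_data_dir is a made-up helper, and /home/bigdata-data is a hypothetical directory on the larger partition, not a path from this setup.

```shell
#!/bin/sh
# Hypothetical helper: create DATA_DIR on the large partition and point LINK at it.
# Usage: link_data_dir DATA_DIR LINK
link_data_dir() {
    target=$1 link=$2
    mkdir -p "$target" || return 1
    # Refuse to touch an existing path that is not a symlink (for example a real /data directory).
    if [ -e "$link" ] && [ ! -L "$link" ]; then
        echo "$link already exists and is not a symlink" >&2
        return 1
    fi
    # -n replaces an existing symlink instead of creating a link inside its target.
    ln -sfn "$target" "$link"
}

# Example (as root; the target path is an assumption):
# link_data_dir /home/bigdata-data /data
```

The data directories in the Ambari configuration pages can then use paths under /data.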

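Configuration summary:

Review:

Install the components

Ambari now installs the selected Services on the Ambari Agent machines. This can take quite a while because everything is installed online. When installation finishes, Ambari starts the Services.

The Ambari Dashboard

After installation you can view the Ambari Dashboard.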

Integrating OpenTSDB with Ambari

There are two ways to integrate OpenTSDB with Ambari:

1) Manual installation

Step 1: Download ambari-opentsdb-service

#git clone https://github.com/hortonworks-gallery/ambari-opentsdb-service.git /var/lib/ambari-server/resources/stacks/HDP/3.0/services/OPENTSDB

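Step 2: Modify master.py

Edit /var/lib/ambari-server/resources/stacks/HDP/3.0/services/OPENTSDB/package/scripts/master.py

Comment out the following lines (they were shown in a screenshot that is not reproduced here):

Step 3: Modify metainfo.xml

Edit /var/lib/ambari-server/resources/stacks/HDP/3.0/services/OPENTSDB/metainfo.xml

Step 4: Modify opentsdb-config.xml and README.md

Step 5: Restart the Ambari server and agent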

#ambari-server restart

#ambari-agent restart

Step 6: Download opentsdb-2.4.0.tar.gz and prepare the installation directory

#wget https://github.com/OpenTSDB/opentsdb/releases/download/v2.4.0/opentsdb-2.4.0.tar.gz -O /tmp/opentsdb.tar.gz

#mkdir /root/opentsdb

#tar -zxvf /tmp/opentsdb.tar.gz -C /root/opentsdb

#mv /root/opentsdb/*/* /root/opentsdb

#mv /root/opentsdb/opentsdb-2.4.0/.git /root/opentsdb/

#cd /root/opentsdb

#mkdir build

#cp -r third_party ./build

Step 7: Add the service in the Ambari web UI

Keep clicking Next until Deploy; the installation then completes.

2) Automatic installation

Step 1: Download ambari-opentsdb-service

#git clone https://github.com/hardybird/ambari-opentsdb-service.git /var/lib/ambari-server/resources/stacks/HDP/3.0/services/OPENTSDB

Step 2: Restart the Ambari server and agent

#ambari-server restart

#ambari-agent restart

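Step 3: Add the service in the Ambari web UI

Problems encountered during installation and their solutions

1) Hive cannot connect to its database during installation

On the node where ambari-server runs, execute: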

#ambari-server setup --jdbc-db=postgres --jdbc-driver=/usr/share/java/postgresql-jdbc.jar

Switch to the postgres user

[root@node2 yum.repos.d]# su - postgres

Enter the database command line

-bash-4.2$ psql

psql (9.2.13)

Type "help" for help.

postgres=#

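postgres=# \l (lists the databases; the hive database that Hive needs does not exist yet)

postgres=# create database hive; (creates the hive database that Hive needs)

CREATE DATABASE

postgres=# \l (the hive database now exists)

postgres=#

postgres=# create user hive; (creates the hive user that Hive needs)

CREATE ROLE

To be able to connect to the database, you also need to edit the configuration file /var/lib/pgsql/data/pg_hba.conf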

# TYPE DATABASE USER ADDRESS METHOD

# "local" is for Unix domain socket connections only

local all all peer

# IPv4 local connections:

host all all 127.0.0.1/32 ident

# IPv6 local connections:

host all all ::1/128 ident

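Add the following line (highlighted in red in the original post); it allows the connection without a password and without SSL. Replace hiveserverip with the IP address of the Hive server: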

hostnossl hive hive hiveserverip/32 trust

# Allow replication connections from localhost, by a user with the

# replication privilege.

#local replication postgres peer

#host replication postgres 127.0.0.1/32 ident

#host replication postgres ::1/128 ident
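The pg_hba.conf change can also be scripted. The sketch below rests on two assumptions: allow_hive_client is a hypothetical helper name, and 192.168.154.133 merely stands in for the real HiveServer address. PostgreSQL uses the first matching record in pg_hba.conf, so appending the rule works here because the earlier host lines only cover the loopback addresses.

```shell
#!/bin/sh
# Hypothetical helper: append a trust rule for database "hive" and user "hive"
# from one client IP, unless the identical rule is already present.
# Usage: allow_hive_client PG_HBA_FILE CLIENT_IP
allow_hive_client() {
    conf=$1 ip=$2
    rule="hostnossl hive hive $ip/32 trust"
    grep -qxF "$rule" "$conf" || printf '%s\n' "$rule" >> "$conf"
}

# Example (as the postgres user; the IP address is an assumption):
# allow_hive_client /var/lib/pgsql/data/pg_hba.conf 192.168.154.133
```

Because trust skips password checks entirely, the /32 mask keeps the rule limited to a single host.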

After editing the configuration, restart the postgresql service

-bash-4.2$ pg_ctl -m fast stop

-bash-4.2$ pg_ctl restart

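In the Hive configuration page of the web UI, test whether the hive database can now be reached.

2) YARN REGISTRY DNS fails to start

The log shows that port 53 is already in use. Run lsof -i:53 to see which process holds port 53; here it turned out to be dnsmasq. Stop and disable dnsmasq with the following commands: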

#systemctl stop dnsmasq

#systemctl disable dnsmasq

3) Fixes for the various errors reported when integrating Airflow

Problem 1: RuntimeError: Failed to execute command '/usr/bin/yum -y install python-pip', exited with code '1', message: '错误：无须任何处理 (yum's message, meaning "Error: Nothing to do")

yum -y install epel-release

Problem 2: resource_management.core.exceptions.ExecutionFailed: Execution of 'pip install --upgrade docutils pytest-runner Cython==0.28' returned 1. Collecting docutils

pip install --upgrade docutils==0.15 pytest-runner Cython==0.28 -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com

Problem 3: resource_management.core.exceptions.ExecutionFailed: Execution of 'export SLUGIFY_USES_TEXT_UNIDECODE=yes && pip install --upgrade --ignore-installed apache-airflow[all]==1.10.0' returned 1. DEPRECATION: Python 2.7 reached the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 is no longer maintained. A future version of pip will drop support for Python 2.7. More details about Python 2 support in pip, can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support

yum -y install epel-release

yum -y install numpy openldap-devel gpm-devel dbus-devel dbus-glib-devel dbus-python

pip install --upgrade docutils pytest-runner Cython==0.28 -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com

pip install --ignore-installed dogtag-pki

pip install --ignore-installed pyldap

pip install --ignore-installed dbus-python

pip install --ignore-installed dnspython

Problem 4: Building wheel for JPype1 (setup.py): finished with status 'error'

pip install wheel JPype1==0.7.0 marshmallow-sqlalchemy==0.18.0 marshmallow==2.18.0 SQLAlchemy==1.1.18

Problem 5: ERROR: Cannot uninstall 'numpy'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.

pip install --upgrade --ignore-installed numpy==1.16.6

export SLUGIFY_USES_TEXT_UNIDECODE=yes && pip install --upgrade apache-airflow[all]==1.10.0

export SLUGIFY_USES_TEXT_UNIDECODE=yes && pip install --upgrade apache-airflow[celery]==1.10.0

pip uninstall -y JPype1 flask-login flask jinja2 werkzeug tzlocal SQLAlchemy marshmallow-sqlalchemy marshmallow flask-appbuilder flask-jwt-extended click docutils msrest azure-datalake-store thrift boto3 botocore

pip install --ignore-installed flask-login==0.2.11 flask==0.12.4 jinja2==2.8 werkzeug==0.14.1 tzlocal==1.5.1 marshmallow-sqlalchemy==0.18.0 marshmallow==2.18.0 SQLAlchemy==1.1.18 click==6.7 flask-appbuilder==1.11.1 flask-jwt-extended==3.19.0 docutils==0.15 msrest==0.6.10 azure-datalake-store==0.0.48 thrift==0.9.3 boto3==1.4.4 botocore==1.5.0

export SLUGIFY_USES_TEXT_UNIDECODE=yes && pip install --upgrade JPype1==0.7.0 flask-login==0.2.11 flask==0.12.4 jinja2==2.8 werkzeug==0.14.1 tzlocal==1.5.1 marshmallow-sqlalchemy==0.18.0 marshmallow==2.18.0 SQLAlchemy==1.1.18 click==6.7 flask-appbuilder==1.11.1 flask-jwt-extended==3.19.0 docutils==0.15 msrest==0.6.10 azure-datalake-store==0.0.48 thrift==0.9.3 boto3==1.4.4 botocore==1.5.0 --ignore-installed apache-airflow[all]==1.10.0

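4) When integrating Elasticsearch, the cluster fails to start and reports that no master node can be found.

The master node and data node cannot be installed on the same host. If a node should be both a master node and a data node, set masters_also_are_datanodes to true in the configuration.

Also: after setting node.master: true, a restart from the web UI may overwrite the setting, so restart manually from the command line as root on each node: service elasticsearch restart

5) Integrating Flink 1.9.1 fails with resource_management.core.exceptions.Fail: User 'flink' doesn't exist

Run the following on every node: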

groupadd flink

useradd -g flink -d /home/flink flink

Flink 1.9.1 integration: configuration problems

Flink fails to start

Problem: resource_management.core.exceptions.ExecutionFailed: Execution of 'yarn application -list 2>/dev/null | awk '/flinkapp-from-ambari/ {print $1}' | head -n1 > /var/run/flink/flink.pid' returned 1. -bash: /var/run/flink/flink.pid: 没有那个文件或目录 (No such file or directory)

Solution:

mkdir /var/run/flink
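On CentOS 7, /var/run is a symlink to /run, which is a tmpfs, so a directory created there by hand disappears after a reboot. One possible way to make the fix permanent is a systemd-tmpfiles rule. In the sketch below, write_flink_tmpfiles is a hypothetical helper, and the flink:flink ownership is an assumption based on the user created in the previous problem.

```shell
#!/bin/sh
# Hypothetical helper: write a systemd-tmpfiles rule that recreates /run/flink at boot.
# Usage: write_flink_tmpfiles CONF_DIR   (normally /etc/tmpfiles.d)
write_flink_tmpfiles() {
    conf_dir=$1
    # Fields: type path mode user group age
    # "d" creates the directory if missing; "-" disables age-based cleanup.
    printf 'd /run/flink 0755 flink flink -\n' > "$conf_dir/flink.conf"
}

# Example (as root); the second command applies the rule immediately:
# write_flink_tmpfiles /etc/tmpfiles.d
# systemd-tmpfiles --create /etc/tmpfiles.d/flink.conf
```

If the Flink service writes its pid file as a different user, adjust the owner and group fields accordingly.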

OpenTSDB integration problems

Problem:

configure: error: cannot find install-sh, install.sh, or shtool in build-aux ".."/build-aux #2072

resource_management.core.exceptions.ExecutionFailed: Execution of 'cd /root/opentsdb; ./build.sh >> /var/log/opentsdb.log' returned 1. + test -f configure

+test -d build

+cd build

+test -f Makefile

+../configure
configure: error: cannot find install-sh, install.sh, or shtool in build-aux ".."/build-aux

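Solution:

The error is caused by the following line in the configure script:

for ac_dir in build-aux "$srcdir"/build-aux; do

The configure file is inside opentsdb-2.4.0.tar.gz. Change that line in configure to:

for ac_dir in build-aux $srcdir/build-aux; do

After the change, repack the archive and upload it to the installation package directory.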
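The edit and repack can be sketched as below. patch_configure is a hypothetical helper that only removes the quotes around $srcdir on that line, and the tar commands in the comments assume the archive unpacks into a top-level opentsdb-2.4.0 directory.

```shell
#!/bin/sh
# Hypothetical helper: replace  "$srcdir"/build-aux  with  $srcdir/build-aux
# in a configure script.
# Usage: patch_configure CONFIGURE_FILE
patch_configure() {
    file=$1
    tmp="$file.tmp.$$"
    # Single quotes keep $srcdir literal; the shell must not expand it here.
    sed 's|"\$srcdir"/build-aux|$srcdir/build-aux|g' "$file" > "$tmp" || return 1
    # Write the result back into the same file so its mode (executable bit) is kept.
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# Example:
# tar -zxf opentsdb-2.4.0.tar.gz
# patch_configure opentsdb-2.4.0/configure
# tar -zcf opentsdb-2.4.0.tar.gz opentsdb-2.4.0
```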

This problem has been reported as an issue on GitHub: https://github.com/OpenTSDB/opentsdb/issues/2072

The OpenTSDB web UI reports an error on queries:

Problem: Request failed: Internal Server Error net.opentsdb.core.IllegalDataException:Duplicate timestamp for key=[68, -110, -13, 90, 60, -97, 96, 0, 0, 1, 0, 0, 1, 0, 0, 2, 0, 0, 95, 0, 0, 3, 0, 0, 4], ms_offset=1356000, older=[66, -41, -21, -63], newer=[66, -41, -21, -60]; set tsd.storage.fix_duplicates=true to fix automatically or run Fsck

Solution:

Step 1: Append the setting tsd.storage.fix_duplicates=true to the OpenTSDB configuration file opentsdb.conf.

Step 2: In the Ambari web UI, change the start command of the opentsdb service so that it passes the opentsdb.conf file:

/build/tsdb tsd --zkbasedir=/hbase-unsecure --port=9999 --zkquorum=localhost:2181 --cachedir=/tmp/tsd --staticroot=build/staticroot --auto-metric --config src/opentsdb.conf
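Step 1 can be done by hand in an editor; as an alternative, here is a minimal sketch of a repeatable edit. set_conf_key is a hypothetical helper, and it assumes opentsdb.conf uses plain key = value lines.

```shell
#!/bin/sh
# Hypothetical helper: set KEY = VALUE in a properties-style file.
# An existing line for KEY is rewritten; otherwise the setting is appended.
# Usage: set_conf_key FILE KEY VALUE
set_conf_key() {
    file=$1 key=$2 value=$3
    tmp="$file.tmp.$$"
    touch "$file"
    awk -v k="$key" -v v="$value" '
        BEGIN { FS = "=" }
        {
            name = $1
            gsub(/[ \t]/, "", name)      # key with blanks removed
            if (name == k) { print k " = " v; done = 1 } else print
        }
        END { if (!done) print k " = " v }
    ' "$file" > "$tmp" && cat "$tmp" > "$file" && rm -f "$tmp"
}

# Example (path relative to the OpenTSDB directory, as in the start command):
# set_conf_key src/opentsdb.conf tsd.storage.fix_duplicates true
```

Commented-out lines such as #tsd.storage.fix_duplicates are left untouched, since their key does not match exactly.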

The OpenTSDB stop command fails:

resource_management.core.exceptions.ExecutionFailed: Execution of 'pkill -TERM -P `cat /var/run/opentsdb/opentsdb.pid` >/dev/null 2>&1' returned 1.

Solution:

Edit /var/lib/ambari-server/resources/stacks/HDP/3.0/services/OPENTSDB/package/scripts/master.py

Comment out this line: #Execute (format('pkill -TERM -P `cat {opentsdb_pidfile}` >/dev/null 2>&1'))
