
Hadoop Series. Installing Hive Metastore 3

박스님 2022. 1. 10. 00:05

In this post I cover running Hive metastore version 3 as a standalone service. Testing Hive table creation against it will follow in the next post; this post focuses on the installation itself.

 

For installing Hive metastore version 2, please refer to the post below.

https://box0830.tistory.com/366

 


Step 1. Preparation

# install JDK

sudo apt-get install openjdk-8-jdk -y
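
A quick optional check (not part of the original steps): confirm the JDK is installed and note where it lives, since that path is reused for JAVA_HOME later on.

# verify the JDK and locate it for JAVA_HOME
java -version
readlink -f /usr/bin/java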

 

# hadoop binary

# download
wget https://downloads.apache.org/hadoop/common/hadoop-3.2.2/hadoop-3.2.2.tar.gz

# extract
sudo tar -zxvf hadoop-3.2.2.tar.gz -C /opt/

# symbolic link
sudo ln -s /opt/hadoop-3.2.2 /opt/hadoop

# chown
sudo chown -R deploy. /opt/hadoop/
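
To make sure the extracted binary and the symlink are usable, a quick optional sanity check:

# verify the Hadoop installation
/opt/hadoop/bin/hadoop version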

 

Step 2. MySQL Setup

# install mysql-server

sudo apt-get install mysql-server

 

# edit the mysql configuration (custom port, comment out bind-address)

$ sudo vi /etc/mysql/mysql.conf.d/mysqld.cnf
port            = 59306
 
...
 
#bind-address           = 127.0.0.1

 

# restart

sudo service mysql restart

 

# check

$ netstat -anp | grep LISTEN
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 0.0.0.0:199             0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      -
tcp6       0      0 :::59306                :::*                    LISTEN      -
tcp6       0      0 :::22                   :::*                    LISTEN      -

 

# create the hive account

create user 'hive'@'localhost' identified by 'hive';
create user 'hive'@'%' identified by 'hive';
 
GRANT ALL PRIVILEGES ON *.* TO  'hive'@'localhost' IDENTIFIED BY 'hive';
GRANT ALL PRIVILEGES ON *.* TO  'hive'@'%' IDENTIFIED BY 'hive';
 
flush privileges;
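
Note that the GRANT ... IDENTIFIED BY form above only works on MySQL 5.7 and earlier; on MySQL 8.0 the password comes from CREATE USER and the GRANT is issued without IDENTIFIED BY. Either way, a quick connection test from the shell confirms the account and the custom port work together (credentials and port are the ones configured above):

# verify the hive account against the custom port
mysql -u hive -phive -h 127.0.0.1 -P 59306 -e 'SELECT CURRENT_USER();'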

 

Step 3. Metastore binary

Starting with Hive 3, a standalone metastore binary is provided separately, and that is what I used for this installation.

# download
wget https://repo1.maven.org/maven2/org/apache/hive/hive-standalone-metastore/3.1.2/hive-standalone-metastore-3.1.2-bin.tar.gz

# extract
sudo tar -zxvf hive-standalone-metastore-3.1.2-bin.tar.gz -C /opt/

# symbolic link
sudo ln -s /opt/apache-hive-metastore-3.1.2-bin /opt/metastore

# chown
sudo chown -R deploy. /opt/metastore/

 

# mysql connector 

Download the connector from the address below and place it in the metastore lib directory.

- https://dev.mysql.com/downloads/connector/j/

## cp
cp mysql-connector-java.jar /opt/metastore/lib/
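
As an alternative to the manual download, the Connector/J jar can be pulled straight from Maven Central (the version below is only an example; any Connector/J 8.x providing com.mysql.cj.jdbc.Driver should do):

# example: fetch Connector/J from Maven Central and drop it into the metastore lib directory
wget https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.28/mysql-connector-java-8.0.28.jar
cp mysql-connector-java-8.0.28.jar /opt/metastore/lib/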

 

# metastore-site.xml

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?><!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements.  See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License.  You may obtain a copy of the License at
 
       http://www.apache.org/licenses/LICENSE-2.0
 
   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
-->
<!-- These are default values meant to allow easy smoke testing of the metastore.  You will
likely need to add a number of new values. -->
<configuration>
  <property>
    <name>metastore.thrift.uris</name>
    <value>thrift://localhost:9083</value>
    <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
  </property>
  <property>
    <name>metastore.task.threads.always</name>
    <value>org.apache.hadoop.hive.metastore.events.EventCleanerTask</value>
  </property>
  <property>
    <name>metastore.expression.proxy</name>
    <value>org.apache.hadoop.hive.metastore.DefaultPartitionExpressionProxy</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://story-hive-metastore-test-02:59306/hive?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.cj.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>
  </property>
</configuration>
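
The standalone metastore picks this file up from its conf directory, so with the symlink used here it should end up at /opt/metastore/conf/metastore-site.xml (adjust the path if your layout differs):

# place the configuration where the standalone metastore expects it
vi /opt/metastore/conf/metastore-site.xml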

 

# environment variables

HADOOP_HOME=/opt/hadoop
JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64
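
These two variables have to be exported in the shell that runs schematool and start-metastore; a minimal sketch for persisting them for the deploy user (paths as above):

# export the variables so the metastore scripts can find Hadoop and the JDK
echo 'export HADOOP_HOME=/opt/hadoop' >> ~/.bashrc
echo 'export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64' >> ~/.bashrc
source ~/.bashrc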

 

Step 4. Schema Initialization

/opt/metastore/bin/schematool -dbType mysql -initSchema

 

# Guava Error

During this step I ran into Hadoop's regular visitor, the Guava error. The details are below.


Error message

Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
    at org.apache.hadoop.hive.metastore.conf.MetastoreConf.newMetastoreConf(MetastoreConf.java:1179)
    at org.apache.hadoop.hive.metastore.tools.MetastoreSchemaTool.<init>(MetastoreSchemaTool.java:104)
    at org.apache.hadoop.hive.metastore.tools.MetastoreSchemaTool.run(MetastoreSchemaTool.java:1222)
    at org.apache.hadoop.hive.metastore.tools.MetastoreSchemaTool.main(MetastoreSchemaTool.java:1178)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:236)

 

Problem

The Guava version bundled with the metastore differs from the one shipped with hadoop common.

$ ls -tal /opt/metastore/lib/*guava*
-rw-r--r-- 1 deploy deploy 2308517 Sep 26  2018 /opt/metastore/lib/guava-19.0.jar
 
$ ls -tal /opt/hadoop/share/hadoop/common/lib/*guava*
-rw-r--r-- 1 deploy deploy 2747878 Jan  3  2021 /opt/hadoop/share/hadoop/common/lib/guava-27.0-jre.jar
-rw-r--r-- 1 deploy deploy    2199 Jan  3  2021 /opt/hadoop/share/hadoop/common/lib/listenablefuture-9999.0-empty-to-avoid-conflict-with-guava.jar

 

Fix

Standardize both sides on Guava 27.0.

rm -f /opt/metastore/lib/guava-19.0.jar
cp /opt/hadoop/share/hadoop/common/lib/guava-27.0-jre.jar /opt/metastore/lib
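
A quick check that only the 27.0 jar now remains on the metastore side:

# confirm the Guava versions match after the swap
ls -tal /opt/metastore/lib/*guava*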

 

Reference

 

Guava conflict - hadoop library; maven-shade-plugin (box0830.tistory.com)

 

# Execution result

....

511/511      -- Dump completed on 2012-08-23  0:56:31
Closing: com.mysql.cj.jdbc.ConnectionImpl
sqlline version 1.3.0
Initialization script completed
schemaTool completed
2021-12-22 10:06:16,465 shutdown-hook-0 INFO Log4j appears to be running in a Servlet environment, but there's no log4j-web module available. If you want better web container support, please add the log4j-web JAR to your web archive or server lib directory.
2021-12-22 10:06:16,469 shutdown-hook-0 INFO Log4j appears to be running in a Servlet environment, but there's no log4j-web module available. If you want better web container support, please add the log4j-web JAR to your web archive or server lib directory.
2021-12-22 10:06:16,478 shutdown-hook-0 WARN Unable to register Log4j shutdown hook because JVM is shutting down. Using SimpleLogger
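
Once initialization has completed, the recorded schema version can optionally be double-checked with schematool's -info option (same environment variables as above):

# optional: print the metastore schema version stored in MySQL
/opt/metastore/bin/schematool -dbType mysql -info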

 

# Verify in MySQL

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| hive               |
| mysql              |
| performance_schema |
| sys                |
+--------------------+
5 rows in set (0.00 sec)
mysql> use hive;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
 
Database changed
mysql> show tables;
+-------------------------------+
| Tables_in_hive                |
+-------------------------------+
| AUX_TABLE                     |
| BUCKETING_COLS                |
| CDS                           |
| COLUMNS_V2                    |
| COMPACTION_QUEUE              |
| COMPLETED_COMPACTIONS         |
| COMPLETED_TXN_COMPONENTS      |
| CTLGS                         |
| DATABASE_PARAMS               |
| DBS                           |
| DB_PRIVS                      |
| DELEGATION_TOKENS             |
| FUNCS                         |
| FUNC_RU                       |
| GLOBAL_PRIVS                  |
| HIVE_LOCKS                    |
| IDXS                          |
| INDEX_PARAMS                  |
| I_SCHEMA                      |
| KEY_CONSTRAINTS               |
| MASTER_KEYS                   |
| MATERIALIZATION_REBUILD_LOCKS |
| METASTORE_DB_PROPERTIES       |
| MIN_HISTORY_LEVEL             |
| MV_CREATION_METADATA          |
| MV_TABLES_USED                |
| NEXT_COMPACTION_QUEUE_ID      |
| NEXT_LOCK_ID                  |
| NEXT_TXN_ID                   |
| NEXT_WRITE_ID                 |
| NOTIFICATION_LOG              |
| NOTIFICATION_SEQUENCE         |
| NUCLEUS_TABLES                |
| PARTITIONS                    |
| PARTITION_EVENTS              |
| PARTITION_KEYS                |
| PARTITION_KEY_VALS            |
| PARTITION_PARAMS              |
| PART_COL_PRIVS                |
| PART_COL_STATS                |
| PART_PRIVS                    |
| REPL_TXN_MAP                  |
| ROLES                         |
| ROLE_MAP                      |
| RUNTIME_STATS                 |
| SCHEMA_VERSION                |
| SDS                           |
| SD_PARAMS                     |
| SEQUENCE_TABLE                |
| SERDES                        |
| SERDE_PARAMS                  |
| SKEWED_COL_NAMES              |
| SKEWED_COL_VALUE_LOC_MAP      |
| SKEWED_STRING_LIST            |
| SKEWED_STRING_LIST_VALUES     |
| SKEWED_VALUES                 |
| SORT_COLS                     |
| TABLE_PARAMS                  |
| TAB_COL_STATS                 |
| TBLS                          |
| TBL_COL_PRIVS                 |
| TBL_PRIVS                     |
| TXNS                          |
| TXN_COMPONENTS                |
| TXN_TO_WRITE_ID               |
| TYPES                         |
| TYPE_FIELDS                   |
| VERSION                       |
| WM_MAPPING                    |
| WM_POOL                       |
| WM_POOL_TO_TRIGGER            |
| WM_RESOURCEPLAN               |
| WM_TRIGGER                    |
| WRITE_SET                     |
+-------------------------------+
74 rows in set (0.00 sec)

 

Step 5. Start the Metastore

$ nohup /opt/metastore/bin/start-metastore 2>&1 &
$ cat nohup.out
2021-12-23 02:48:24: Starting Metastore Server
2021-12-23 02:48:27,842 main INFO Log4j appears to be running in a Servlet environment, but there's no log4j-web module available. If you want better web container support, please add the log4j-web JAR to your web archive or server lib directory.

 

# check

$ ps -ef | grep metastore
deploy    3256  2672 99 02:48 pts/0    00:00:15 /usr/lib/jvm/java-1.8.0-openjdk-amd64/bin/java -Dproc_jar -Dproc_metastore -Dlog4j.configurationFile=metastore-log4j2.properties -Dyarn.log.dir=/opt/hadoop-3.2.2/logs -Dyarn.log.file=hadoop.log -Dyarn.home.dir=/opt/hadoop-3.2.2 -Dyarn.root.logger=INFO,console -Djava.library.path=/opt/hadoop-3.2.2/lib/native -Xmx256m -Dhadoop.log.dir=/opt/hadoop-3.2.2/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/hadoop-3.2.2 -Dhadoop.id.str=deploy -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /opt/metastore/lib/hive-standalone-metastore-3.1.2.jar org.apache.hadoop.hive.metastore.HiveMetaStore
deploy    3410  2672  0 02:48 pts/0    00:00:0
$ netstat -anp | grep LISTEN
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:199             0.0.0.0:*               LISTEN      -
tcp6       0      0 :::22                   :::*                    LISTEN      -
tcp6       0      0 :::9083                 :::*                    LISTEN      3256/java
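
If nc is available, a one-line probe of the Thrift endpoint also works (9083 is the port from metastore.thrift.uris configured earlier):

# confirm the Thrift port answers
nc -z localhost 9083 && echo "metastore thrift port is open"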

 

# Warning raised during schema initialization

The deprecated NO_AUTO_CREATE_USER SQL mode triggered the warning message below during initialization. I leave it here for reference.

Warning: Changing sql mode 'NO_AUTO_CREATE_USER' is deprecated. It will be removed in a future release. (state=HY000,code=3090)
java.sql.SQLWarning: Changing sql mode 'NO_AUTO_CREATE_USER' is deprecated. It will be removed in a future release.
    at com.mysql.cj.protocol.a.NativeProtocol.convertShowWarningsToSQLWarnings(NativeProtocol.java:2138)
    at com.mysql.cj.jdbc.StatementImpl.getWarnings(StatementImpl.java:1731)
    at sqlline.Commands.execute(Commands.java:849)
    at sqlline.Commands.sql(Commands.java:733)
    at sqlline.SqlLine.dispatch(SqlLine.java:795)
    at sqlline.SqlLine.runCommands(SqlLine.java:1706)
    at sqlline.Commands.run(Commands.java:1317)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38)
    at sqlline.SqlLine.dispatch(SqlLine.java:791)
    at sqlline.SqlLine.initArgs(SqlLine.java:595)
    at sqlline.SqlLine.begin(SqlLine.java:643)
    at org.apache.hadoop.hive.metastore.tools.MetastoreSchemaTool.runSqlLine(MetastoreSchemaTool.java:1034)
    at org.apache.hadoop.hive.metastore.tools.MetastoreSchemaTool.runSqlLine(MetastoreSchemaTool.java:1007)
    at org.apache.hadoop.hive.metastore.tools.MetastoreSchemaTool.doInit(MetastoreSchemaTool.java:596)
    at org.apache.hadoop.hive.metastore.tools.MetastoreSchemaTool.doInit(MetastoreSchemaTool.java:574)
    at org.apache.hadoop.hive.metastore.tools.MetastoreSchemaTool.run(MetastoreSchemaTool.java:1273)
    at org.apache.hadoop.hive.metastore.tools.MetastoreSchemaTool.main(MetastoreSchemaTool.java:1178)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:236)

 
