A lot changed with HDP3.
The most visible change is that Hadoop 3 is now included; what immediately stood out to me is that Falcon is gone, and Flume along with it.
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.3/bk_release-notes/content/deprecated_items.html
Deprecated Components and Product Capabilities
The following components are marked deprecated from HDP and will be removed in a future HDP release:
| Component or Capability | Status | Marked Deprecated as of | Target Release for Removal | Comments |
|---|---|---|---|---|
| Apache Falcon | Deprecated | HDP 2.6.0 | HDP 3.0.0 | Contact your Hortonworks account team for the replacement options. |
| Apache Flume | Deprecated | HDP 2.6.0 | HDP 3.0.0 | Consider Hortonworks DataFlow as an alternative for Flume use cases. |
| Apache Mahout | Deprecated | HDP 2.6.0 | HDP 3.0.0 | Consider Apache Spark as an alternative depending on the workload. |
| Apache Slider | Deprecated | HDP 2.6.0 | HDP 3.0.0 | Apache Slider functionality will be absorbed by Apache YARN. |
| Cascading | Deprecated | HDP 2.6.0 | HDP 3.0.0 | |
| Hue | Deprecated | HDP 2.6.0 | HDP 3.0.0 | Consider Ambari Views as the alternative. |
Flume was on that list, and sure enough it has been removed outright in HDP3.
NiFi was supposed to take over that role when it came out, yet NiFi is nowhere to be found (within HDP, that is).
Strictly speaking, NiFi is now delivered through a separate platform called HDF (Hortonworks DataFlow). (You can also install it from the RPMs Hortonworks provides.)
But I want to use NiFi and still manage it from Ambari...
Installing HDF gives you another Ambari-managed platform containing Storm and NiFi, whereas what I want is to install those services onto my existing HDP cluster.
Fortunately, Hortonworks provides an option for exactly that:
https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.2.0/planning-your-deployment/content/deployment_scenarios.html
| Scenario | Installation Scenario |
|---|---|
| Installing an HDF Cluster | This scenario applies if you want to install the entire HDF platform, consisting of all flow management and stream processing components, on a new cluster. The stream processing components include the new Streaming Analytics Manager (SAM) modules that are in GA (General Availability). This includes the SAM Stream Builder and Stream Operations modules but does not include installing the technical preview version of SAM Stream Insight, which is powered by Druid and Superset. This scenario requires that you install an HDF cluster. |
| Installing HDF Services on a New HDP Cluster | This scenario applies if you are both a Hortonworks Data Platform (HDP) and HDF customer and you want to install a fresh cluster of HDP and add HDF services. The stream processing components include the new Streaming Analytics Manager (SAM) and all of its modules. This includes installing the technical preview version of the SAM Stream Insight module, which is powered by Druid and Apache Superset. This scenario requires that you install both an HDF cluster and an HDP cluster. |
| Installing HDF Services on an Existing HDP Cluster | You have an existing HDP cluster with Apache Storm and/or Apache Kafka services and want to install Apache NiFi or NiFi Registry modules on that cluster. This requires that you upgrade to the latest version of Apache Ambari and HDP, and then use Ambari to add HDF services to the upgraded HDP cluster. |
Installing HDF Services on an Existing HDP Cluster
For most people, I'd guess the common case is this one: adding HDF onto an existing HDP cluster.
1. Upgrade Ambari -- this just says to upgrade to the latest version of Ambari (a rough sketch of the usual commands follows below).
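For reference, the Ambari upgrade on CentOS 7 is roughly the following. This is a sketch, not the exact procedure; the repo URL and version are placeholders, so take the real ones from the Ambari upgrade guide for your target version.

```bash
# Point yum at the newer Ambari repo (placeholder URL -- take the real one
# from the Ambari upgrade documentation for your target version).
wget -nv http://<ambari-repo-host>/ambari/centos7/2.x/updates/<version>/ambari.repo \
  -O /etc/yum.repos.d/ambari.repo

ambari-server stop
yum clean all
yum upgrade -y ambari-server

# Upgrade Ambari's own database schema, then start it back up.
ambari-server upgrade
ambari-server start
```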
2. Upgrade HDP -- if you are on HDP 2.6, move up to 3.x --> I'm already on HDP3, so this can be skipped.
3. Prepare the metadata DBs that HDF will use. The options are Oracle, Postgres, and MySQL; I went with Postgres.
https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.2.0/installing-hdf-and-hdp/content/installing_databases.html
You just run the few lines of SQL from that page against Postgres. It is almost all CREATE DATABASE and similar statements, so it does not touch any of the tables the existing Ambari uses; it should be safe to run (a sketch follows the excerpt below).
- Configure Postgres to Allow Remote Connections
It is critical that you configure Postgres to allow remote connections before you deploy a cluster. If you do not perform these steps in advance of installing your cluster, the installation fails.
- Configure SAM and Schema Registry Metadata Stores in Postgres
If you have already installed MySQL and configured SAM and Schema Registry metadata stores using MySQL, you do not need to configure additional metadata stores in Postgres.
- Configure Druid and Superset Metadata Stores in Postgres
Druid and Superset require a relational data store to store metadata. To use Postgres for this, install Postgres and create a database for the Druid metastore. If you have already created a data store using MySQL, you do not need to configure additional metadata stores in Postgres.
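Putting those three items together, the Postgres preparation looks roughly like this. It is a minimal sketch under a few assumptions: a CentOS-style Postgres layout under /var/lib/pgsql/data, and illustrative database/user names and passwords (registry, streamline, druid, superset) -- use the exact statements and your own credentials from the linked page.

```bash
# Sketch only; paths, names, and passwords below are assumptions.

# 1) Allow remote connections -- SAM/Schema Registry connect over the network,
#    and the install fails if this is not done in advance.
echo "listen_addresses = '*'" >> /var/lib/pgsql/data/postgresql.conf
echo "host all all 0.0.0.0/0 md5" >> /var/lib/pgsql/data/pg_hba.conf
systemctl restart postgresql

# 2) One database + owner per HDF metadata store.
sudo -u postgres psql <<'SQL'
CREATE DATABASE registry;
CREATE USER registry WITH PASSWORD 'registry';
GRANT ALL PRIVILEGES ON DATABASE registry TO registry;

CREATE DATABASE streamline;
CREATE USER streamline WITH PASSWORD 'streamline';
GRANT ALL PRIVILEGES ON DATABASE streamline TO streamline;

-- Druid and Superset stores, only needed if you install SAM Stream Insight:
CREATE DATABASE druid;
CREATE USER druid WITH PASSWORD 'druid';
GRANT ALL PRIVILEGES ON DATABASE druid TO druid;

CREATE DATABASE superset;
CREATE USER superset WITH PASSWORD 'superset';
GRANT ALL PRIVILEGES ON DATABASE superset TO superset;
SQL
```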
4. Install HDF Management Pack
This step adds the HDF stack to Ambari.
https://docs.hortonworks.com/HDPDocuments/HDF3/HDF-3.1.2/bk_release-notes/content/ch_hdf_relnotes.html#repo-location
You can find the management pack file location in the release notes:
Table 1.5. RHEL/Oracle Linux/CentOS 6 HDF repository & additional download locations
Table 1.6. RHEL/Oracle Linux/CentOS 7 HDF repository & additional download locations
Table 1.7. SLES 11 SP3/SP4 HDF repository & additional download locations
Table 1.8. SLES 12 HDF repository & additional download locations
Table 1.9. Ubuntu 14 HDF repository & additional download locations
Table 1.10. Ubuntu 16 HDF repository & additional download locations
Table 1.11. Debian 7 HDF repository & additional download locations
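With the repo location from those tables in hand, installing the management pack is a couple of ambari-server commands. A sketch, assuming you run it as root on the Ambari server host; the tarball URL is a placeholder for the hdf-ambari-mpack location listed for your OS:

```bash
# Install the HDF management pack into Ambari.
# Replace the URL with the hdf-ambari-mpack tarball location from the
# release-note tables above.
ambari-server install-mpack \
  --mpack=http://<hdf-repo-host>/HDF/<os>/3.x/updates/<version>/tars/hdf_ambari_mp/hdf-ambari-mpack-<version>.tar.gz \
  --verbose

# Restart Ambari so it picks up the new HDF stack definition.
ambari-server restart
```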
Once that is done and Ambari has been restarted, the HDF services will be available to add.
Falcon likewise disappeared with HDP3.
https://hortonworks.com/products/data-services/
It seems to have become the DataPlane Service platform (DPS?). Presumably that can be added and used in a similar way... Things change completely with every release, but HDP is what my hands are used to, so I keep using it...
'Study > Bigdata' 카테고리의 다른 글
HDP3 에서 Spark 로 Hive Table 를 조회했는데 빈값이 나온경우 (0) | 2018.10.03 |
---|---|
HDP3 spark, pyspark, zepplin에서 database가 안보일때, (2) | 2018.09.19 |
HDP3 제플린(Zepplin) 스케쥴(Cron) 활성화 (0) | 2018.09.04 |
Spark(Yarn) + Intellj 원격 디버깅 하기 (0) | 2018.08.21 |
intellj, Spark Assembly (0) | 2018.08.17 |
Hive Metastore not working - Syntax error 'OPTION SQL_SELECT_LIMIT=DEFAULT' at line 1 (0) | 2018.08.02 |