What is this document?
CAUTION: These playbooks are being refactored to use Bigtop to construct clusters. This documentation will therefore also need to be revised as the playbooks change.
This is the documentation for ansible-bigdata, a set of Ansible playbooks for configuring Hadoop environments.
Contents
- 1. Abstract
- 2. About hosts (inventory file)
- 3. About groups in inventory
- 4. How to use these playbooks
- 4.1. Assumption of this section
- 4.2. How to configure Ansible execution environment
- 4.3. How to boot EC2 instances for Hadoop cluster
- 4.4. How to configure host names of nodes
- 4.5. How to configure Bigtop HDFS/YARN environment
- 4.6. How to configure Bigtop Pseudo environment
- 4.7. How to install Ganglia environment
- 4.8. How to install and configure InfluxDB and Grafana
- 4.9. How to install Spark community edition
- 4.10. Configure Zeppelin
- 4.11. Configure Kafka cluster
- 4.12. Configure Confluent services
- 4.13. Configure Ambari
- 4.14. Configure Jenkins
- 4.15. Configure Anaconda CE
- 4.16. Configure Hive
- 4.17. Configure Pseudo Alluxio
- 4.18. Configure Alluxio on YARN
- 4.19. Configure TPC-DS
- 4.20. Configure Keras and Tensorflow
- 5. About playbooks
- 5.1. Playbooks for configuration
- 5.1.1. common
- 5.1.2. cdh5
- 5.1.3. cdh5_pseudo
- 5.1.4. ansible
- 5.1.5. ganglia
- 5.1.6. influxdb
- 5.1.7. spark_comm
- 5.1.8. zeppelin
- 5.1.9. fluentd
- 5.1.10. kafka
- 5.1.11. confluent
- 5.1.12. ambari
- 5.1.13. jenkins
- 5.1.14. anacondace
- 5.1.15. postgresql
- 5.1.16. cdh5_hive
- 5.1.17. alluxio_yarn
- 5.1.18. tpc_ds
- 5.1.19. tensorflow
- 5.2. Playbooks for operation
- 6. About roles
- 6.1. Roles to configure basic environments
- 6.2. Roles to configure Ansible
- 6.3. Roles to boot EC2 instances for Hadoop cluster
- 6.4. Roles to configure CDH5 Hadoop
- 6.5. Roles to configure CDH5 pseudo Hadoop
- 6.6. Roles to configure Spark core on client node
- 6.7. Roles to configure Ganglia
- 6.8. Roles to configure InfluxDB and Grafana
- 6.9. Roles to configure Zeppelin
- 6.10. Roles to configure fluentd or td-agent
- 6.11. Roles to configure Kafka
- 6.12. Roles to configure Confluent
- 6.13. Roles to configure Ambari
- 6.14. Roles to configure CI environment
- 6.15. Roles to configure Anaconda CE
- 6.16. Roles to configure PostgreSQL
- 6.17. Roles to configure Hive
- 6.18. Roles to configure Alluxio
- 6.19. Roles to configure TPC-DS