Ansible

What is Hive? Apache Hive is a distributed, fault-tolerant data warehouse system, built on top of Hadoop, designed to simplify and streamline the processing of large datasets. Through Hive, a user can manage and analyze massive volumes of data by organizing it into tables, resembling a traditional relational database. Hive uses the HiveQL (HQL) language, which is very similar to SQL. These SQL-like queries get translated into MapReduce tasks, leveraging the power of Hadoop’s MapReduce functionalities while bypassing the need to know how to program MapReduce jobs. ...

Ansible

Using Ansible to install Hive on a Spark cluster.

Using Ansible to remotely configure a cluster.