Use Sparkling Water in Windows Environments
Windows environments require several additional steps to run Spark and Sparkling Water. A great summary of the configuration steps is available here.
To use Sparkling Water in Windows environments:
- Download the appropriate Spark distribution from the Spark Downloads page. 
- Point the SPARK_HOME variable to the location of your Spark distribution:
    SET SPARK_HOME=<location of your downloaded Spark distribution>
- From https://github.com/steveloughran/winutils, download winutils.exe for the Hadoop version that is referenced by your Spark distribution. For example, for spark-3.1.2-bin-hadoop2.7.tgz, you need winutils.exe for Hadoop 2.7.
- Move winutils.exe into a new directory, %SPARK_HOME%\hadoop\bin, and set:
    SET HADOOP_HOME=%SPARK_HOME%\hadoop
- Create a new file, %SPARK_HOME%\hadoop\conf\hive-site.xml, which sets up a default Hive scratch directory. The best location is a writable temporary directory, for example %TEMP%\hive:
    <configuration>
      <property>
        <name>hive.exec.scratchdir</name>
        <value>PUT HERE LOCATION OF TEMP FOLDER</value>
        <description>Scratch space for Hive jobs</description>
      </property>
    </configuration>
  Note: You can also use the Hive default scratch directory, which is c:\tmp\hive. In this case, you need to create the directory manually and call winutils.exe chmod -R 777 c:\tmp\hive to set up the correct permissions.
- Set the HADOOP_CONF_DIR variable:
    SET HADOOP_CONF_DIR=%SPARK_HOME%\hadoop\conf
- Go to the Sparkling Water directory and run the Sparkling Water shell:
    bin/sparkling-shell.cmd
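
The file-related steps above can also be scripted. The following is a minimal Python sketch, not part of Sparkling Water: the helper names write_hive_site and check_layout are hypothetical, and it assumes the directory layout described in the steps (hadoop\bin\winutils.exe and hadoop\conf\hive-site.xml under the Spark distribution). It writes a hive-site.xml with the hive.exec.scratchdir property and reports which of the expected files are still missing. It does not set any environment variables or file permissions.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def write_hive_site(conf_dir, scratch_dir):
    """Write hive-site.xml with hive.exec.scratchdir set; return the file path.

    Hypothetical helper: conf_dir is expected to be <SPARK_HOME>/hadoop/conf.
    """
    conf_dir = Path(conf_dir)
    conf_dir.mkdir(parents=True, exist_ok=True)

    # Build the same structure as the hive-site.xml shown in the steps above.
    configuration = ET.Element("configuration")
    prop = ET.SubElement(configuration, "property")
    ET.SubElement(prop, "name").text = "hive.exec.scratchdir"
    ET.SubElement(prop, "value").text = str(scratch_dir)
    ET.SubElement(prop, "description").text = "Scratch space for Hive jobs"

    target = conf_dir / "hive-site.xml"
    ET.ElementTree(configuration).write(
        str(target), encoding="utf-8", xml_declaration=True
    )
    return target


def check_layout(spark_home):
    """Return a list of files from the steps above that are still missing."""
    hadoop_home = Path(spark_home) / "hadoop"
    problems = []
    if not (hadoop_home / "bin" / "winutils.exe").is_file():
        problems.append("missing hadoop\\bin\\winutils.exe")
    if not (hadoop_home / "conf" / "hive-site.xml").is_file():
        problems.append("missing hadoop\\conf\\hive-site.xml")
    return problems
```

For example, calling write_hive_site with the hadoop\conf directory under your Spark distribution and a writable scratch directory, and then calling check_layout with the Spark distribution directory, returns an empty list once winutils.exe is also in place. The SPARK_HOME, HADOOP_HOME, and HADOOP_CONF_DIR variables still need to be set with SET as described above.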