H2O Documentation
Data Science in H2O
Generalized Linear Model (GLM)
Defining a GLM Model
GLMgrid Models
Interpreting a Model
Validate GLM
Cross Validation
Cost of Computation
GLM Algorithm
References
K-Means
When to use K-Means
Defining a K-Means model
Interpreting a Model
References
K-Means Algorithm
References
Random Forest (RF)
When to use RF
Defining a Model
Interpreting Results
RF Error Rates
Random Forest Data Science
Principal Components Analysis
Defining a PCA Model
Interpreting Results
Notes on the application of PCA
Summary
Inputs
Output
Gradient Boosted Regression and Classification
Defining a GBM Model
Treatment of Factors
Interpreting Results
GBM Algorithm
Reference
Naive Bayes
Defining a Naive Bayes Model
Naive Bayes Algorithm and Implementation
Deep Learning
Defining a Deep Learning Model
Interpreting the Model
References
Data Science and Machine Learning
SGD
References
References
Recommended Reading
GLM
Poisson
Logistic (binomial and multinomial)
GBM
Neural Networks
Tweedie
K-Means