Parameters of H2ODeepLearning
Affected Classes
- ai.h2o.sparkling.ml.algos.H2ODeepLearning
- ai.h2o.sparkling.ml.algos.classification.H2ODeepLearningClassifier
- ai.h2o.sparkling.ml.algos.regression.H2ODeepLearningRegressor
Parameters
- Each parameter also has a corresponding getter and setter method (e.g. label -> getLabel(), setLabel(...)).
- activation
- Activation function. Possible values are - "Tanh",- "TanhWithDropout",- "Rectifier",- "RectifierWithDropout",- "Maxout",- "MaxoutWithDropout",- "ExpRectifier",- "ExpRectifierWithDropout".- Default value: - "Rectifier"- Also available on the trained model. 
- adaptiveRate
- Adaptive learning rate. - Scala default value: - true; Python default value:- True- Also available on the trained model. 
- ignoredCols
- Names of columns to ignore for training. - Scala default value: - null; Python default value:- None- Also available on the trained model. 
- initialBiases
- An array of weight vectors to be used for bias initialization of every network layer. If this parameter is set, the parameter ‘initialWeights’ has to be set as well. - Scala default value: - null; Python default value:- None
- initialWeights
- An array of weight matrices to be used for initialization of the neural network. If this parameter is set, the parameter ‘initialBiases’ has to be set as well. - Scala default value: - null; Python default value:- None
- aucType
- Set default multinomial AUC type. Possible values are - "AUTO",- "NONE",- "MACRO_OVR",- "WEIGHTED_OVR",- "MACRO_OVO",- "WEIGHTED_OVO".- Default value: - "AUTO"- Also available on the trained model. 
- autoencoder
- Auto-Encoder. - Scala default value: - false; Python default value:- False- Also available on the trained model. 
- averageActivation
- Average activation for sparse auto-encoder. #Experimental. - Default value: - 0.0- Also available on the trained model. 
- balanceClasses
- Balance training data class counts via over/under-sampling (for imbalanced data). - Scala default value: - false; Python default value:- False- Also available on the trained model. 
- calculateFeatureImportances
- Compute variable importances for input features (Gedeon method) - can be slow for large networks. - Scala default value: - true; Python default value:- True- Also available on the trained model. 
- categoricalEncoding
- Encoding scheme for categorical features. Possible values are - "AUTO",- "OneHotInternal",- "OneHotExplicit",- "Enum",- "Binary",- "Eigen",- "LabelEncoder",- "SortByResponse",- "EnumLimited".- Default value: - "AUTO"- Also available on the trained model. 
- classSamplingFactors
- Desired over/under-sampling ratios per class (in lexicographic order). If not specified, sampling factors will be automatically computed to obtain class balance during training. Requires balance_classes. - Scala default value: - null; Python default value:- None- Also available on the trained model. 
- classificationStop
- Stopping criterion for classification error fraction on training data (-1 to disable). - Default value: - 0.0- Also available on the trained model. 
- columnsToCategorical
- List of columns to convert to categorical before modelling - Scala default value: - Array(); Python default value:- []
- convertInvalidNumbersToNa
- If set to ‘true’, the model converts invalid numbers to NA when making predictions. - Scala default value: - false; Python default value:- False- Also available on the trained model. 
- convertUnknownCategoricalLevelsToNa
- If set to ‘true’, the model converts unknown categorical levels to NA when making predictions. - Scala default value: - false; Python default value:- False- Also available on the trained model. 
- detailedPredictionCol
- Column containing additional prediction details, its content depends on the model type. - Default value: - "detailed_prediction"- Also available on the trained model. 
- diagnostics
- Enable diagnostics for hidden layers. - Scala default value: - true; Python default value:- True- Also available on the trained model. 
- distribution
- Distribution function. Possible values are - "AUTO",- "bernoulli",- "quasibinomial",- "modified_huber",- "multinomial",- "ordinal",- "gaussian",- "poisson",- "gamma",- "tweedie",- "huber",- "laplace",- "quantile",- "fractionalbinomial",- "negativebinomial",- "custom".- Default value: - "AUTO"- Also available on the trained model. 
- elasticAveraging
- Elastic averaging between compute nodes can improve distributed model convergence. #Experimental. - Scala default value: - false; Python default value:- False- Also available on the trained model. 
- elasticAveragingMovingRate
- Elastic averaging moving rate (only if elastic averaging is enabled). - Default value: - 0.9- Also available on the trained model. 
- elasticAveragingRegularization
- Elastic averaging regularization strength (only if elastic averaging is enabled). - Default value: - 0.001- Also available on the trained model. 
- epochs
- How many times the dataset should be iterated (streamed), can be fractional. - Default value: - 10.0- Also available on the trained model. 
- epsilon
- Adaptive learning rate smoothing factor (to avoid divisions by zero and allow progress). - Scala default value: - 1.0e-8; Python default value:- 1.0E-8- Also available on the trained model. 
- exportCheckpointsDir
- Automatically export generated models to this directory. - Scala default value: - null; Python default value:- None- Also available on the trained model. 
- exportWeightsAndBiases
- Whether to export Neural Network weights and biases to H2O Frames. - Scala default value: - false; Python default value:- False- Also available on the trained model. 
- fastMode
- Enable fast mode (minor approximation in back-propagation). - Scala default value: - true; Python default value:- True- Also available on the trained model. 
- featuresCols
- Name of feature columns - Scala default value: - Array(); Python default value:- []- Also available on the trained model. 
- foldAssignment
- Cross-validation fold assignment scheme, if fold_column is not specified. The ‘Stratified’ option will stratify the folds based on the response variable, for classification problems. Possible values are - "AUTO",- "Random",- "Modulo",- "Stratified".- Default value: - "AUTO"- Also available on the trained model. 
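To make the "Modulo" option of foldAssignment concrete, here is a minimal sketch that assumes "Modulo" assigns each row to the fold given by its row index modulo nfolds. The helper name `modulo_fold` is hypothetical and is not part of Sparkling Water; the "Random" and "Stratified" schemes are not covered.

```python
def modulo_fold(row_index, nfolds):
    """Fold index for a row under an assumed modulo scheme (illustration only)."""
    if nfolds < 2:
        raise ValueError("nfolds must be at least 2 for cross-validation")
    return row_index % nfolds


# Seven rows spread over three folds.
folds = [modulo_fold(i, 3) for i in range(7)]
print(folds)
```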
- foldCol
- Column with cross-validation fold index assignment per observation. - Scala default value: - null; Python default value:- None- Also available on the trained model. 
- forceLoadBalance
- Force extra load balancing to increase training speed for small datasets (to keep all cores busy). - Scala default value: - true; Python default value:- True- Also available on the trained model. 
- hidden
- Hidden layer sizes (e.g. [100, 100]). - Scala default value: - Array(200, 200); Python default value:- [200, 200]- Also available on the trained model. 
- hiddenDropoutRatios
- Hidden layer dropout ratios (can improve generalization), specify one value per hidden layer, defaults to 0.5. - Scala default value: - null; Python default value:- None- Also available on the trained model. 
- huberAlpha
- Desired quantile for Huber/M-regression (threshold between quadratic and linear loss, must be between 0 and 1). - Default value: - 0.9- Also available on the trained model. 
- ignoreConstCols
- Ignore constant columns. - Scala default value: - true; Python default value:- True- Also available on the trained model. 
- initialWeightDistribution
- Initial weight distribution. Possible values are - "UniformAdaptive",- "Uniform",- "Normal".- Default value: - "UniformAdaptive"- Also available on the trained model. 
- initialWeightScale
- Uniform: -value…value, Normal: stddev. - Default value: - 1.0- Also available on the trained model. 
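The description of initialWeightScale can be read as follows: for "Uniform" the scale bounds the range, for "Normal" it is the standard deviation. The sketch below illustrates that reading. The function name `sample_initial_weight` is hypothetical, the zero mean for "Normal" is an assumption, and "UniformAdaptive" is deliberately left out because this page does not describe how it scales.

```python
import random


def sample_initial_weight(distribution, scale, rng):
    """Draw one initial weight under the assumed meaning of initialWeightScale."""
    if distribution == "Uniform":
        return rng.uniform(-scale, scale)   # range: -scale ... scale
    if distribution == "Normal":
        return rng.gauss(0.0, scale)        # scale used as standard deviation
    raise ValueError("unsupported distribution in this sketch: " + distribution)


rng = random.Random(0)
uniform_draws = [sample_initial_weight("Uniform", 0.5, rng) for _ in range(1000)]
print(min(uniform_draws), max(uniform_draws))
```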
- inputDropoutRatio
- Input layer dropout ratio (can improve generalization, try 0.1 or 0.2). - Default value: - 0.0- Also available on the trained model. 
- keepCrossValidationFoldAssignment
- Whether to keep the cross-validation fold assignment. - Scala default value: - false; Python default value:- False- Also available on the trained model. 
- keepCrossValidationModels
- Whether to keep the cross-validation models. - Scala default value: - true; Python default value:- True- Also available on the trained model. 
- keepCrossValidationPredictions
- Whether to keep the predictions of the cross-validation models. - Scala default value: - false; Python default value:- False- Also available on the trained model. 
- l1
- L1 regularization (can add stability and improve generalization, causes many weights to become 0). - Default value: - 0.0- Also available on the trained model. 
- l2
- L2 regularization (can add stability and improve generalization, causes many weights to be small). - Default value: - 0.0- Also available on the trained model. 
- labelCol
- Response variable column. - Default value: - "label"- Also available on the trained model. 
- loss
- Loss function. Possible values are - "Automatic",- "Quadratic",- "CrossEntropy",- "ModifiedHuber",- "Huber",- "Absolute",- "Quantile".- Default value: - "Automatic"- Also available on the trained model. 
- maxAfterBalanceSize
- Maximum relative size of the training data after balancing class counts (can be less than 1.0). Requires balance_classes. - Scala default value: - 5.0f; Python default value:- 5.0- Also available on the trained model. 
- maxCategoricalFeatures
- Max. number of categorical features, enforced via hashing. #Experimental. - Default value: - 2147483647- Also available on the trained model. 
- maxRuntimeSecs
- Maximum allowed runtime in seconds for model training. Use 0 to disable. - Default value: - 0.0- Also available on the trained model. 
- maxW2
- Constraint for squared sum of incoming weights per unit (e.g. for Rectifier). - Scala default value: - 3.402823e38f; Python default value:- 3.402823E38- Also available on the trained model. 
- miniBatchSize
- Mini-batch size (smaller leads to better fit, larger can speed up and generalize better). - Default value: - 1- Also available on the trained model. 
- missingValuesHandling
- Handling of missing values. Either MeanImputation or Skip. Possible values are - "MeanImputation",- "Skip".- Default value: - "MeanImputation"- Also available on the trained model. 
- modelId
- Destination id for this model; auto-generated if not specified. - Scala default value: - null; Python default value:- None
- momentumRamp
- Number of training samples for which momentum increases. - Default value: - 1000000.0- Also available on the trained model. 
- momentumStable
- Final momentum after the ramp is over (try 0.99). - Default value: - 0.0- Also available on the trained model. 
- momentumStart
- Initial momentum at the beginning of training (try 0.5). - Default value: - 0.0- Also available on the trained model. 
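As a rough illustration of how momentumStart, momentumRamp and momentumStable interact, the sketch below assumes that the momentum rises linearly from momentumStart to momentumStable over momentumRamp training samples and stays at momentumStable afterwards. The linear shape and the function name `momentum_at` are assumptions made for this example.

```python
def momentum_at(samples_seen, momentum_start=0.0, momentum_stable=0.0,
                momentum_ramp=1_000_000.0):
    """Momentum after `samples_seen` training samples (assumed linear ramp)."""
    if samples_seen >= momentum_ramp:
        return momentum_stable
    fraction = samples_seen / momentum_ramp
    return momentum_start + (momentum_stable - momentum_start) * fraction


# Values suggested in the descriptions above: start at 0.5, end at 0.99.
for n in (0, 500_000, 1_000_000, 2_000_000):
    print(n, momentum_at(n, momentum_start=0.5, momentum_stable=0.99))
```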
- namedMojoOutputColumns
- If enabled, the MOJO output is stored in properly named columns instead of an array. - Scala default value: - true; Python default value:- True- Also available on the trained model. 
- nesterovAcceleratedGradient
- Use Nesterov accelerated gradient (recommended). - Scala default value: - true; Python default value:- True- Also available on the trained model. 
- nfolds
- Number of folds for K-fold cross-validation (0 to disable or >= 2). - Default value: - 0- Also available on the trained model. 
- offsetCol
- Offset column. This will be added to the combination of columns before applying the link function. - Scala default value: - null; Python default value:- None- Also available on the trained model. 
- overwriteWithBestModel
- If enabled, override the final model with the best model found during training. - Scala default value: - true; Python default value:- True- Also available on the trained model. 
- predictionCol
- Prediction column name - Default value: - "prediction"- Also available on the trained model. 
- quantileAlpha
- Desired quantile for Quantile regression, must be between 0 and 1. - Default value: - 0.5- Also available on the trained model. 
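For intuition about quantileAlpha, the following sketch shows the standard quantile ("pinball") loss for a single observation. Whether the library computes the "Quantile" loss in exactly this form is an assumption; the function name `quantile_loss` is hypothetical.

```python
def quantile_loss(actual, predicted, alpha=0.5):
    """Standard pinball loss: under-prediction costs alpha, over-prediction 1 - alpha."""
    error = actual - predicted
    if error >= 0:
        return alpha * error
    return (alpha - 1.0) * error


# With alpha = 0.9, predicting too low is penalized nine times more than too high.
print(quantile_loss(3.0, 1.0, alpha=0.9), quantile_loss(1.0, 3.0, alpha=0.9))
```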
- quietMode
- Enable quiet mode for less output to standard output. - Scala default value: - false; Python default value:- False- Also available on the trained model. 
- rate
- Learning rate (higher => less stable, lower => slower convergence). - Default value: - 0.005- Also available on the trained model. 
- rateAnnealing
- Learning rate annealing: rate / (1 + rate_annealing * samples). - Scala default value: - 1.0e-6; Python default value:- 1.0E-6- Also available on the trained model. 
- rateDecay
- Learning rate decay factor between layers (n-th layer: rate * rate_decay ^ (n - 1)). - Default value: - 1.0- Also available on the trained model. 
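The two formulas quoted for rateAnnealing and rateDecay can be evaluated directly. The sketch below only restates those formulas in code; the function names are hypothetical, and how the two effects are combined inside the library is not described on this page.

```python
def annealed_rate(rate, rate_annealing, samples):
    """rate / (1 + rate_annealing * samples), as given for rateAnnealing."""
    return rate / (1.0 + rate_annealing * samples)


def layer_rate(rate, rate_decay, n):
    """rate * rate_decay ^ (n - 1) for the n-th layer, as given for rateDecay."""
    return rate * rate_decay ** (n - 1)


# Default rate 0.005 and default annealing 1e-6 after one million samples.
print(annealed_rate(0.005, 1.0e-6, 1_000_000))
# A hypothetical decay of 0.5 applied to the third layer.
print(layer_rate(0.005, 0.5, 3))
```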
- regressionStop
- Stopping criterion for regression error (MSE) on training data (-1 to disable). - Scala default value: - 1.0e-6; Python default value:- 1.0E-6- Also available on the trained model. 
- replicateTrainingData
- Replicate the entire training dataset onto every node for faster training on small datasets. - Scala default value: - true; Python default value:- True- Also available on the trained model. 
- reproducible
- Force reproducibility on small data (will be slow - only uses 1 thread). - Scala default value: - false; Python default value:- False- Also available on the trained model. 
- rho
- Adaptive learning rate time decay factor (similarity to prior updates). - Default value: - 0.99- Also available on the trained model. 
- scoreDutyCycle
- Maximum duty cycle fraction for scoring (lower: more training, higher: more scoring). - Default value: - 0.1- Also available on the trained model. 
- scoreEachIteration
- Whether to score during each iteration of model training. - Scala default value: - false; Python default value:- False- Also available on the trained model. 
- scoreInterval
- Shortest time interval (in seconds) between model scoring. - Default value: - 5.0- Also available on the trained model. 
- scoreTrainingSamples
- Number of training set samples for scoring (0 for all). - Scala default value: - 10000L; Python default value:- 10000- Also available on the trained model. 
- scoreValidationSamples
- Number of validation set samples for scoring (0 for all). - Scala default value: - 0L; Python default value:- 0- Also available on the trained model. 
- scoreValidationSampling
- Method used to sample validation dataset for scoring. Possible values are - "Uniform",- "Stratified".- Default value: - "Uniform"- Also available on the trained model. 
- seed
- Seed for random numbers (affects sampling) - Note: only reproducible when running single threaded. - Scala default value: - -1L; Python default value:- -1- Also available on the trained model. 
- shuffleTrainingData
- Enable shuffling of training data (recommended if training data is replicated and train_samples_per_iteration is close to #nodes x #rows, or if using balance_classes). - Scala default value: - false; Python default value:- False- Also available on the trained model. 
- singleNodeMode
- Run on a single node for fine-tuning of model parameters. - Scala default value: - false; Python default value:- False- Also available on the trained model. 
- sparse
- Sparse data handling (more efficient for data with lots of 0 values). - Scala default value: - false; Python default value:- False- Also available on the trained model. 
- sparsityBeta
- Sparsity regularization. #Experimental. - Default value: - 0.0- Also available on the trained model. 
- splitRatio
- Accepts values in the range [0, 1.0], which determine how large a part of the dataset is used for training and how large a part for validation. For example, 0.8 -> 80% training, 20% validation. This parameter is ignored when validationDataFrame is set. - Default value: - 1.0
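The splitRatio example (0.8 -> 80% training, 20% validation) can be sketched as a simple row-count calculation. The helper `expected_split_sizes` is hypothetical, and the actual split may be randomized, so real counts could differ slightly from these expected values.

```python
def expected_split_sizes(total_rows, split_ratio):
    """Expected (training, validation) row counts for a given splitRatio."""
    if not 0.0 <= split_ratio <= 1.0:
        raise ValueError("splitRatio must be in the range [0, 1.0]")
    training = round(total_rows * split_ratio)
    return training, total_rows - training


print(expected_split_sizes(1000, 0.8))
# The default of 1.0 leaves no rows for validation.
print(expected_split_sizes(1000, 1.0))
```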
- standardize
- If enabled, automatically standardize the data. If disabled, the user must provide properly scaled input data. - Scala default value: - true; Python default value:- True- Also available on the trained model. 
- stoppingMetric
- Metric to use for early stopping (AUTO: logloss for classification, deviance for regression and anomaly_score for Isolation Forest). Note that custom and custom_increasing can only be used in GBM and DRF with the Python client. Possible values are - "AUTO",- "deviance",- "logloss",- "MSE",- "RMSE",- "MAE",- "RMSLE",- "AUC",- "AUCPR",- "lift_top_group",- "misclassification",- "mean_per_class_error",- "anomaly_score",- "custom",- "custom_increasing".- Default value: - "AUTO"- Also available on the trained model. 
- stoppingRounds
- Early stopping based on convergence of stopping_metric. Stop if simple moving average of length k of the stopping_metric does not improve for k:=stopping_rounds scoring events (0 to disable). - Default value: - 5- Also available on the trained model. 
- stoppingTolerance
- Relative tolerance for metric-based stopping criterion (stop if relative improvement is not at least this much). - Default value: - 0.0- Also available on the trained model. 
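The interplay of stoppingRounds and stoppingTolerance can be sketched as below. This is a simplified reading of the description, assuming a lower-is-better metric with positive values (such as logloss): moving averages of length k are computed over the last 2k scoring events, and training stops if none of the later averages beats the earliest one by at least the relative tolerance. The function name `should_stop` and the exact windowing are assumptions, not the library's implementation.

```python
def should_stop(history, stopping_rounds, stopping_tolerance=0.0):
    """Return True when a lower-is-better metric has stopped improving."""
    k = stopping_rounds
    if k == 0 or len(history) < 2 * k:
        return False  # disabled, or not enough scoring events yet
    recent = history[-2 * k:]
    # k + 1 simple moving averages of length k over the last 2k events.
    averages = [sum(recent[i:i + k]) / k for i in range(k + 1)]
    reference = averages[0]
    best_later = min(averages[1:])
    # Stop unless the best later average beats the reference by the tolerance.
    return best_later >= reference * (1.0 - stopping_tolerance)


print(should_stop([1.0, 0.8, 0.6, 0.4], stopping_rounds=2))       # still improving
print(should_stop([1.0, 0.8, 0.5, 0.5, 0.5, 0.5], stopping_rounds=2))  # flat
```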
- targetRatioCommToComp
- Target ratio of communication overhead to computation. Only for multi-node operation and train_samples_per_iteration = -2 (auto-tuning). - Default value: - 0.05- Also available on the trained model. 
- trainSamplesPerIteration
- Number of training samples (globally) per MapReduce iteration. Special values are 0: one epoch, -1: all available data (e.g., replicated training data), -2: automatic. - Scala default value: - -2L; Python default value:- -2- Also available on the trained model. 
- tweediePower
- Tweedie power for Tweedie regression, must be between 1 and 2. - Default value: - 1.5- Also available on the trained model. 
- useAllFactorLevels
- Use all factor levels of categorical variables. Otherwise, the first factor level is omitted (without loss of accuracy). Useful for variable importances and auto-enabled for autoencoder. - Scala default value: - true; Python default value:- True- Also available on the trained model. 
- validationDataFrame
- A data frame dedicated to validation of the trained model. If the parameter is not set, a validation frame is created via the ‘splitRatio’ parameter. - Scala default value: - null; Python default value:- None
- weightCol
- Column with observation weights. Giving some observation a weight of zero is equivalent to excluding it from the dataset; giving an observation a relative weight of 2 is equivalent to repeating that row twice. Negative weights are not allowed. Note: Weights are per-row observation weights and do not increase the size of the data frame. This is typically the number of times a row is repeated, but non-integer values are supported as well. During training, rows with higher weights matter more, due to the larger loss function pre-factor. - Scala default value: - null; Python default value:- None- Also available on the trained model. 
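The equivalences stated for weightCol (a weight of 2 behaves like a repeated row, a weight of 0 like a removed row) can be checked on a weighted mean squared error. This is an illustrative calculation only; the function `weighted_mse` is hypothetical and MSE is used just as an example loss.

```python
def weighted_mse(actual, predicted, weights):
    """Mean squared error where each row's squared error is scaled by its weight."""
    total = sum(w * (a - p) ** 2 for a, p, w in zip(actual, predicted, weights))
    return total / sum(weights)


# Weight 2 on the first row versus physically repeating that row.
with_weight = weighted_mse([1.0, 2.0], [0.0, 0.0], [2.0, 1.0])
with_repeat = weighted_mse([1.0, 1.0, 2.0], [0.0, 0.0, 0.0], [1.0, 1.0, 1.0])
print(with_weight, with_repeat)
```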
- withContributions
- Enables or disables generating a sub-column of detailedPredictionCol containing Shapley values. - Scala default value: - false; Python default value:- False- Also available on the trained model. 
- withLeafNodeAssignments
- Enables or disables computation of leaf node assignments. - Scala default value: - false; Python default value:- False- Also available on the trained model. 
- withStageResults
- Enables or disables computation of stage results. - Scala default value: - false; Python default value:- False- Also available on the trained model.
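A minimal sketch of how a handful of the parameters above might be passed to the estimator from Python. The parameter names are taken from this page; the values are purely illustrative. It assumes the Sparkling Water Python package (`pysparkling`) is installed and that parameters can be given as keyword arguments. The helper `build_estimator` and the DataFrame `training_df` mentioned in the comment are hypothetical.

```python
# Parameter names come from the list above; the values are illustrative only.
params = {
    "labelCol": "label",
    "hidden": [200, 200],
    "epochs": 10.0,
    "activation": "RectifierWithDropout",
    "hiddenDropoutRatios": [0.5, 0.5],  # one value per hidden layer
    "l1": 1.0e-5,
    "splitRatio": 0.8,
    "seed": 42,
}


def build_estimator(estimator_params):
    """Create an H2ODeepLearning estimator; needs Sparkling Water (pysparkling)."""
    from pysparkling.ml import H2ODeepLearning  # imported lazily on purpose
    return H2ODeepLearning(**estimator_params)


# With a running Spark session and H2OContext (not shown), training would be:
# model = build_estimator(params).fit(training_df)  # training_df is hypothetical
```

The same settings could instead be applied through the setter methods mentioned at the top of this page.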