Friday, December 25, 2020

Deep Learning Hyperparameter Tuning example

 https://www.kaggle.com/jamesleslie/titanic-neural-network-for-beginners


Notebook: titanic-neural-network-for-beginners


Summary:


The create_model function is the key building block of the whole workflow:

def create_model(lyrs=[8], act='linear', opt='Adam', dr=0.0):
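
A minimal sketch of what the full function body might look like, assuming a Keras Sequential binary classifier whose input dimension comes from the kernel's preprocessed X_train (the exact layer details are an assumption, not copied from the notebook):

from keras.models import Sequential
from keras.layers import Dense, Dropout

def create_model(lyrs=[8], act='linear', opt='Adam', dr=0.0):
    model = Sequential()
    # first hidden layer needs the input dimension (assumed to come from X_train)
    model.add(Dense(lyrs[0], input_dim=X_train.shape[1], activation=act))
    # any additional hidden layers
    for units in lyrs[1:]:
        model.add(Dense(units, activation=act))
    model.add(Dropout(dr))
    # sigmoid output for the binary survived/not-survived target
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
    return model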


GridSearchCV is used to find the best value for each hyperparameter.

Hyperparameters tuned: batch_size, epochs, optimizer (opt), hidden-layer sizes (lyrs), and dropout rate (dr)



Hyperparameter Tuning


GridSearchCV - batch size and epochs

batch_size = [16, 32, 64]

epochs = [50, 100]


Best: 0.822671 using {'batch_size': 32, 'epochs': 50}
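
A sketch of how this search is wired up, assuming the scikit-learn wrapper for Keras that was current at the time (keras.wrappers.scikit_learn.KerasClassifier) and the kernel's preprocessed X_train/y_train:

from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV

# wrap create_model so GridSearchCV can clone and re-fit it per candidate
model = KerasClassifier(build_fn=create_model, verbose=0)

param_grid = {'batch_size': [16, 32, 64], 'epochs': [50, 100]}
grid = GridSearchCV(estimator=model, param_grid=param_grid, cv=3, verbose=2)
grid_result = grid.fit(X_train, y_train)

print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))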

===================================================

GridSearchCV - Optimization Algorithm

optimizer = ['SGD', 'RMSprop', 'Adagrad', 'Adadelta', 'Adam', 'Nadam']


Best: 0.822671 using {'opt': 'Adam'}


===================================================


GridSearchCV - Hidden neurons

layers = [[8], [10], [10, 5], [12, 6], [12, 8, 4]]


Best: 0.822671 using {'lyrs': [8]}


===================================================



GridSearchCV - Dropout

drops = [0.0, 0.01, 0.05, 0.1, 0.2, 0.5]


Best: 0.824916 using {'dr': 0.2}
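
The optimizer, hidden-neuron, and dropout searches above all follow the same pattern: because opt, lyrs, and dr are keyword arguments of create_model, the wrapper forwards whatever appears in param_grid to the model builder. A sketch with batch size and epochs pinned to the earlier winners (shown for dropout; the other two searches only swap the param_grid):

model = KerasClassifier(build_fn=create_model, epochs=50, batch_size=32, verbose=0)

# any create_model keyword argument can be searched this way,
# e.g. {'opt': [...]} or {'lyrs': [...]} instead of {'dr': [...]}
param_grid = {'dr': [0.0, 0.01, 0.05, 0.1, 0.2, 0.5]}
grid = GridSearchCV(estimator=model, param_grid=param_grid, cv=3, verbose=2)
grid_result = grid.fit(X_train, y_train)

print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))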


===================================================


model = create_model(lyrs=[8], dr=0.2)


training = model.fit(X_train, y_train, epochs=50, batch_size=32,
                     validation_split=0.2, verbose=0)
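
The History object returned by fit records per-epoch metrics, which is the natural way to look into question (a) below. The key names here ('acc'/'val_acc') match older Keras versions; newer releases use 'accuracy'/'val_accuracy':

import matplotlib.pyplot as plt

# compare training vs. validation accuracy across the 50 epochs
plt.plot(training.history['acc'], label='train')
plt.plot(training.history['val_acc'], label='validation')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()
plt.show()

print("final val_acc: %.2f%%" % (100 * training.history['val_acc'][-1]))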




Still have a few questions:


a. The initial training run reported val_acc: 86.53%, whereas the model trained at the end reported only acc: 83.16%. Why does the later run come out lower?

b. The notebook fixes batch size and epochs first and then searches each remaining hyperparameter for its best value independently, rather than tuning one hyperparameter and carrying its best value forward into the next search. Why search them independently instead of cumulatively (or jointly)?



Monday, December 7, 2020

Evolution of XGBoost Algorithm from Decision Trees

Credit to 

 https://towardsdatascience.com/https-medium-com-vishalmorde-xgboost-algorithm-long-she-may-rein-edd9f99be63d




XGBoost is a decision-tree-based ensemble Machine Learning algorithm that uses a gradient boosting framework. In prediction problems involving unstructured data (images, text, etc.), artificial neural networks tend to outperform all other algorithms or frameworks. However, when it comes to small-to-medium structured/tabular data, decision-tree-based algorithms are considered best-in-class right now.
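
A minimal sketch of that tabular use case, assuming the xgboost package and using sklearn's built-in breast-cancer dataset purely for illustration:

from xgboost import XGBClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# small structured/tabular dataset (illustrative stand-in), the setting where tree ensembles shine
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# gradient-boosted trees: each new tree fits the errors of the current ensemble
clf = XGBClassifier(n_estimators=200, max_depth=3, learning_rate=0.1)
clf.fit(X_train, y_train)

print("test accuracy:", accuracy_score(y_test, clf.predict(X_test)))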




Machine Learning Validation Techniques

Credit to 

 https://towardsdatascience.com/validating-your-machine-learning-model-25b4c8643fb7


The following methods for validation will be demonstrated (a minimal k-fold sketch follows the list):

  • k-Fold Cross-Validation
  • Leave-one-out Cross-Validation
  • Leave-one-group-out Cross-Validation
  • Nested Cross-Validation
  • Time-series Cross-Validation
  • Wilcoxon signed-rank test
  • McNemar’s test
  • 5x2CV paired t-test
  • 5x2CV combined F test
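
As a taste of the first item, a k-fold cross-validation sketch with scikit-learn (the model and dataset here are placeholders, not taken from the article):

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

# placeholder data and model, just to show the mechanics
X, y = load_breast_cancer(return_X_y=True)

# 5 folds: each fold serves as the validation set exactly once
kf = KFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(LogisticRegression(max_iter=5000), X, y, cv=kf)

print("fold accuracies:", scores)
print("mean +/- std: %.3f +/- %.3f" % (scores.mean(), scores.std()))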