Keras is a high-level deep learning library written in Python. It runs on top of a backend library such as TensorFlow (or Theano, CNTK, etc.), which handles the low-level computations like tensor multiplication, convolutions and other operations.
Keras has many advantages: it is easy to use once you are familiar with it, and it lets you build a neural network model in a few lines of code. It is well supported by the community, it can run on top of several backend libraries, as mentioned above, and it can run on more than one GPU.
How to use Keras
To install Keras, you need to install the backend first. In this example, we are going to install TensorFlow, as it is the most popular and most widely used one.
pip install tensorflow
By executing the command above, we install the CPU version of the library. However, we can instead install a version that can run on both the CPU and the GPU by executing the command below.
pip install tensorflow-gpu
Once the backend library is installed, you can install Keras by executing the following command.
pip install keras
Installing the backend can sometimes be tricky, so I suggest you use an environment manager like Anaconda, or create a virtual environment in your project that satisfies the requirements of the backend library.
The main structure in Keras is the Model. There are different types of models, but the simplest is the Sequential model.
You can create one by instantiating the model and adding layers to it, as shown in the code below.
from keras.models import Sequential
from keras.layers import Dense
model = Sequential()
model.add(Dense(64, activation='relu', input_shape=(32,)))
model.add(Dense(125, activation='relu'))
model.add(Dense(512, activation='relu'))
model.add(Dense(10, activation='softmax'))
With the code above, we import the Sequential model and the Dense layer. After that, we create an instance of the model and add layers to it.
The first three layers are fully connected and have 64, 125 and 512 neurons respectively. All of them use the ReLU activation function. The input_shape=(32,) argument on the first layer tells the model that each input sample has 32 features.
If you are not familiar with activation functions, you can read our post on Artificial Neural Networks.
The last layer is the output layer. It has 10 neurons and uses the softmax activation function.
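To get a feel for the size of the model above, its trainable parameters can be counted by hand. The sketch below assumes each Dense layer holds one weight per input-output pair plus one bias per output; the helper name `dense_param_count` is made up for this illustration and is not part of Keras.

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# Layer sizes of the model above: 32 inputs, then 64, 125, 512 and 10 units.
layer_sizes = [32, 64, 125, 512, 10]

per_layer = [
    dense_param_count(n_in, n_out)
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
total_params = sum(per_layer)

print(per_layer)     # [2112, 8125, 64512, 5130]
print(total_params)  # 79879
```

Most of the parameters sit between the 125-unit and 512-unit layers, which shows how quickly wide fully connected layers grow.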
Just as there are several types of models, there are also several types of layers. Here we are going to show you the three most popular ones.
Dense layers
The only type of layer we’ve seen so far is the Dense layer. Layers of this type are fully connected. You can create a Dense layer by writing the code below:
Dense(125, activation='relu')
The first argument is the dimension of the output of this layer (the number of neurons), and the second is the activation function.
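What a fully connected layer computes can be sketched in plain NumPy, assuming the usual formulation: multiply the input by a weight matrix, add a bias, and apply the activation. The names `relu` and `dense_forward` are illustrative helpers, not Keras functions, and the random weights stand in for trained ones.

```python
import numpy as np

def relu(z):
    """Replace negative values with zero."""
    return np.maximum(z, 0.0)

def dense_forward(x, weights, bias, activation=relu):
    """Compute activation(x @ weights + bias) for a batch of inputs."""
    return activation(x @ weights + bias)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 64))          # batch of 4 samples, 64 features each
weights = rng.normal(size=(64, 125))  # 64 inputs -> 125 units
bias = np.zeros(125)

out = dense_forward(x, weights, bias)
print(out.shape)  # (4, 125)
```

The second dimension of the output matches the first argument passed to Dense, here 125.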
Convolutional layers
Layers of this type perform a convolution, which produces feature maps as output. The feature maps characterize the input, which is usually an image. You can create convolutional layers by writing the code below:
Conv1D(32, kernel_size=3, activation='relu')  # one-dimensional convolution
Conv2D(32, kernel_size=3, activation='relu')  # two-dimensional convolution
Conv3D(32, kernel_size=3, activation='relu')  # three-dimensional convolution
The first parameter is the number of output filters of the layer, the second is the size of the convolution window, and the third is the activation function.
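To make the idea of a convolution window concrete, here is a minimal NumPy sketch of a single one-dimensional filter with a window of size 3 and no padding. It is a simplified illustration, not the Keras implementation: a real Conv1D layer learns many such filters and also handles channels, strides and bias. The function name `conv1d_valid` and the sample values are made up for this example.

```python
import numpy as np

def conv1d_valid(signal, kernel):
    """Slide `kernel` over `signal` without padding and take dot products."""
    k = len(kernel)
    n_out = len(signal) - k + 1
    return np.array([np.dot(signal[i:i + k], kernel) for i in range(n_out)])

signal = np.array([1.0, 2.0, 4.0, 7.0, 11.0])
kernel = np.array([-1.0, 0.0, 1.0])  # window of size 3, responds to increases

feature_map = conv1d_valid(signal, kernel)
print(feature_map)  # [3. 5. 7.]
```

Without padding, an input of length 5 and a window of size 3 give a feature map of length 5 - 3 + 1 = 3.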
Recurrent layers
Recurrent Neural Networks (RNNs) are a type of neural network designed to learn from previous events: at each step, they pass information on to the next step.
In Keras, we have four types of recurrent layers, which you can create by writing the code below.
RNN(cell)
SimpleRNN(64, activation='relu')
LSTM(64, activation='relu')
GRU(64, activation='relu')
The first argument of the RNN layer is the cell, a recurrent cell instance that defines the computation done at each step. The other three layers share the same parameters: the first is the number of units, and the second is the activation function.
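The way a recurrent layer passes information from one step to the next can be sketched in NumPy. This is a simplified version of a simple recurrent cell, assuming the common update rule where the new state is an activation of the current input plus the previous state; the function name, the weight names and the random weights are illustrative only.

```python
import numpy as np

def simple_rnn_forward(inputs, w_in, w_rec, bias, activation=np.tanh):
    """Run a simple recurrent cell over a sequence and return the last state.

    inputs has shape (timesteps, n_features).
    """
    state = np.zeros(w_rec.shape[0])
    for x_t in inputs:
        # The new state mixes the current input with the previous state.
        state = activation(x_t @ w_in + state @ w_rec + bias)
    return state

rng = np.random.default_rng(0)
sequence = rng.normal(size=(5, 8))       # 5 timesteps, 8 features each
w_in = rng.normal(size=(8, 64)) * 0.1    # input -> 64 units
w_rec = rng.normal(size=(64, 64)) * 0.1  # previous state -> 64 units
bias = np.zeros(64)

last_state = simple_rnn_forward(sequence, w_in, w_rec, bias)
print(last_state.shape)  # (64,)
```

The length of the state vector corresponds to the number of units, the first argument of SimpleRNN, LSTM and GRU.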
Compiling, training and evaluation of the model
The first step after creating the model is compiling it. By compiling your model, you are configuring the loss function and the optimizer used in the training.
model.compile(
    loss='binary_crossentropy',
    optimizer='adam',
    metrics=['accuracy']
)
With the code above, we compile the model. The first argument is the loss function, the second is the optimizer, and the third is the list of metrics we want to track in order to judge the quality of the classification.
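To see what a loss function measures, here is a minimal NumPy sketch of binary cross-entropy, assuming the standard definition averaged over all elements. It is an illustration rather than the Keras implementation, and the sample predictions are invented.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all elements."""
    p = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    losses = -(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))
    return float(np.mean(losses))

y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
confident = np.array([[0.9, 0.05, 0.05],
                      [0.1, 0.8, 0.1]])
unsure = np.full((2, 3), 1.0 / 3.0)

loss_confident = binary_crossentropy(y_true, confident)
loss_unsure = binary_crossentropy(y_true, unsure)
print(loss_confident < loss_unsure)  # True
```

Predictions that put high probability on the correct class give a lower loss, and lowering that value is what the optimizer works towards during training.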
After compiling the model, we need to train it. We can do the training with the code below.
model.fit(X_train, Y_train, batch_size=32, epochs=10, verbose=1)
The first argument is the input to the network, the second is the expected output, the third is the batch size, and the fourth is the number of epochs. The fifth argument, verbose, controls how the training progress is displayed.
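The relationship between the batch size and the number of epochs can be sketched in plain Python. The example assumes a hypothetical training set of 100 samples; `iter_batches` is an illustrative helper, not a Keras function.

```python
import math

def iter_batches(n_samples, batch_size):
    """Yield (start, stop) index pairs that cover n_samples in order."""
    for start in range(0, n_samples, batch_size):
        yield start, min(start + batch_size, n_samples)

n_samples, batch_size, epochs = 100, 32, 10

batches = list(iter_batches(n_samples, batch_size))
steps_per_epoch = math.ceil(n_samples / batch_size)

print(batches)                   # [(0, 32), (32, 64), (64, 96), (96, 100)]
print(steps_per_epoch)           # 4
print(steps_per_epoch * epochs)  # 40
```

Each batch leads to one update of the weights, so in this sketch 10 epochs amount to 40 updates. The last batch of an epoch may be smaller than the others.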
After you’ve trained the model, you might want to save it. You can do it by writing the code below.
model.save(filepath)
This saves the model to an HDF5 file, which contains the architecture of the model along with its weights and training configuration. If we want to use the model later, we just need to load it with the code below.
keras.models.load_model(filepath)
If you don’t want to save the whole model, you can save only the weights of the model.
model.save_weights('my_model_weights.h5')
Then you can load the weights when you need them.
model.load_weights('my_model_weights.h5')
To get predictions for the test set, use the predict() function. You can do it by writing the code below.
model.predict(X_test, batch_size=32, verbose=1)
The first argument is the test set, the second is the batch size, and the third controls whether the progress is shown.
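For a classifier with a softmax output, predictions come back as one row of class probabilities per sample. A common way to turn them into class labels is to take the index of the largest value in each row, as in this small NumPy sketch. The probabilities are invented for illustration and do not come from a trained model.

```python
import numpy as np

# Hypothetical softmax outputs for three samples and three classes.
probabilities = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
])

# Index of the most likely class for each sample.
predicted_classes = np.argmax(probabilities, axis=1)
print(predicted_classes)  # [0 2 1]
```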
To compute the loss and the metrics of the model for a given input and its expected output, use the evaluate() function. You can do it by writing the code below.
model.evaluate(X_test, Y_test, batch_size=32, verbose=1)
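The accuracy metric reported by an evaluation can be illustrated with a short NumPy sketch. It assumes one-hot labels and counts a prediction as correct when the highest-scoring class matches the label; the helper name `categorical_accuracy` and the sample data are made up for this example.

```python
import numpy as np

def categorical_accuracy(y_true_one_hot, y_pred_probs):
    """Fraction of samples whose highest-scoring class matches the label."""
    true_classes = np.argmax(y_true_one_hot, axis=1)
    pred_classes = np.argmax(y_pred_probs, axis=1)
    return float(np.mean(true_classes == pred_classes))

y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]])
y_pred = np.array([
    [0.8, 0.1, 0.1],  # correct
    [0.3, 0.4, 0.3],  # correct
    [0.5, 0.2, 0.3],  # wrong: predicts class 0, label is class 2
    [0.1, 0.7, 0.2],  # correct
])

print(categorical_accuracy(y_true, y_pred))  # 0.75
```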
Working with a real-world problem
We are going to create a model with fully connected layers to classify wine based on its quality. The dataset we are going to use comes from the UCI Machine Learning Repository. We will need the winequality-white.csv file.
This file contains 4898 instances with 12 characteristics each. Eleven of those twelve characteristics are objective tests, and the twelfth is the quality value, which is the median of at least three grades from wine experts (grades range from 0 (very bad) to 10 (very good)).
The expected output is a number between 0 and 10. Because we want to classify the wine, however, we need to change the expected output. We are going to create three classes: bad (grade: 0 to 5), normal (grade: 6) and great (grade: 7 to 10).
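The mapping from grades to the three classes described above can be sketched on its own as a small function that returns a one-hot vector. The name `grade_to_one_hot` is an illustrative choice.

```python
import numpy as np

def grade_to_one_hot(grade):
    """Map a 0-10 quality grade to a one-hot vector: bad, normal, great."""
    if grade <= 5:
        return np.array([1, 0, 0])  # bad
    if grade == 6:
        return np.array([0, 1, 0])  # normal
    return np.array([0, 0, 1])      # great

for grade in (3, 6, 8):
    print(grade, grade_to_one_hot(grade))
```

One-hot vectors give the network one output per class, which matches a softmax output layer with three neurons.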
import numpy as np
import pandas as pd
from keras import Sequential
from keras.layers import Dense
from sklearn.model_selection import train_test_split
# Load the dataset; the file uses ';' as the separator.
df = pd.read_csv('winequality\\winequality\\winequality-white.csv', sep=';', header=0)
# Map each grade to a one-hot class vector: bad (<=5), normal (6), great (>=7).
df.quality = df.quality.map(lambda x: np.array([1,0,0]) if x<=5 else (np.array([0,1,0]) if x==6 else np.array([0,0,1])))
output = np.array([x for x in df.quality.values])
# The remaining 11 columns are the input features.
data = df.drop(columns=['quality']).values
# Three hidden layers of 64 neurons each, followed by a 3-class softmax output.
classifier = Sequential()
classifier.add(Dense(64, activation='relu', input_shape=(11,)))
classifier.add(Dense(64, activation='relu'))
classifier.add(Dense(64, activation='relu'))
classifier.add(Dense(3, activation='softmax'))
# 'binary_crossentropy' is kept because the results below were produced with it;
# for one-hot, mutually exclusive classes, 'categorical_crossentropy' is the usual choice.
classifier.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# Hold out 20% of the data for testing.
x_train, x_test, y_train, y_test = train_test_split(data, output, test_size=0.2, random_state=10)
classifier.fit(x_train, y_train, steps_per_epoch=500, epochs=20,
               validation_split=0.1, validation_steps=10)
classifier.evaluate(x_test, y_test)
500/500 [==============================] - 6s 12ms/step - loss: 0.4808 - acc: 0.7505 - val_loss: 0.6125 - val_acc: 0.7236
This is the output of the last training epoch. After 20 epochs and 500 steps per epoch, we have a training accuracy of 75.05% and a validation accuracy of 72.36%.
Now we are going to increase the number of steps per epoch to 1000 to see whether that increases the accuracy.
1000/1000 [==============================] - 16s 16ms/step - loss: 0.4009 - acc: 0.8068 - val_loss: 0.7447 - val_acc: 0.7509
This is the output of the last training epoch. After 20 epochs and 1000 steps per epoch, we have a training accuracy of 80.68%.
Increasing the number of steps raised the accuracy. Now we are going to increase the number of epochs to 40 and set the steps per epoch back to 500, to see what we get.
500/500 [==============================] - 6s 12ms/step - loss: 0.4291 - acc: 0.7889 - val_loss: 0.6862 - val_acc: 0.7423
This is the output of the last training epoch. After 40 epochs and 500 steps per epoch, we have a training accuracy of 78.89%.
Finally, we increase the number of steps per epoch to 1000 again, this time with 40 epochs, to see whether we get the highest accuracy with both the epochs and the steps per epoch increased from their original values.
1000/1000 [==============================] - 16s 16ms/step - loss: 0.3921 - acc: 0.8136 - val_loss: 0.8042 - val_acc: 0.7347
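This is the output of the last training epoch. After 40 epochs and 1000 steps per epoch, we have a training accuracy of 81.36%.
With both parameters increased, we got the highest training accuracy. Note, however, that the validation accuracy (73.47%) is lower than in the run with 20 epochs and 1000 steps (75.09%), and the validation loss is the highest of the four runs, which suggests that the model is starting to overfit.
Getting good accuracy is usually a matter of tuning the parameters so that you reach the highest possible accuracy while avoiding overfitting.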
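One simple way to compare the four runs is to look at the gap between training and validation accuracy. The sketch below uses the final-epoch numbers from the logs above; treating the size of this gap as a rough overfitting signal is a rule of thumb, not a formal test.

```python
# Final-epoch numbers copied from the four training logs above.
runs = [
    {"epochs": 20, "steps": 500,  "acc": 0.7505, "val_acc": 0.7236},
    {"epochs": 20, "steps": 1000, "acc": 0.8068, "val_acc": 0.7509},
    {"epochs": 40, "steps": 500,  "acc": 0.7889, "val_acc": 0.7423},
    {"epochs": 40, "steps": 1000, "acc": 0.8136, "val_acc": 0.7347},
]

for run in runs:
    # A growing gap between training and validation accuracy hints at overfitting.
    run["gap"] = round(run["acc"] - run["val_acc"], 4)

best_train = max(runs, key=lambda r: r["acc"])
best_val = max(runs, key=lambda r: r["val_acc"])

print(best_train["epochs"], best_train["steps"], best_train["gap"])  # 40 1000 0.0789
print(best_val["epochs"], best_val["steps"], best_val["gap"])        # 20 1000 0.0559
```

The run with the highest training accuracy is not the one with the highest validation accuracy, which is why it is worth watching both numbers when tuning.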
Conclusion
Deep Learning is a vital part of Artificial Intelligence today. It has helped improve existing algorithms, so that we get better results when using them.
Deep Learning is used in almost every field of Artificial Intelligence, such as Computer Vision, Natural Language Processing and Reinforcement Learning (you can check out OpenAI and DeepMind).
We hope we have sparked some interest in you, so that you will learn more about Deep Learning and how you can combine it with other algorithms.
As with every post we do, we encourage you to continue learning, trying and creating.