Before you learn about deep learning, it helps to know what intelligence is. Intelligence is the process of reaching a decision by processing information, which is what people do. Artificial intelligence is the artificial form of this human ability to reach a decision by thinking. Machine learning is a branch of artificial intelligence.

Machine learning is learning from examples, which is how we humans learn. In an ordinary program, we state explicitly what needs to be done. In machine learning, we don't say exactly what to do: we give the program some data, and it works out the output from the data on its own. Machine learning is the ability of a program to produce answers without being given such explicit instructions.

deep learning
Deep learning is a subfield of machine learning in which neural networks do the learning.

neural network
Artificial neurons originate from mimicking the neurons in our nervous system, and many neurons together make up a neural network. A neural net finds the pattern in the data; after that, it can predict the output for new data. Wikipedia has a nice image of the relationship between the three (artificial intelligence, machine learning, and deep learning).

single neurons
Each neuron performs a very small computation: given an input, it does a small calculation and produces an output.

The connection from the input to the neuron has a weight, which we can call w. A neuron learns essentially by changing the value of this weight. When it is trained on a lot of data, the neuron can set the value of this weight on its own.

Suppose our neuron initially assumes a weight of 3, and I give it the input 2. The neuron's job is a small calculation: multiply the input by the weight. So 2*3 gives 6: if the value of x is 2, then the value of y is 6. The neuron changes the value of this weight and so learns what output to give for an input. This gives us the equation y = wx. We are already familiar with y = mx, the equation of a straight line.

The straight line y = wx always passes through the origin (0, 0). If the pattern in our data is shifted away from the origin, this equation (i.e. the neuron) will not be of much use. So we have to add a constant to it, which we call the bias, or b. We add a bias to each neuron. It is also a weight, but one through which no input data passes.

The equation becomes y = wx + b. Every neuron calculates just that much: it multiplies the input by the weight and adds the bias value to produce the output.
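As a minimal sketch of that calculation, a single-input neuron can be written as a plain Python function. The name `neuron` and the bias value 1.5 are illustrative assumptions, not part of any library.

```python
def neuron(x, w, b=0.0):
    """A single-input neuron: multiply the input by the weight, then add the bias."""
    return w * x + b


# The example from the text: weight 3, input 2, no bias.
print(neuron(2, w=3))          # 6.0

# Without a bias, an input of 0 always gives 0 (the line passes through the origin).
print(neuron(0, w=3))          # 0.0

# With a bias (1.5 is an arbitrary illustrative value), the line is shifted.
print(neuron(0, w=3, b=1.5))   # 1.5
```

Training a neuron means finding values of `w` and `b` that make these outputs match the data.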

We can even create a machine learning model with a single neuron. More ideas will be needed, and we will learn them a little later. Before that, let's see how to create a machine learning model with a single neuron and a single input.
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np

# training data
celsius_as_training_data = np.array([-40, -10, 0, 8, 15, 22, 38], dtype=float)
fahrenheit_as_labels = np.array([-40, 14, 32, 46.4, 59, 71.6, 100.4], dtype=float)

# model creation: one Dense layer with a single neuron and a single input
model = keras.Sequential([
    layers.Dense(units=1, input_shape=[1])
])

# model compilation
model.compile(loss='mean_squared_error',
              optimizer=keras.optimizers.Adam(0.1))

# model training
model.fit(celsius_as_training_data, fahrenheit_as_labels, epochs=500, verbose=False)
print("Finished training the model")

# predict
# f = c × 1.8 + 32 || for 100, f = 212
print(model.predict(np.array([100.0])))

There is a detailed article about the example above: neural network training and prediction using TensorFlow

more than one input
A neuron can have more than one input. It then multiplies each input by the weight of its connection and adds up all the results.

For example, if we pass three inputs to our neuron, the output will be y = w0*x0 + w1*x1 + w2*x2 + b. It calculates the same way for any number of inputs.
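The weighted sum above can be sketched in a few lines of plain Python. The function name and the input, weight, and bias values here are made up for illustration.

```python
def neuron(inputs, weights, bias):
    """Multi-input neuron: the sum of weight * input over all inputs, plus the bias."""
    total = bias
    for x, w in zip(inputs, weights):
        total += w * x
    return total


# Three inputs with illustrative weights and bias:
# 0.5*1.0 + (-1.0)*2.0 + 2.0*3.0 + 0.25
y = neuron(inputs=[1.0, 2.0, 3.0], weights=[0.5, -1.0, 2.0], bias=0.25)
print(y)  # 4.75
```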

layer
In neural networks, neurons are usually arranged in layers: an input layer, an output layer, hidden layers, and so on. A network can have more than one hidden layer. Consider the network in the picture below:

(image: a neural network with an input layer, two hidden layers, and an output layer)

It has two hidden layers in addition to the input layer and the output layer. Hidden layers are often dense (fully connected) layers. These layers are usually used to transform the data and extract features from it.
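To make the idea of layers concrete, here is a rough sketch of data flowing through an input layer, two hidden layers, and an output layer. The helper `dense` and all weights are hypothetical, chosen only so the arithmetic is easy to follow, and activation functions are left out on purpose.

```python
def dense(inputs, weights, biases):
    """One dense layer: every neuron receives every input.

    weights[j] is the list of weights of neuron j, biases[j] is its bias.
    Returns one output value per neuron.
    """
    return [
        sum(w * x for w, x in zip(neuron_weights, inputs)) + b
        for neuron_weights, b in zip(weights, biases)
    ]


x = [1.0, 2.0]  # input layer: two features

# first hidden layer: 3 neurons, each with 2 weights
h1 = dense(x, weights=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], biases=[0.0, 0.0, 0.0])

# second hidden layer: 2 neurons, each with 3 weights
h2 = dense(h1, weights=[[1.0, 1.0, 1.0], [0.5, 0.5, 0.0]], biases=[0.0, 1.0])

# output layer: 1 neuron with 2 weights
y = dense(h2, weights=[[1.0, -1.0]], biases=[0.0])

print(h1)  # [1.0, 2.0, 3.0]
print(h2)  # [6.0, 2.5]
print(y)   # [3.5]
```

The output of each layer becomes the input of the next one, which is what stacking `Dense` layers in a Keras `Sequential` model does.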

activation function
In some cases, the data we work on contains non-linear relationships. To find the pattern in data with non-linear relationships, activation functions are needed.

An activation function is a function applied to the output of each layer. One of the most common is ReLU, the rectified linear unit. If the input is positive, ReLU returns the input unchanged; otherwise it returns 0. Besides ReLU, there are other activation functions for different problems.
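ReLU is simple enough to write by hand. This is only a sketch for a single number; libraries such as Keras apply the same rule to whole arrays.

```python
def relu(x):
    """Rectified linear unit: positive inputs pass through, everything else becomes 0."""
    return max(0.0, x)


print([relu(v) for v in [-2.0, -0.5, 0.0, 1.5, 3.0]])  # [0.0, 0.0, 0.0, 1.5, 3.0]
```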

loss function
The job of the loss function is to measure the difference between the value the model predicts and the target value.

Different problems call for different loss functions. For example, we can use the mean absolute error loss function for regression problems.
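As an illustration, the two losses used in this post, mean absolute error and mean squared error, can be computed by hand as follows. The target and prediction values are made up.

```python
def mean_absolute_error(targets, predictions):
    """Average of |target - prediction| over all examples."""
    return sum(abs(t - p) for t, p in zip(targets, predictions)) / len(targets)


def mean_squared_error(targets, predictions):
    """Average of (target - prediction)^2; large mistakes are penalised more."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


targets = [1.0, 0.0, 1.0, 1.0]       # illustrative target values
predictions = [0.5, 0.0, 1.0, 0.0]   # illustrative model outputs

print(mean_absolute_error(targets, predictions))  # 0.375
print(mean_squared_error(targets, predictions))   # 0.3125
```

A smaller value means the predictions are closer to the targets.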

optimizer
We know that a neural net learns by changing its weights. These weight changes are made by optimizer algorithms, whose goal is to reduce the loss. Almost all of the optimization algorithms used in deep learning belong to the stochastic gradient descent family. These algorithms work at every step of a neural net's training. The tasks the optimization algorithm performs at each step are:

1. Pass the training data through the network to get predictions.
2. Use the loss function to compute the loss between the predicted values and the target values.
3. Change the weight values to reduce the loss.

Steps one to three are run over and over again until the loss is minimal.
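The three steps can be sketched for the single neuron y = wx + b on the Celsius/Fahrenheit data from earlier. This uses plain full-batch gradient descent with mean squared error, not the Adam optimizer from the Keras example; the starting values, learning rate, and step count are assumptions chosen so this unscaled data converges.

```python
celsius = [-40.0, -10.0, 0.0, 8.0, 15.0, 22.0, 38.0]
fahrenheit = [-40.0, 14.0, 32.0, 46.4, 59.0, 71.6, 100.4]

w, b = 0.0, 0.0          # assumed starting weight and bias
learning_rate = 0.001    # assumed; kept small because the inputs are not scaled
n = len(celsius)

for step in range(20000):
    # 1. pass the training data through the neuron to get predictions
    predictions = [w * x + b for x in celsius]

    # 2. measure the loss (mean squared error) against the target values
    errors = [p - t for p, t in zip(predictions, fahrenheit)]
    loss = sum(e * e for e in errors) / n

    # 3. change the weight and bias in the direction that reduces the loss
    grad_w = 2 * sum(e * x for e, x in zip(errors, celsius)) / n
    grad_b = 2 * sum(errors) / n
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(w, b)          # should end up close to 1.8 and 32
print(w * 100 + b)   # prediction for 100 degrees Celsius, close to 212
```

Optimizers such as Adam follow the same loop but choose the size of each weight change more cleverly.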

batch size
The batch size is the number of training examples the model is trained on at each step.

epoch
An epoch is one complete pass of training over all the training data. The higher the number of epochs, the more times the model trains on the training data.
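How batch size and epochs relate can be shown with a small sketch. The 300 examples are a hypothetical dataset size, and `batches` is an illustrative helper, not a Keras function.

```python
import math


def batches(data, batch_size):
    """Yield the data in consecutive chunks of at most batch_size items."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]


data = list(range(300))   # 300 hypothetical training examples
batch_size = 128
epochs = 50

# one epoch = one pass over all the data, split into batches
batch_sizes = [len(batch) for batch in batches(data, batch_size)]
steps_per_epoch = math.ceil(len(data) / batch_size)
total_weight_updates = steps_per_epoch * epochs

print(batch_sizes)           # [128, 128, 44]
print(steps_per_epoch)       # 3
print(total_weight_updates)  # 150
```

With these numbers the weights are updated three times per epoch, and the last batch is smaller because 300 is not a multiple of 128.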

The above covers the theory of these topics. How they work in the background and how they work mathematically are not discussed. The theory is still needed, even though we don't write the code for neurons, loss functions, optimizers, etc. ourselves: many machine learning libraries already contain that code, and we just use it. But if we know the code without knowing the theory, we will not understand what each piece does or why the code was written.

Using what we know so far, let's train a model. For this, we will use the Heart Attack Analysis & Prediction dataset. I'll download the dataset first. You can download the data by clicking the Data tab on the dataset's Kaggle page.

We can write and run the code wherever we like. I'm using Google Colab.

Open a notebook in Colab, then upload the data first.

To do that, click the file icon in Colab; there you will get the option to upload a file. Upload the dataset. Then, to read the dataset from the file:

import pandas as pd

# reading data from file
dataset = pd.read_csv('heart.csv')

If all goes well, you can check out the data info:

# get dataset info
dataset.info()

If you look at the dataset info, you will see that there are 14 columns in total. The output column is the target value. This target column needs to be removed from the training data before training the model.

# labels or target value
labels = dataset['output']

# removing target data from training data
training_data = dataset.drop(columns=['output'])

# get training data info
training_data.info()

If we look at the info of the training data, we will see that there are 13 columns in total, because we have removed the output column.

If you look at typical machine learning projects, you will see that the data is split into a training set and a test set, usually written as X_train, X_test, y_train, y_test. For simplicity, we will use the entire dataset as training data. For this, we first create a model:

# creating model
model = keras.Sequential([
    layers.Dense(256, activation='relu', input_shape=[13]),
    layers.Dense(128, activation='relu'),
    layers.Dense(64, activation='relu'),
    layers.Dense(1)
])

Here input_shape=[13] is the number of values I am giving as input. There are 13 columns (features) in our training data, so I pass 13 as input_shape. It could have been set dynamically as well; I wrote it this way for simplicity.

After that, I can compile and train the model. While training, we will see the model's loss gradually coming down. If you plot the loss, you can see this visually. The full code:

import pandas as pd
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np

# reading data from file
dataset = pd.read_csv('heart.csv')

# get dataset info
dataset.info()

# more about dataset
dataset.describe()

# the first few items from the dataset
dataset.head()

# labels or target value
labels = dataset['output']

# removing target data from training data
training_data = dataset.drop(columns=['output'])

# creating model
model = keras.Sequential([
    layers.Dense(256, activation='relu', input_shape=[13]),
    layers.Dense(128, activation='relu'),
    layers.Dense(64, activation='relu'),
    layers.Dense(1)
])

# model compile
model.compile(loss='mae', optimizer='adam')

# model training
history = model.fit(training_data, labels, epochs=50, batch_size=128)

# plotting loss history
history_df = pd.DataFrame(history.history)
history_df['loss'].plot();

See the notebook of the example above. When trying the code yourself, put it in separate notebook cells and run each cell on its own; that makes deep learning easier to understand.

There is a tab called Code on a Kaggle dataset's page. On that tab you will find a lot of code. Some of it is written in great detail, tutorials included. You can also use the filter to find code for a specific language or library; for example, I can filter for TensorFlow code so that it is easy to find. In this way, you can learn a lot by looking at deep learning projects.

There are many datasets on Kaggle. With these datasets, you can easily practice on your own. By looking at the code, you will also learn how others write code, how they think, and how they solve a problem. There is a lot of public data in Kaggle's Datasets tab that you can practice with.

In this tutorial, we got some of the basic ideas of deep learning, which is enough to get started. In the next post, we will look at overfitting and underfitting, dropout, transfer learning, etc.
