neural network course Archives - DexLab Analytics | Big Data Hadoop SAS R Analytics Predictive Modeling & Excel VBA

## Skewness and Kurtosis: A Definitive Guide

While dealing with data distribution, Skewness and Kurtosis are the two vital concepts that you need to be aware of. Today, we will be discussing both the concepts to help your gain new perspective.

Skewness gives an idea about the shape of the distribution of your data. It helps you identify the side towards which your data is inclined. In such a case, the plot of the distribution is stretched to one side than to the other. This means in case of skewness we can say that the mean, median and mode of your dataset are not equal and does not follow the assumptions of a normally distributed curve.

Positive skewness:- When the curve is stretched towards the right side more it is called a positively skewed curve. In this case mean is greater than median and median is the greater mode

(Mean>Median>Mode)

Let’s see how we can plot a positively skewed graph using python programming language.

• First we will have to import all the necessary libraries.

• Then let’s create a data using the following code:-

In the above code we first created an empty list and then created a loop where we are generating a data of 100 observations. The initial value is raised by 0.1 and then each observation is raised by the loop count.

• To get a visual representation of the above data we will be using the Seaborn library and to add more attributes to our graph we will use the Matplotlib methods.

In the above graph you can see that the data is stretched towards right, hence the data is positively skewed.

• Now let’s cross validate the notion that whether Mean>Median>Mode or not.

Since each observation in the dataset is unique mode cannot be calculated.

#### Calculation of skewness:

Formula:-

• In case we have the value of mode then skewness can be measured by Mode ─ Mean
• In case mode is ill-defined then skewness can be measured by 3(Mean ─ Median)
• To obtain relative measures of skewness, as in dispersion we use the following formula:-

When mode is defined:-
When mode is ill-defined:-

To calculate positive skewness using Python programming language we use the following code:-

Negative skewness:- When the curve is stretched towards left side more it is called a negatively skewed curve. In this case mean is less than median and median is  mode.

(Mean<Median<Mode)

Now let’s see how we can plot a negatively skewed graph using python programming language.

Since we have already imported all the necessary libraries we can head towards generating the data.|

In the above code instead of raising the value of observation we are reducing it.

• To visualize the data we have created again we will use the Seaborn and Matplotlib library.

The above graph is stretched towards left, hence it is negatively skewed.

• To check whether Mean<Median<Mode or not again we will be using the following code:-

The above result shows that the value of mean is less than mode and since each observation is unique mode cannot be calculated.

• Now let’s calculate skewness in Python.

#### Kurtosis

Kurtosis is nothing but the flatness or the peakness of a distribution curve.

• Platykurtic :- This kind of distribution has the smallest or the flattest peak.
• Misokurtic:- This kind of distribution has a medium peak.
• Leptokurtic:- This kind of distribution has the highest peak.

The video attached below will help you clear any query you might have.

So, this was the discussion on the Skewness and Kurtosis, at the end of this you have definitely become familiar with both concepts. Dexlab Analytics blog has informative posts on diverse topics such as neural network machine learning python which you need to explore to update yourself. Dexlab Analytics offers cutting edge courses like machine learning certification courses in gurgaon.

.

## Learn How To Do Image Recognition Using LSTM

This is a tutorial where we teach you to do image recognition using LSTM. To get to the core you have to understand that how a convolutional neural network perceives the data. In this tutorial the data we have is four-dimensional data, so, you need to convert the dataset accordingly. You can find the tutorial video attached to this blog.

Now suppose there is an image 28 by 28 pixel, if the image is black and white then there would be only one channel. So how will you put the data in CNN, it will be like the number of samples, then followed by the number of rows of the data, then the number of columns, then channels. These are the four values that need to be provided in the input layer, at the very beginning. Now, these values must be converted according to the LSTM. Now the LSTM wants the STF, like the number of samples, time steps like how many time steps back you want to go for making further prediction because LSTM is a sequence generator and the number of features. So, we will be converting the image that is the number of sample 28 by 28 one pixel into one sample of 28 by 28, that’s the only job you have to do and all you need to accomplish this is to prepare the data accordingly.

There will be no mysteries here, in fact, it is a normal neural network LSTM, that anybody can run in a most simple form, and in this tutorial, it is also run in the most simple form there is no complexity involved and only a few epochs will be run.

#### You can find the code sheet you need for this at

Also follow this video that explains the process step by step, so that you can easily grasp how LSTM can be used for the purpose of image recognition. To access more informative tutorial sessions like this follow the DexLab Analytics blog.

.

## How AI is Reshaping The Finance Industry?

Technology is bringing about rapid changes in almost every field it touches. Traditional finance tools no longer suit the current tech-friendly generation of investors who are now used to getting information, service at their fingertips. Unless the gap is bridged, it would be hard for firms to retain any clients. Some of the financial firms have already started investing in AI technology to develop a business model that satisfies the changing requirements of the customers and leverages their business.

The adoption of AI has finally enabled the firms to have access to customer-centric information to develop a plan that suits their individual financial goals and offer customer-centric solutions to offer a personalized experience.

AI is impacting the financial industry in more ways than one. Let’s take a look

#### Mitigating risks

The application of AI has enabled institutes to assess risk factors and mitigate risk. Implementation of AI tools allows the processing of a huge amount of financial records that comprise structured as well as unstructured data to recognize patterns and predict the risk factors. So, while approving a loan, for example, an institute could be better prepared as it would be able to identify those customers who are likely to default and having personnel with a background in credit risk management courses can certainly be of immense help here.

#### Detecting fraud

One of the most niggling issues faced by the banking institutes is a fraud, and with AI application being available fraud identification gets easier. When any such case happens it becomes almost impossible for institutes to recover the money. Along with that the banks especially also have to deal with false positives cases that can harm their business. Credit card fraud cases also have become rampant and give customers and banks sleepless nights. AI technology could be a great weapon in fighting and preventing such cases. By analyzing data regarding the transaction of a customer, his behavior, spending habits, past cases if any, an oddity could be easily spotted and an alarm could be sent to monitor the situation and take measures accordingly.

Investment always comes with a set of risks, the changing market scenario could certainly put your money in a volatile situation. However, with AI in place, the large datasets could be easily handled, and detecting market situations can help to make investors aware of the trends and they can change their investment decision accordingly. Faster data processing leads to quick decision making and coupled with an accurate prediction of the market situation, trading gets smarter as an investor can buy or, sell stock as per stock trends and stay risk-free.

#### Personalized banking experience

The integration of AI can offer customers a personalized financial experience. The chatbots are there to help the customers manage their affairs without needing any intervention. Be it checking balance or, scheduling payments everything is streamlined. In addition to this, the customers now have access to apps that help keep their financial transactions in check, track their investments, and plan finances without any hassle. There have been a dynamic progress in the field of NLP and the chatbots being developed now are getting smarter than ever and pursuing a natural language processing course in gurgaon, could lead to lucrative job opportunities.

#### Process Automation

Every financial institution needs to run operations with maximum efficiency while adopting cost-cutting measures. The adoption of RPA has significantly changed the way these institutes function. Manual tasks which require time and labor could easily be automated and there would be fewer chances of error. Be it data verification or, report generation every single task could be well taken care of.

#### Examples of AI implementation in finance

• Zest Automated Machine Learning (ZAML) is a platform that offers underwriting solutions. Borrowers with little or, no past credit history could be assessed.
• Kensho combines the power of NLP and cloud computing to offer analytical solutions
• Ayasdi provides anti-money laundering (AML) detection solutions to financial institutes
• Abe AI is a virtual assistant that helps users with budgeting and saving while allowing them to track spending.
• Darktrace offers cyber security solutions to financial firms

The powerful ways AI is helping the financial institutes excel in their field indicate a promising future ahead. However, the integration is slowly taking place, and still, there is some uncertainty regarding the technology. With proper training from an analytics lab could help bridge the knowledge gap and thus ensure full integration of this dynamic technology.

.

## Top Six Applications of Natural Language Processing (NLP)

Words are all around us – in the form of spoken language, texts, sound bytes and even videos. The world would have been a chaotic place had it not been for words and languages that help us communicate with each other.

Now, if we were to enhance language with the attributes of artificial intelligence, we would be working with what is known as Natural Language Processing or NLP – the confluence of artificial intelligence and computational linguistics.

In other words, “NLP is the machine’s ability to process what was said to it, structure the information received, determine the necessary response and respond in a language that we understand”.

Here is a list of popular applications of NLP in the modern world.

#### 1. Machine Translation

When a machine translates from one language to another, “we deal with “Machine” Translation. The idea behind MT is simple — to develop computer algorithms to allow automatic translation without any human intervention. The best-known application is probably Google Translate.”

#### 2. Voice and Speech Recognition

Though voice recognition technology has been around for 50 years, it is only in the last few decades, owing to NLP, have scientists achieved significant success in the field. “Now we have a whole variety of speech recognition software programs that allow us to decode the human voice,”be it in mobile telephony, home automation, hands-free computing, virtual assistance and video games.

#### 3. Sentiment Analysis

“Sentiment analysis (also known as opinion mining or emotion AI) is an interesting type of data mining that measures the inclination of people’s opinions. The task of this analysis is to identify subjective information in the text”. Companies use sentiment analysis to keep abreast of their reputation and customer satisfaction.

Question-Answering concerns building systems that “automatically answer questions posed by humans in a natural language”. The real examples of Question-Answering applications are: Siri, OK Google, chat boxes and virtual assistants.

#### 5. Automatic Summarization

Automatic Summarization is the process of creating a short, accurate, and fluent summary of a longer text document. The most important advantage of using a summary is it reduces the time taken to read a piece of text. Here are some applications – Aylien Text Analysis, MeaningCloud Summarization, ML Analyzer, Summarize Text and Text Summary.

#### 6. Chatbots

Chatbots currently operate on several channels like the Internet, web applications and messaging platforms. “Businesses today are interested in developing bots that can not only understand a person but also communicate with him at one level”.

While such applications celebrate the use of NLP in modern computing, there are some glitches that arise in systems that cannot be ignored. “The very nature of human natural language makes some NLP tasks difficult…For example, the task of automatically detecting sarcasm, irony, and implications in texts has not yet been effectively solved. NLP technologies still struggle with the complexities inherent in elements of speech such as similes and metaphors.”

To know more, do take a look at the DexLab Analytics website. DexLab Analytics is a premiere institute that trains professionals in NLP deep learning classification in Delhi.

.

## Deep Learning — Applications and Techniques

Deep learning is a subset of machine learning, a branch of artificial intelligence that configures computers to perform tasks through experience. While classic machine-learning algorithms solved many problems, they are poor at dealing with soft data such as images, video, sound files, and unstructured text.

Deep-learning algorithms solve the same problem using deep neural networks, a type of software architecture inspired by the human brain (though neural networks are different from biological neurons). Neural Networks are inspired by our understanding of the biology of our brains – all those interconnections between the neurons. But, unlike a biological brain where any neuron can connect to any other neuron within a certain physical distance, these artificial neural networks have discrete layers, connections, and directions of data propagation.

The data is inputted into the first layer of the neural network. In the first layer individual neurons pass the data to a second layer. The second layer of neurons does its task, and so on, until the final layer and the final output is produced. Each neuron assigns a weighting to its input — how correct or incorrect it is relative to the task being performed. The final output is then determined by the total of those weightings.

### Deep Learning Use Case Examples

#### Robotics

Many of the recent developments in robotics have been driven by advances in AI and deep learning. Developments in AI mean we can expect the robots of the future to increasingly be used as human assistants. They will not only be used to understand and answer questions, as some are used today. They will also be able to act on voice commands and gestures, even anticipate a worker’s next move. Today, collaborative robots already work alongside humans, with humans and robots each performing separate tasks that are best suited to their strengths.

#### Agriculture

AI has the potential to revolutionize farming. Today, deep learning enables farmers to deploy equipment that can see and differentiate between crop plants and weeds. This capability allows weeding machines to selectively spray herbicides on weeds and leave other plants untouched. Farming machines that use deep learning–enabled computer vision can even optimize individual plants in a field by selectively spraying herbicides, fertilizers, fungicides and insecticides.

#### Medical Imaging and Healthcare

Deep learning has been particularly effective in medical imaging, due to the availability of high-quality data and the ability of convolutional neural networks to classify images. Several vendors have already received FDA approval for deep learning algorithms for diagnostic purposes, including image analysis for oncology and retina diseases. Deep learning is also making significant inroads into improving healthcare quality by predicting medical events from electronic health record data.  Earlier this year, computer scientists at the Massachusetts Institute of Technology (MIT) used deep learning to create a new computer program for detecting breast cancer.

Here are some basic techniques that allow deep learning to solve a variety of problems.

#### Fully Connected Neural Networks

Fully Connected Feed forward Neural Networks are the standard network architecture used in most basic neural network applications.

In a fully connected layer each neuron is connected to every neuron in the previous layer, and each connection has its own weight. This is a totally general purpose connection pattern and makes no assumptions about the features in the data. It’s also very expensive in terms of memory (weights) and computation (connections).

Each neuron in a neural network contains an activation function that changes the output of a neuron given its input. These activation functions are:

• Linear function: – it is a straight line that essentially multiplies the input by a constant value.
•  Sigmoid function: – it is an S-shaped curve ranging from 0 to 1.
• Hyperbolic tangent (tanH) function: – it is an S-shaped curve ranging from -1 to +1
• Rectified linear unit (ReLU) function: – it is a piecewise function that outputs a 0 if the input is less than a certain value or linear multiple if the input is greater than a certain value.

Each type of activation function has pros and cons, so we use them in various layers in a deep neural network based on the problem. Non-linearity is what allows deep neural networks to model complex functions.

#### Convolutional Neural Networks

Convolutional Neural Networks (CNN) is a type of deep neural network architecture designed for specific tasks like image classification. CNNs were inspired by the organization of neurons in the visual cortex of the animal brain. As a result, they provide some very interesting features that are useful for processing certain types of data like images, audio and video.

Mainly three main types of layers are used to build ConvNet architectures: Convolutional Layer, Pooling Layer, and Fully-Connected Layer (exactly as seen in regular Neural Networks). We will stack these layers to form a full ConvNet architecture.  A simple ConvNet for CIFAR-10 classification could have the above architecture [INPUT – CONV – RELU – POOL – FC].

• INPUT [32x32x3] will hold the raw pixel values of the image, in this case an image of width 32, height 32, and with three color channels R,G,B.
• CONV layer will compute the output of neurons that are connected to local regions in the input, each computing a dot product between their weights and a small region they are connected to in the input volume. This may result in volume such as [32x32x12] if we decided to use 12 filters.
• RELU layer will apply an elementwise activation function, such as the max(0,x)max(0,x)thresholding at zero. This leaves the size of the volume unchanged ([32x32x12]).
• POOL layer will perform a downsampling operation along the spatial dimensions (width, height), resulting in volume such as [16x16x12].
• FC (i.e. fully-connected) layer will compute the class scores, resulting in volume of size [1x1x10], where each of the 10 numbers correspond to a class score, such as among the 10 categories of CIFAR-10. As with ordinary Neural Networks and as the name implies, each neuron in this layer will be connected to all the numbers in the previous volume.

In this way, ConvNets transform the original image layer by layer from the original pixel values to the final class scores. Note that some layers contain parameters and others don’t. In particular, the CONV/FC layers perform transformations that are a function of not only the activations in the input volume, but also of the parameters (the weights and biases of the neurons). On the other hand, the RELU/POOL layers will implement a fixed function. The parameters in the CONV/FC layers will be trained with gradient descent so that the class scores that the ConvNet computes are consistent with the labels in the training set for each image.

Convolution is a technique that allows us to extract visual features from an image in small chunks. Each neuron in a convolution layer is responsible for a small cluster of neurons in the receding layer. CNNs work well for a variety of tasks including image recognition, image processing, image segmentation, video analysis, and natural language processing.

#### Recurrent Neural Network

The recurrent neural network (RNN), unlike feed forward neural networks, can operate effectively on sequences of data with variable input length.

The idea behind RNNs is to make use of sequential information. In a traditional neural network we assume that all inputs (and outputs) are independent of each other. But for many tasks that is a very bad idea. If you want to predict the next word in a sentence you better know which words came before it. RNNs are called recurrent because they perform the same task for every element of a sequence, with the output being depended on the previous computations. Another way to think about RNNs is that they have a “memory” which captures information about what has been calculated so far. This is essentially like giving a neural network a short-term memory. This feature makes RNNs very effective for working with sequences of data that occur over time, For example, the time-series data, like changes in stock prices, a sequence of characters, like a stream of characters being typed into a mobile phone.

The two variants on the basic RNN architecture that help solve a common problem with training RNNs are Gated RNNs, and Long Short-Term Memory RNNs (LSTMs). Both of these variants use a form of memory to help make predictions in sequences over time. The main difference between a Gated RNN and an LSTM is that the Gated RNN has two gates to control its memory: an Update gate and a Reset gate, while an LSTM has three gates: an Input gate, an Output gate, and a Forget gate.

RNNs work well for applications that involve a sequence of data that change over time. These applications include natural language processing, speech recognition, language translation, image captioning and conversation modeling.

#### Conclusion

So this article was about various Deep Learning techniques. Each technique is useful in its own way and is put to practical use in various applications daily. Although deep learning is currently the most advanced artificial intelligence technique, it is not the AI industry’s final destination. The evolution of deep learning and neural networks might give us totally new architectures. Which is why more and more institutes are offering courses on AI and Deep Learning across the world and in India as well. One of the best and most competent artificial intelligence certification in Delhi NCR is DexLab Analytics. It offers an array of courses worth exploring.

.

## Computer Vision and Image Classification -A study

Computer vision is the field of computer science that focuses on replicating parts of the complexity of the human vision system and enabling computers to identify and process objects in images and videos in the same way that humans do. With computer vision, our computer can extract, analyze and understand useful information from an individual image or a sequence of images. Computer vision is a field of artificial intelligence that works on enabling computers to see, identify and process images in the same way that human vision does, and then provide the appropriate output.

Initially computer vision only worked in limited capacity but due to advance innovations in deep learning and neural networks, the field has been able to take great leaps in recent years and has been able to surpass humans in some tasks related to detecting and labeling objects.

#### The Contribution of Deep Learning in Computer Vision

While there are still significant obstacles in the path of human-quality computer vision, Deep Learning systems have made significant progress in dealing with some of the relevant sub-tasks. The reason for this success is partly based on the additional responsibility assigned to deep learning systems.

It is reasonable to say that the biggest difference with deep learning systems is that they no longer need to be programmed to specifically look for features. Rather than searching for specific features by way of a carefully programmed algorithm, the neural networks inside deep learning systems are trained. For example, if cars in an image keep being misclassified as motorcycles then you don’t fine-tune parameters or re-write the algorithm. Instead, you continue training until the system gets it right.

With the increased computational power offered by modern-day deep learning systems, there is steady and noticeable progress towards the point where a computer will be able to recognize and react to everything that it sees.

#### Application of Computer Vision

The field of Computer Vision is too expansive to cover in depth.  The techniques of computer vision can help a computer to extract, analyze, and understand useful information from a single or a sequence of images. There are many advanced techniques like style transfer, colorization, action recognition, 3D objects, human pose estimation, and much more but in this article we will only focus on the commonly used techniques of computer vision. These techniques are: –

• Image Classification
• Image Classification with Localization
• Object Segmentation
• Object Detection

So in this article we will go through all the above techniques of computer vision and we will also see how deep learning is used for the various techniques of computer vision in detail. To avoid confusion we will distribute this article in a series of multiple blogs. In first blog we will see the first technique of computer vision which is Image Classification and we will also explore that how deep learning is used in Image Classification.

#### Image Classification

Image classification is the process of predicting a specific class, or label, for something that is defined by a set of data points. Image classification is a subset of the classification problem, where an entire image is assigned a label. Perhaps a picture will be classified as a daytime or nighttime shot. Or, in a similar way, images of cars and motorcycles will be automatically placed into their own groups.

There are countless categories, or classes, in which a specific image can be classified. Consider a manual process where images are compared and similar ones are grouped according to like-characteristics, but without necessarily knowing in advance what you are looking for. Obviously, this is an onerous task. To make it even more so, assume that the set of images numbers in the hundreds of thousands. It becomes readily apparent that an automatic system is needed in order to do this quickly and efficiently.

There are many image classification tasks that involve photographs of objects. Two popular examples include the CIFAR-10 and CIFAR-100 datasets that have photographs to be classified into 10 and 100 classes respectively.

#### Deep learning for Image Classification

The deep learning architecture for image classification generally includes convolutional layers, making it a convolutional neural network (CNN). A typical use case for CNNs is where you feed the network images and the network classifies the data. CNNs tend to start with an input “scanner” which isn’t intended to parse all the training data at once. For example, to input an image of 100 x 100 pixels, you wouldn’t want a layer with 10,000 nodes.

Rather, you create a scanning input layer of say 10 x 10 which you feed the first 10 x 10 pixels of the image. Once you passed that input, you feed it the next 10 x 10 pixels by moving the scanner one pixel to the right. This technique is known as sliding windows.

#### Following Layers are used to build Convolutional Neural Networks:

• INPUT [32x32x3] will hold the raw pixel values of the image, in this case an image of width 32, height 32, and with three color channels R,G,B.
• CONV layer will compute the output of neurons that are connected to local regions in the input, each computing a dot product between their weights and a small region they are connected to in the input volume. This may result in volume such as [32x32x12] if we decided to use 12 filters.
• RELU layer will apply an element wise activation function, such as the max(0,x)max(0,x)thresholding at zero. This leaves the size of the volume unchanged ([32x32x12]).
• POOL layer will perform a downsampling operation along the spatial dimensions (width, height), resulting in volume such as [16x16x12].
• FC (i.e. fully-connected) layer will compute the class scores, resulting in volume of size [1x1x10], where each of the 10 numbers correspond to a class score, such as among the 10 categories of CIFAR-10. As with ordinary Neural Networks and as the name implies, each neuron in this layer will be connected to all the numbers in the previous volume.

#### Output of the Model History

In this way, ConvNets transform the original image layer by layer from the original pixel values to the final class scores. Note that some layers contain parameters and other don’t. In particular, the CONV/FC layers perform transformations that are a function of not only the activations in the input volume, but also of the parameters (the weights and biases of the neurons). On the other hand, the RELU/POOL layers will implement a fixed function. The parameters in the CONV/FC layers will be trained with gradient descent so that the class scores that the ConvNet computes are consistent with the labels in the training set for each image.

#### Conclusion

The above content focuses on image classification only and the architecture of deep learning used for it. But there is more to computer vision than just classification task. The detection, segmentation and localization of classified objects are equally important. We will see these in next blog.

## Decoding Advanced Loss Functions in Machine Learning: A Comprehensive Guide

Every Machine Learning algorithm (Model) learns by the process of optimizing the loss functions. The loss function is a method of evaluating how accurate the given prediction is made. If predictions are off, then loss function will output a higher number. If they’re pretty good, it’ll output a lower number. If someone makes changes in the algorithm to improve the model, loss function will show the path in which one should proceed.

Machine Learning is growing as fast as ever in the age we are living, with a host of comprehensive Machine Learning course in India pacing their way to usher the future. Along with this, a wide range of courses like Machine Learning Using Python, Neural Network Machine Learning Python is becoming easily accessible to the masses with the help of Machine Learning institute in Gurgaon and similar institutes.

We are having different types of loss functions.

• Regression Loss Functions
• Binary Classification Loss Functions
• Multi-class Classification Loss Functions

#### Regression Loss Functions

1. Mean Squared Error
2. Mean Absolute Error
3. Huber Loss Function

#### Binary Classification Loss Functions

1. Binary Cross-Entropy
2. Hinge Loss

#### Multi-class Classification Loss Functions

1. Multi-class Cross Entropy Loss
2. Kullback Leibler Divergence Loss

#### Mean Squared Error

Mean squared error is used to measure the average of the squared difference between predictions and actual observations. It considers the average magnitude of error irrespective of their direction.

This expression can be defined as the mean value of the squared deviations of the predicted values from that of true values. Here ‘n’ denotes the total number of samples in the data.

#### Mean Absolute Error

Absolute Error for each training example is the distance between the predicted and the actual values, irrespective of the sign.

### MAE = | y-f(x) |

Absolute Error is also known as the L1 loss. The MAE cost is more robust to outliers as compared to MSE.

#### Huber Loss

Huber loss is a loss function used in robust regression. This is less sensitive to outliers in data than the squared error loss. The Huber loss function describes the penalty incurred by an estimation procedure f. Huber (1964) defines the loss function piecewise by:

This function is quadratic for small values of a, and linear for large values, with equal values and slopes of the different sections at the two points where |a|= 𝛿. The variable “a” often refers to the residuals, that is to the difference between the observed and predicted values a=y-f(x), so the former can be expanded to: –

#### Binary Classification Loss Functions

Binary classifications are those predictive modelling problems where examples are assigned one of two labels.

#### Binary Cross-Entropy

Cross-Entropy is the loss function used for binary classification problems. It is intended for use with binary classification.

Mathematically, it is the preferred loss function under the inference framework of maximum likelihood. Cross-entropy will calculate a score that summarizes the average difference between the actual and predicted probability distributions for predicting class 1. The score is minimized and a perfect cross-entropy value is 0.

#### Hinge Loss

The hinge loss function is popular with Support Vector Machines (SVMs). These are used for training the classifiers,

### l(y) = max(0, 1- t•y)

where ‘t’ is the intended output and ‘y’ is the classifier score.

Hinge loss is convex function but is not differentiable which reduces its options for minimizing with few methods.

#### Multi-Class Classification Loss Functions

Multi-Class classifications are those predictive modelling problems where examples are assigned one of more than two classes.

#### Multi-Class Cross-Entropy

Cross-Entropy is the loss function used for multi-class classification problems. It is intended for use with multi-class classification.

Mathematically, it is the preferred loss function under the inference framework of maximum likelihood. Cross-entropy will calculate a score that summarizes the average difference between the actual and predicted probability distributions for all classes. The score is minimized and a perfect cross-entropy value is 0.

#### Kullback Leibler Divergence Loss

KL divergence is a natural way to measure the difference between two probability distributions.

A KL divergence loss of 0 suggests the distributions are identical. In practice, the behaviour of KL Divergence is very similar to cross-entropy. It calculates how much information is lost (in terms of bits) if the predicted probability distribution is used to approximate the desired target probability distribution.

There are also some advanced loss functions for machine learning models which are used for specific purposes.

1. Robust Bi-Tempered Logistic Loss based on Bregman Divergences
2. Minimax loss for GANs
3. Focal Loss for Dense Object Detection
4. Intersection over Union (IoU)-balanced Loss Functions for Single-stage Object Detection
5. Boundary loss for highly unbalanced segmentation
6. Perceptual Loss Function

#### Robust Bi-Tempered Logistic Loss based on Bregman Divergences

In this loss function, we introduce a temperature into the exponential function and replace the softmax output layer of the neural networks by a high-temperature generalization. Similarly, the logarithm in the loss we use for training is replaced by a low-temperature logarithm. By tuning the two temperatures, we create loss functions that are non-convex already in the single-layer case. When replacing the last layer of the neural networks by our bi-temperature generalization of the logistic loss, the training becomes more robust to noise. We visualize the effect of tuning the two temperatures in a simple setting and show the efficacy of our method on large datasets. Our methodology is based on Bregman divergences and is superior to a related two-temperature method that uses the Tsallis divergence.

#### Minimax loss for GANs

Minimax GAN loss refers to the minimax simultaneous optimization of the discriminator and generator models.

Minimax refers to an optimization strategy in two-player turn-based games for minimizing the loss or cost for the worst case of the other player.

For the GAN, the generator and discriminator are the two players and take turns involving updates to their model weights. The min and max refer to the minimization of the generator loss and the maximization of the discriminator’s loss.

#### Focal Loss for Dense Object Detection

The Focal Loss is designed to address the one-stage object detection scenario in which there is an extreme imbalance between foreground and background classes during training (e.g., 1:1000). Therefore, the classifier gets more negative samples (or more easy training samples to be more specific) compared to positive samples, thereby causing more biased learning.

The large class imbalance encountered during the training of dense detectors overwhelms the cross-entropy loss. Easily classified negatives comprise the majority of the loss and dominate the gradient. While the weighting factor (alpha) balances the importance of positive/negative examples, it does not differentiate between easy/hard examples. Instead, we propose to reshape the loss function to down-weight easy examples and thus, focus training on hard negatives. More formally, we propose to add a modulating factor (1 − pt) γ to the cross-entropy loss, with tunable focusing parameter γ ≥ 0.

We define the focal loss as

### FL(pt) = −(1 − pt) γ log(pt)

#### Intersection over Union (IoU)-balanced Loss Functions for Single-stage Object Detection

The IoU-balanced classification loss focuses on positive scenarios with high IoU can increase the correlation between classification and the task of localization. The loss aims at decreasing the gradient of the examples with low IoU and increasing the gradient of examples with high IoU. This increases the localization accuracy of models.

#### Boundary loss for highly unbalanced segmentation

Boundary loss takes the form of a distance metric on the space of contours (or shapes), not regions. This can mitigate the difficulties of regional losses in the context of highly unbalanced segmentation problems because it uses integrals over the boundary (interface) between regions instead of unbalanced integrals over regions. Furthermore, a boundary loss provides information that is complementary to regional losses. Unfortunately, it is not straightforward to represent the boundary points corresponding to the regional softmax outputs of a CNN. Our boundary loss is inspired by discrete (graph-based) optimization techniques for computing gradient flows of curve evolution.

Following an integral approach for computing boundary variations, we express a non-symmetric L2L2 distance on the space of shapes as a regional integral, which avoids completely local differential computations involving contour points. This yields a boundary loss expressed with the regional softmax probability outputs of the network, which can be easily combined with standard regional losses and implemented with any existing deep network architecture for N-D segmentation. We report comprehensive evaluations on two benchmark datasets corresponding to difficult, highly unbalanced problems: the ischemic stroke lesion (ISLES) and white matter hyperintensities (WMH). Used in conjunction with the region-based generalized Dice loss (GDL), our boundary loss improves performance significantly compared to GDL alone, reaching up to 8% improvement in Dice score and 10% improvement in Hausdorff score. It also yielded a more stable learning process.

#### Perceptual Loss Function

We consider image transformation problems, where an input image is transformed into an output image. Recent methods for such problems typically train feed-forward convolutional neural networks using a \emph{per-pixel} loss between the output and ground-truth images. Parallel work has shown that high-quality images can be generated by defining and optimizing \emph{perceptual} loss functions based on high-level features extracted from pre-trained networks. We combine the benefits of both approaches and propose the use of perceptual loss functions for training feed-forward networks for image transformation tasks. We show results on image style transfer, where a feed-forward network is trained to solve the optimization problem proposed by Gatys et al in real-time. Compared to the optimization-based method, our network gives similar qualitative results but is three orders of magnitude faster. We also experiment with single-image super-resolution, where replacing a per-pixel loss with a perceptual loss gives visually pleasing results.

#### Conclusion

Loss function takes the algorithm from theoretical to practical and transforms neural networks from matrix multiplication into deep learning. In this article, initially, we understood how loss functions work and then, we went on to explore a comprehensive list of loss functions also we have seen the very recent — advanced loss functions.

References: –

https://arxiv.org
https://www.wikipedia.org

## What is a Neural Network?

Before we get started with the process of building a Neural Network, we need to understand first what a Neural Network is.

A neural network is a collection of neurons connected by synapses. This collection is organized into three main layers: the input layer, the hidden layer, and the output layer.

In an artificial neural network, there are several inputs, which are called features, producing a single output, known as a label.

#### Analogy between Human Mind and Neural Network

Scientists believe that a living creature’s brain processes information through the use of a biological neural network. The human brain has as many as 100 trillion synapses – gaps between neurons – which form specific patterns when activated.

In the field of Deep Learning, a neural network is represented by a series of layers that work much like a living brain’s synapses. It is becoming a popular course now, with an array of career opportunities. Thus, Deep learning Certification in Gurgaon is a must for everyone.

Scientists use neural networks to teach computers how to do things for themselves. The whole concept of Neural network and its varied applications are pretty interesting. Moreover, with the matchless Neural Networks Training in Delhi, you need not look any further.

There are numerous kinds of deep learning and neural networks:

1. Feedforward Neural Network – Artificial Neuron
2. Radial basis function Neural Network
3. Kohonen Self Organizing Neural Network
4. Recurrent Neural Network (RNN) – Long Short Term Memory
5. Convolutional Neural Network
6. Modular Neural Network

#### Working of a Simple Feedforward Neural Network

1. It takes inputs as a matrix (2D array of numbers).
2. Multiplies the input by a set weight (performs a dot product aka matrix multiplication).
3. Applies an activation function.
4. Returns an output.
5. Error is calculated by taking the difference from the desired output from the data and the predicted output. This creates our gradient descent, which we can use to alter the weights.
6. The weights are then altered slightly according to the error.
7. To train, this process is repeated 1,000+ times. The more the data is trained upon, the more accurate our outputs will be.

#### Implementation of a Neural Network with Python and Keras

Keras has two types of models:

• Sequential model
• The model class used with functional API

Sequential model is probably the most used feature of Keras. Primarily, it represents the array of Keras Layers. It is convenient and builds different types of Neural Networks really quick, just by adding layers to it. Keras also has different types of Layers like Dense Layers, Convolutional Layers, Pooling Layers, Activation Layers, Dropout Layers etc.

The most basic layer is Dense Layer. It has many options for setting the inputs, activation function and so on. So, let’s see how one can build a Neural Network using Sequential and Dense.

First, let’s import the necessary code from Keras:

After this step, the model is ready for compilation. The compilation step asks to define the loss function and the kind of optimizer which should be used. These options depend on the problem one is trying to solve.

Now, the model is ready to get trained. Thus, the parameters get tuned to provide the correct outputs for a given input. This can be done by feeding inputs at the input layer and then, getting an output.

After this one can calculate the loss function using the output and use backpropagation to tune the model parameters. This will fit the model parameters to the data.

Output of the above cell:-

This output shows the loss decrease and the accuracy increase over time. At this point, one can experiment with the hyper-parameters and neural network architecture to get the best accuracy.

After getting the final model architecture, one can now take the model and use feed-forward passes and predict inputs. To start making predictions, one can use the testing dataset in the model that has been created previously. Keras enables one to make predictions by using the .predict() function.

#### Some points to be remembered while building a strong Neural Network

1. Adding Regularization to Fight Over-Fitting

The predictive models mentioned above are prone to a problem of overfitting. This is a scenario whereby the model memorizes the results in the training set and isn’t able to generalize on data that it hasn’t seen.

In neural networks, regularization is the technique that fights overfitting by adding a layer in the neural network. It can be done in 3 ways:

• L1 Regularization
• L2 Regularization
• Dropout Regularization

Out of these, Dropout is a commonly used regularization technique. In every iteration, it adds a Dropout layer in the neural network and thereby, deactivates some neurons. The process of deactivating neurons is usually random.

2. Hyperparameter Tuning

Grid search is a technique that you can use to experiment with different model parameters to obtain the ones that give you the best accuracy. This is done by trying different parameters and returning those that give the best results. It helps in improving model accuracy.

#### Conclusion

Neural Network is coping with the fast pace of the technology of the age remarkably well and thereby, inducing the necessity of courses like Neural Network Machine Learning PythonNeural Networks in Python course and more. Though these advanced technologies are just at their nascent stage, they are promising enough to lead the way to the future.

In this article, Building and Training our Neural Network is shown. This simple Neural Network can be extended to Convolutional Neural Network and Recurrent Neural Network for more advanced applications in Computer Vision and Natural Language Processing respectively.

Reference Blogs:

https://keras.rstudio.com

## Deep Learning and its Progress as Discussed at Intel’s AI Summit

At the latest AI summit organized by Intel, Mr. Naveen Rao, Vice President and General Manager of Intel’s AI Products Group, focused on the most vibrant age of computing that is the present age we are living. According to Rao, the widespread and sudden growth of neural networks is putting the capability of the hardware into a real test. Therefore, we now have to reflect deeply on “how processing, network, and memory work together” to figure a pragmatic solution, he said.

The storage of data has seen countless improvements in the last 20 years. We can now boast of our prowess of handling considerably large sets of data, with greater computing capability in a single place. This led to the expansion of the neural network models with an eye on the overall progress in neural Network Machine Learning Python and computing in general.

With the onset of exceedingly large data sets to work with, Deep learning for Computer Vision Course and the other models of Deep Learning to recognize speech, images, and text are extensively feeding on them. The technological giants were undoubtedly the early birds to grab the technical: the hardware and the software configuration to have an edge on the others.

Surely, Deep Learning is on its peak now, where computers can identify the images with incredible vividness. On the other hand, chatbots can carry on with almost natural conversations with us. It is no wonder that the Deep learning Training Institutes all over the world are jumping in the race to bring all of these new technologies efficiently to the general mass.

#### The Big Problem

We are living in the dynamic age of AI and Machine Learning, with the biggies like Google, Facebook, and its peers, having the technical skills and configuration to take up the challenges. However, the neural networks have fattened up so much lately that it has already started to give the hardware a tough time, getting the better of them all the time.

The number of parameters of the Neural network models is increasing as never before. They are “actually increasing on the order of 10x year on year”, as per Rao. Thus, it is a wall looming in AI. Though Intel is trying its best to tackle this obvious wall, which might otherwise give the industry a severe setback, with extensive research to bring new chip architectures and memory technologies into play, it cannot solve the AI processing problem single-handedly. Rao concluded on a note of requesting the partners in the present competitive scenario.

+91 931 572 5902