Covid-19 Archives - DexLab Analytics | Credit Risk | Market Risk | SAS Python Machine Learning Modeling

Time Series Analysis Part I

Posted on January 18, 2021 by Dexlab

A time series is a sequence of numerical data in which each item is associated with a particular instant in time. Many sets of data appear as time series: a monthly sequence of the quantity of goods shipped from a factory, a weekly series of the number of road accidents, daily rainfall amounts, hourly observations made on the yield of a chemical process, and so on. Examples of time series abound in such fields as economics, business, engineering, the natural sciences (especially geophysics and meteorology), and the social sciences.

Univariate time series analysis: the analysis of a single sequence of data observed over time.
Multivariate time series analysis: the analysis of several sets of data observed over the same sequence of time periods.

In time series analysis, each observation is treated as a random variable Y_t, where t denotes time. A collection of such random variables ordered in time is called a random or stochastic process.

Stationary: A time series is said to be stationary when the moments of its probability distribution (mean, variance, covariance and so on) are invariant over time. Forecasting is comparatively easy in this situation, because patterns found in one period can be expected to hold in other periods.

Non-stationary: A non-stationary time series has a time-varying mean, a time-varying variance, or both, so its behaviour in one period cannot be generalized to other time periods.
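
The difference between the two definitions above can be made concrete with a rough check: split a series into two halves and compare the mean and variance of each half. This is only an informal sketch, not a formal stationarity test, and the function name `half_sample_stats`, the simulated series and the trend slope of 0.01 are all assumptions chosen for illustration.

```python
import numpy as np

def half_sample_stats(series):
    """Return (mean, variance) of the first half and of the second half."""
    series = np.asarray(series, dtype=float)
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return (first.mean(), first.var()), (second.mean(), second.var())

rng = np.random.default_rng(0)

# Simulated white noise: constant mean and variance, so it is stationary.
white_noise = rng.normal(loc=0.0, scale=1.0, size=2000)

# The same noise plus a linear trend: the mean changes over time.
trending = white_noise + 0.01 * np.arange(2000)

(wn_m1, wn_v1), (wn_m2, wn_v2) = half_sample_stats(white_noise)
(tr_m1, tr_v1), (tr_m2, tr_v2) = half_sample_stats(trending)

print("white noise means:", wn_m1, wn_m2)
print("trending means:   ", tr_m1, tr_m2)
```

For the white noise the two half-sample means should be close to each other, while for the trending series the second half has a clearly higher mean.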

Non-stationary processes can be illustrated with random walk models. The term is commonly used in the context of stock markets, where random walk theory assumes that successive changes in stock prices are independent of each other over time. There are two types of random walk:
Random walk with drift: The value at time t is equal to the previous period’s value plus a constant drift (α) and a residual term (ε_t). It can be written as
Y_t = α + Y_{t-1} + ε_t
The equation shows that Y_t drifts upwards or downwards depending on whether α is positive or negative. The mean changes by α each period, and the variance increases over time.
Random walk without drift: The value at time t is equal to the previous period’s value plus a random shock.
Y_t = Y_{t-1} + ε_t
To see the cumulative effect of the shocks, suppose the process started at time 0 with a value of Y_0.
When t = 1
Y_1 = Y_0 + ε_1
When t = 2
Y_2 = Y_1 + ε_2 = Y_0 + ε_1 + ε_2
In general,
Y_t = Y_0 + ∑_{i=1}^{t} ε_i
If the shocks are independent with mean zero and constant variance σ², the mean of Y_t is equal to its starting value Y_0, whereas its variance is tσ² and increases indefinitely as t increases. Therefore the random walk model without drift is a non-stationary process.
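
The behaviour of the mean and the variance can be checked by simulation. The sketch below generates many random walk paths, with and without drift, and looks at the mean and variance across paths at a few time points. The function name `simulate_random_walks`, the normally distributed shocks, and the chosen drift of 0.5 are assumptions made for illustration only.

```python
import numpy as np

def simulate_random_walks(n_paths, n_steps, drift=0.0, sigma=1.0, y0=0.0, seed=0):
    """Simulate Y_t = drift + Y_{t-1} + eps_t for t = 1..n_steps.

    Returns an array of shape (n_paths, n_steps); column t-1 holds Y_t.
    Shocks are assumed to be independent normal draws with variance sigma**2.
    """
    rng = np.random.default_rng(seed)
    shocks = rng.normal(0.0, sigma, size=(n_paths, n_steps))
    # Cumulative sum of (drift + shock) gives Y_t - Y_0 for each path.
    return y0 + np.cumsum(drift + shocks, axis=1)

no_drift = simulate_random_walks(n_paths=5000, n_steps=200, drift=0.0)
with_drift = simulate_random_walks(n_paths=5000, n_steps=200, drift=0.5)

for t in (10, 50, 200):
    col = t - 1
    print(
        f"t={t:3d}  no drift: mean={no_drift[:, col].mean():7.2f} "
        f"var={no_drift[:, col].var():7.2f}  "
        f"with drift: mean={with_drift[:, col].mean():7.2f}"
    )
```

Without drift the mean across paths should stay near Y_0 = 0 while the variance grows roughly in proportion to t; with drift the mean should also move by about α per period.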

So, with that we come to the end of this discussion of time series. Hopefully it helped you understand time series; for more information, you can also watch the video tutorial attached below this blog. DexLab Analytics offers machine learning courses in Delhi. To keep learning, follow the DexLab Analytics blog.

Complete study of COVID-19 in India (Part II) – Laboratory and Testing

Posted on May 13, 2020 (updated May 23, 2020) by Dexlab

The first case in India of the 2019-2020 coronavirus pandemic, which originated in China, was reported on January 30, 2020. Experts suggest the number of infections could be much higher, as India’s testing rates are among the lowest in the world. The infection rate of COVID-19 in India is reported to be 1.7, significantly lower than in the worst affected countries.

Michael Ryan, executive director of the World Health Organisation’s Health Emergencies Programme, said that India has “tremendous capacity” to deal with the coronavirus outbreak and, as the second most populous country, will have an enormous impact on the world’s ability to deal with it.

DexLab Analytics, in the first part of this blog series, studied the statewise breakup of COVID-19 cases in India through a Jupyter Notebook. Libraries were imported, maps were drawn and data was taken from Kaggle.

The data and code sheet can be found below.

In this part of the blog series we will study how states are performing with regard to laboratories and testing. First we make three data sets: confirmed cases, recovered cases and deaths.

We first plot this data on a graph and study it carefully. Then we make a pivot table and study the data. We also study how many tests each state has performed. Kerala is found to have done the maximum number of tests (Fig. 1.).
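
The steps above can be sketched with pandas as follows. The records, the column names and the state labels are made-up placeholders, not the Kaggle data used in the video, and the test counts are assumed to be daily rather than cumulative figures.

```python
import pandas as pd

# Placeholder records for illustration only; these are NOT real figures.
records = pd.DataFrame({
    "date": ["2020-05-01", "2020-05-01", "2020-05-02", "2020-05-02"],
    "state": ["State A", "State B", "State A", "State B"],
    "confirmed": [10, 4, 15, 6],
    "recovered": [2, 1, 5, 2],
    "deaths": [0, 0, 1, 0],
    "tests": [100, 300, 150, 400],  # assumed to be tests done on that day
})

# Three separate data sets: confirmed cases, recovered cases and deaths.
confirmed = records[["date", "state", "confirmed"]]
recovered = records[["date", "state", "recovered"]]
deaths = records[["date", "state", "deaths"]]

# Pivot table: one row per date, one column per state.
confirmed_pivot = records.pivot_table(
    index="date", columns="state", values="confirmed", aggfunc="sum"
)

# Total tests per state, largest first.
tests_by_state = (
    records.groupby("state")["tests"].sum().sort_values(ascending=False)
)

print(confirmed_pivot)
print(tests_by_state)
```

Sorting the totals in descending order puts the state with the most tests in the first row, which is the kind of ranking the bar graph in the video presents.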

Fig. 1.

The purpose of this video is to teach you how to draw graphs in Python. Next we try to find out why some states are testing so little. Could it be that they have fewer labs in the first place? We get a graph (Fig. 2.) that shows how many labs each state has for testing COVID-19 samples.

Fig. 2.

For the complete study, watch the video attached herewith. This study was brought to you by DexLab Analytics. DexLab Analytics is a premier Artificial Intelligence training institute in Gurgaon.

Complete Statewise Study on COVID-19 in India (Part I)

Posted on May 11, 2020 (updated July 7, 2020) by Dexlab

The first case in India of the 2019-2020 coronavirus pandemic, which originated in China, was reported on January 30, 2020. Experts suggest the number of infections could be much higher, as India’s testing rates are among the lowest in the world. The infection rate of COVID-19 in India is reported to be 1.7, significantly lower than in the worst affected countries.

Michael Ryan, executive director of the World Health Organisation’s Health Emergencies Programme, said that India has “tremendous capacity” to deal with the coronavirus outbreak and, as the second most populous country, will have an enormous impact on the world’s ability to deal with it.

Other commentators worried about the economic devastation caused by the lockdown, which has had huge effects on informal workers, micro and small enterprises, farmers and self-employed people, who are left without a livelihood in the absence of transportation and access to markets.

The government and other agencies justified the lockdown as a pre-emptive measure to prevent India from entering a higher stage of the outbreak, which could make handling it very difficult and cause even more losses thereafter. According to a study by Shiv Nadar University, India could have witnessed a surge of 31,000 cases between March 24 and April 14 without the lockdown.

So we use a Jupyter Notebook in Python to study India’s COVID-19 story.

The data and code sheet used in this study can be found below.

We will first import the libraries we need, such as pandas and numpy. All the data has been taken from Kaggle. We then load the data into a dataframe and generate a map of India to study the spread of SARS-CoV-2.

Fig. 1.

For more on this, please watch the complete video attached herewith. This study was brought to you by DexLab Analytics. DexLab Analytics is a premier Artificial Intelligence training institute in Gurgaon.

A Deep Dive Into The US Healthcare System in New York

Posted on May 7, 2020 (updated May 23, 2020) by Dexlab

Unlike India’s healthcare system, in which both public and private entities deliver healthcare to citizens, the US healthcare sector is largely privatised.

The aim of this notebook is to study some of the numerical data we have for the US, and especially for New York, whose situation is widely known to be among the worst in the world.

Therefore, analysing data may clarify a few things. We will be using three sets of data – urgent care facilities, US county healthcare rankings 2020 and Covid sources for counties.

For the data and codesheet click below.

Now list the column names with the ‘.keys’ method and pick the key columns for your study. We are interested in a few variables from the health rankings, so we copy only the ones we think will be useful into a new data frame.
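
A minimal sketch of this step, assuming a pandas data frame: `DataFrame.keys()` returns the column labels, and indexing with a list of names yields a new frame with only those columns. The frame `health_rankings` and every column name in it are hypothetical placeholders, not the columns of the actual rankings file.

```python
import pandas as pd

# Hypothetical stand-in for the county health rankings data.
health_rankings = pd.DataFrame({
    "county": ["County A", "County B"],
    "state": ["NY", "NY"],
    "population": [1000, 2000],
    "pct_uninsured": [5.0, 7.5],
    "unused_metric": [0.1, 0.2],
})

# .keys() lists the column labels of the data frame.
print(list(health_rankings.keys()))

# Keep only the columns of interest in a new, independent data frame.
columns_of_interest = ["county", "state", "population", "pct_uninsured"]
selected = health_rankings[columns_of_interest].copy()

print(selected.head())
```

Calling `.copy()` means later changes to `selected` do not touch the original frame.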

We will study the data sets one by one so that we understand each of them before combining them. For this we call the plotly library, which produces interactive graphs. We use a choropleth to generate a heat map over the country in question.

Fig. 1.

It is clear from the heat map that New York has a very high incidence of infections compared with other states. We then begin working with data on the number of ICU beds in each state. Since the states have different populations, we cannot compare the absolute number of ICU beds. We need the ratio of ICU beds to a given number of inhabitants.
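
The normalisation described above could look like the following sketch. The scale of 100,000 inhabitants, the helper name `icu_beds_per_100k`, the column names and the figures are all assumptions for illustration; the video may use a different scale.

```python
import pandas as pd

def icu_beds_per_100k(df, beds_col="icu_beds", pop_col="population"):
    """Return ICU beds per 100,000 inhabitants for each row."""
    return df[beds_col] / df[pop_col] * 100_000

# Placeholder figures, not real state data.
states = pd.DataFrame({
    "state": ["State A", "State B"],
    "icu_beds": [500, 300],
    "population": [5_000_000, 1_000_000],
})

states["icu_per_100k"] = icu_beds_per_100k(states)
print(states)
```

In this made-up example State A has more ICU beds in absolute terms, but State B has three times as many beds per inhabitant, which is why the ratio is the fairer comparison.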

Fig. 2.

The generated heat map (Fig. 2.) shows the ICU density per state in the US. For more on this do watch the complete video tutorial attached herewith.

This tutorial was brought to you by DexLab Analytics. DexLab Analytics is a premier data analyst training institute in Gurgaon.

Covid-19 – Key Insights through Exploration of Data (Part – II)

Posted on May 4, 2020 (updated May 23, 2020) by Dexlab

This video tutorial is on exploratory data analysis. The data is on COVID-19 cases and it has been taken from Kaggle. This tutorial is based on simple visualization of COVID-19 cases.

For code sheet and data click below.

First, we import the libraries we need in Python. Then we load the data we will be working on.

Next, we explore pandas. It is useful to know its data structures: the Series and the DataFrame, plus the Panel, which has been removed from recent versions of pandas. In this tutorial we will be using data frames.
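
As a brief illustration of the two structures in everyday use, a Series is a single labelled column and a DataFrame is a table of such columns sharing one index. The hospital labels and bed counts below are invented placeholders, not values from the Kaggle data.

```python
import pandas as pd

hospitals = ["Hospital A", "Hospital B", "Hospital C"]

# A Series: one-dimensional, labelled data.
beds = pd.Series([120, 80, 45], index=hospitals, name="total_beds")

# A DataFrame: two-dimensional, with several columns sharing one index.
frame = pd.DataFrame(
    {"total_beds": [120, 80, 45], "available_beds": [30, 10, 5]},
    index=hospitals,
)

# Selecting one column of a DataFrame gives back a Series.
column = frame["total_beds"]

print(beds)
print(frame)
print(type(column).__name__)
```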

Fig. 1.

Now we will plot the data we have onto a graph. When we run the program, we get a graph that shows total hospital beds, potentially available hospital beds and available hospital beds.

Fig. 2.

While visualizing data we must remember to keep the plot as simple as possible. If there are too many data columns, the graph becomes hard to interpret, which is something we do not want.

Fig. 3.

A scatter plot (Fig. 3.) is also generated to show the individual data points. We study the behaviour of the data on the plot.

For more on this, view the video attached herewith, and keep practising with data from Kaggle. This tutorial was brought to you by DexLab Analytics. DexLab Analytics is a premier data analyst training institute in Gurgaon.

The Impact of Latitude on the Spread of COVID-19 (Part-I)

Posted on April 29, 2020 (updated May 4, 2020) by Dexlab

The COVID-19 pandemic has hit us hard as a people and forced us to bow down to the vagaries of nature. As of April 29, 2020, the number of persons infected stands at 31,39,523 while the number of persons dead stands at 2,18,024 globally.

This essay looks at geographical variations in the mortality rate of the COVID-19 epidemic. With the help of data sets on Kaggle, it explores a specific range of latitudes along which a rapid spread of the infection has been detected. The findings are Dexlab Analytics’ own. Dexlab Analytics is a premier institute that trains professionals in Python for data analysis.

For the code sheet and data used in this study, click below.

The instructor has imported the required Python libraries and visualised the data hosted on Kaggle through a heat map. The data is listed by country code and latitude, and there is a separate data set with figures for the USA alone.

Fig. 1.

The instructor has compared data across countries in one scenario and across states in the USA in another. Data has been prepared and structured under these two heads.

Fig. 2.

The instructor has prepared the data according to the mortality rate of each country, using the latest figures available on the day of the study. When the instructor runs the program, a heat map is produced.
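
One way to prepare such data is sketched below. It assumes the mortality rate is defined as deaths as a percentage of confirmed cases, and it groups countries into latitude bands of 30 degrees; both the definition and the bands are assumptions, since the essay does not spell them out. The country codes and figures are invented placeholders.

```python
import pandas as pd

# Placeholder rows for illustration only; these are NOT real figures.
countries = pd.DataFrame({
    "country_code": ["AAA", "BBB", "CCC", "DDD"],
    "latitude": [10.0, 35.0, 45.0, -20.0],
    "confirmed": [1000, 2000, 500, 0],
    "deaths": [10, 100, 50, 0],
})

# Mortality rate in percent; left as NaN where there are no confirmed cases.
valid_confirmed = countries["confirmed"].where(countries["confirmed"] > 0)
countries["mortality_rate"] = countries["deaths"] / valid_confirmed * 100

# Group countries into hypothetical latitude bands.
countries["lat_band"] = pd.cut(
    countries["latitude"],
    bins=[-90, -30, 0, 30, 60, 90],
    labels=["90S-30S", "30S-0", "0-30N", "30N-60N", "60N-90N"],
)

# Average mortality rate per band (NaN rates are ignored by mean).
rate_by_band = countries.groupby("lat_band", observed=True)["mortality_rate"].mean()

print(countries)
print(rate_by_band)
```

A table like `rate_by_band` is the kind of summary that can then be passed to a heat map or choropleth.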

For more on this, do go through the half-hour video attached herewith. The rest of the essay will be featured in subsequent parts of this series of articles.
