online courses Archives - Page 15 of 16 - DexLab Analytics | Big Data Hadoop SAS R Analytics Predictive Modeling & Excel VBA

The First Sci-Fi Movie Written By An Algorithm!

If you thought algorithms were boring formulas with no point, it is time to rethink. Algorithms these days are replacing the need for human intellect altogether (do you smell a hint of irony in there?). As unbelievable as this may sound, it is in fact the truth. The first short film written entirely by an algorithm is finally out, and it is pretty great. The convoluted dialogue of the screenplay demands a few mental gymnastics from the audience, but then again, isn’t that the best thing about sci-fi movies?

 

The story behind this amazing film starts with an aspiring film-maker, Oscar Sharp, appropriately named for the movie-making world (irony again… Get it? Oscar… movies?!). This young screenwriter wanted to submit a script for Sci-Fi London’s 48 Hour Film Challenge. So, to create the next big hit in the world of sci-fi short films, he read all the screenplays of the best sci-fi movies that he could find on the internet. Then inspiration struck: he came up with the idea of feeding all these screenplays into an algorithm, from Ghostbusters and The X-Files to Interstellar, The Fifth Element and many more, and letting the algorithm work up a script on its own. Ingenious, you say, or complete idiocy? It turned out to be a great idea, but only with the right algorithm.

The Most Interesting Scientific Fact About Numbers a Data Scientist can Know

Are you a non-data person? Do you think numbers are unnecessarily complex and best avoided? When asked to do calculations, do you feel a little sleepiness setting in? Then it may come as a surprise that our brains naturally think in numbers. To be more precise, our brains think on a logarithmic scale rather than an additive one. To put it simply, our brains grasp proportions better than differences.

 

So, how do our brains approach differences between numbers? Almost automatically, we perceive the difference between 1 and 2 as greater than the difference between 2 and 3, and so forth.
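
The proportional view can be checked with a few lines of arithmetic. The sketch below is only an illustration in SAS, and the variable names are made up for this example. On the additive scale both gaps equal 1, while on the logarithmic scale the gap from 1 to 2 is log(2/1) and the gap from 2 to 3 is the smaller log(3/2).

```sas
data _null_;
  /* additive scale: both differences equal 1 */
  add_12 = 2 - 1;
  add_23 = 3 - 2;
  /* logarithmic scale: a difference of natural logs is the log of the ratio */
  log_12 = log(2) - log(1);   /* log(2/1), about 0.69 */
  log_23 = log(3) - log(2);   /* log(3/2), about 0.41 */
  put add_12= add_23= log_12= log_23=;   /* writes the four values to the SAS log */
run;
```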

Infographic: List of Our Courses from Dexlab Analytics

View our infographic on the most in-demand courses among data enthusiasts who want to advance their careers with the top trends in the job market. Come to DexLab Analytics and launch your career at light speed with the skills needed to harness the power of data.

 

Our Courses: MS Excel, Data Science, Business Analytics

Are You a Student of Statistics? – You must know these 3 things

We are a premier statistical and data analysis training institute offering courses on Big Data Hadoop, business intelligence and AI. We asked our faculty to tell us the three most important things that every student of elementary statistics should know.

So, let us get on with it:

  1. Statistics is about numbers only in context. Statistics offers a rich set of numeric and graphical ways to display and quantify data, and it is important to be able to produce graphs as well as numbers. But that is not even half of statistics: the most interesting part is the leap from numbers and graphs to real-world interpretation. Statistics also raises a fascinating philosophical tension, encouraging healthy skepticism about what we believe and what we do not.
  2. The analysis is not the most crucial part of a statistical study; the most important part is the when, where and how of gathering the data. We enter numbers, calculate, plot and build our understanding on the results, but at the time of interpretation we often forget that every graph, data point or number is the product of a fallible machine, be it organic or mechanical. Taking proper care at the stage of sampling and observation pays great dividends at the final stage of analysis and interpretation.
  3. Statistics, like all the mathematical sciences, depends on two-way communication between statisticians and non-statisticians. The main aim of statistical analysis is to address important social, public and scientific questions. A good statistician knows how to communicate with the public, especially with those who are by and large not statisticians. The public also plays an important role and needs a basic grasp of statistical conclusions to understand what statisticians have to say. This should be built into the K-12 and college curriculum for students of elementary statistics.

If you agree with our views and would like to discuss statistics and its application to data analysis further, feel free to drop by DexLab Analytics and stay updated on the latest trends in data management and mining.

 

Interested in a career in Data Analyst?

To learn more about Data Analyst with Advanced excel course – Enrol Now.
To learn more about Data Analyst with R Course – Enrol Now.
To learn more about Big Data Course – Enrol Now.

To learn more about Machine Learning Using Python and Spark – Enrol Now.
To learn more about Data Analyst with SAS Course – Enrol Now.
To learn more about Data Analyst with Apache Spark Course – Enrol Now.
To learn more about Data Analyst with Market Risk Analytics and Modelling Course – Enrol Now.

5 Online Sources to Get Basic Hadoop Introduction

Big data Hadoop courses are hitting it big in the world of business whether it is healthcare, manufacturing, media or marketing. Data is generated everywhere, and Hadoop is a readily available open source Apache software program that can be utilized to crunch and store Big Data sets.

According to a report from Transparency Market Research, the forecast shows promising growth, from USD 1.5 billion in 2012 to USD 20.8 billion by 2018. These growth numbers suggest an increased need for people to manage, develop and oversee Hadoop implementations.

#BigDataIngestion: DexLab Analytics Offers Exclusive 10% Discount for Students This Summer

Many experts believe that you can learn any new subject through self-study, as long as you invest enough time and sincere interest in the topic. After all, self-study is what a person does to acquire knowledge about any topic, be it fixing a leaky faucet, learning a new language or learning to strum a guitar. In any case, studying is something you do on your own. But to become an expert in a field, you also need to invest your energy in the right direction, and to know the right direction you need a mentor or guide to lead the way.

But if you want to test the waters, and tinker with Hadoop to understand its basics, you can go through the wide range of documents available at the Apache Hadoop website for your perusal. Also try downloading the Hadoop open source release to get the feel of the program while tinkering with different features.

Here are 5 online sources where you can seek some basic introduction to Hadoop for big data:

  1. IBM’s open-source offering, Hadoop Big Data for the Impatient, is a good option for going through the basics of Hadoop. It also offers a free download of a Hadoop image (you might need Cloudera) to help you work through examples of Hadoop-based problems. You will also get an idea of Hive, Oozie, Pig and Sqoop. The course is available in Vietnamese, Chinese, Spanish and Portuguese.
  2. Cloudera offers the Cloudera Essentials course for Apache Hadoop, with chapter-wise video tutorials. This course is mainly aimed at administrators and at those already well acquainted with data science who want to update their skills on the subject.
  3. YouTube offers a long list of videos on Hadoop topics for beginners. Some are good, while others may not be so helpful for Hadoop newcomers. Simply type Hadoop and you will find a never-ending list of related videos, some of which are quite useful for clearing up simple doubts.
  4. Udemy is another site where you can find some free videos as well as a few paid ones. Simply type Hadoop free in the search bar on their homepage and see what comes up.
  5. Udacity was developed by Silicon Valley giants like Facebook, Cadence, Twitter and the like. They offer a 14-day free trial with free course materials, but you will need to pay if you do not finish the course within 14 days.

 

Seeking good, reliable Hadoop training in Delhi? With DexLab Analytics here, why look further? As a recognized Big Data Hadoop institute in Gurgaon, it offers courses that are truly interesting.

 

Elementary Character Functions in SAS

Broadly speaking, the functions in SAS fall into three groups: Character Functions, Numeric Functions, and Date and Time Functions. In this post we take a brief look at some basic character functions.

 

Character Functions

Consider the following program:

data Len_func;
  input name $;
  cards;
Sandeep
Baljeet
Neeta
.
;
run;
data Len_func;
  set Len_func;
  Len   = length(name);    /* length without trailing blanks */
  Len_N = lengthn(name);   /* as LENGTH, but 0 for a missing value */
  Len_C = lengthc(name);   /* storage length, trailing blanks included */
run;
proc print; run;

Here,

  • The LENGTH function returns the length of a character value, not counting trailing blanks.
  • The LENGTHN function is almost identical to LENGTH. The sole difference is that for a missing character value it returns 0, whereas LENGTH returns 1.
  • The LENGTHC function returns the storage length of a string, trailing blanks included.

 

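Again, let us consider the following lines of code:

data case; input name $; cards;
sandeep
baljeet
neeta
;
run;
/* The assignments need a DATA step of their own, placed after the data lines. */
data case; set case;
  New_U = upcase(name); New_P = propcase(name);
run;
proc print; run;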

 

Here the following functions are introduced:

  • The UPCASE function converts all letters to uppercase.
  • The PROPCASE function capitalizes the first letter of every word and converts the remaining letters to lowercase.
  • As might be guessed from the naming convention, LOWCASE converts all letters to lowercase.

 

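Now consider the following program:

/* Data lines hold one name per line; repeated blanks give COMPBL something to collapse. */
Data AMOUNTS;
  input NAME $20.;
  cards;
RAD-HIKA   SHARMA
RAJARAM    PAND-IT
SURESH
AA-RT-I
;
RUN;
Data AMOUNTS;
  Set AMOUNTS;
  NAME1=COMPRESS(NAME,'-');   /* remove every hyphen */
  NAME2=COMPBL(NAME);         /* reduce runs of blanks to one blank */
RUN;
PROC PRINT;
RUN;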

 

Here we can see the following syntax:

  1. COMPRESS(variable, 'characters to remove')
  2. COMPBL(variable)
  • The COMPRESS function removes blanks by default. It can also remove specified characters, as the code shows. In the example, the hyphen '-' is removed.
  • The COMPBL function, on the other hand, reduces multiple consecutive blanks to a single blank.
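
As a small sketch of the default behaviour, the step below calls COMPRESS with and without a second argument and compares the results with COMPBL. The dataset name compress_demo, the variable names and the sample value are made up for this illustration.

```sas
data compress_demo;
  length NAME $20;
  NAME = 'AA-RT-I   SHARMA';
  NO_BLANKS = compress(NAME);        /* no second argument: all blanks removed */
  NO_HYPHEN = compress(NAME, '-');   /* only hyphens removed, blanks kept */
  ONE_BLANK = compbl(NAME);          /* runs of blanks reduced to a single blank */
run;
proc print data=compress_demo; run;
```

With this value, NO_BLANKS keeps the hyphens but loses every blank, NO_HYPHEN keeps the three blanks between the words, and ONE_BLANK keeps the hyphens with a single blank between the words.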

 

For expert guidance, you would be well advised to enroll in a SAS course from a reputed SAS training institute. You may consider DexLab Analytics if you are in the vicinity of Delhi or Noida.

 

Twelve Great Free R Programming E-books

To Big Data enthusiasts, R is a word, or rather a letter, that needs no introduction. R is a programming language that brings the complex world of statistics and datasets to your fingertips. It is mainly used for statistical computing and graphics. The following twelve e-books will not only bring you up to speed with R programming but, best of all, they are free.

 

  • Learning Statistics with R
    Author: Daniel Navarro

This guide takes you through the intricacies of developing software with R, from the basic types and structures of data to more complex topics such as recursion, closures and anonymous functions. Knowledge of statistics, although helpful, is not an essential prerequisite.

Top 10 Best Hadoop EBooks That You Should Start Reading Now

Based on Java, Hadoop is a free, open-source programming framework for processing huge amounts of data in a distributed computing environment. It is sponsored by none other than the Apache Software Foundation. If you are looking for information about Hadoop, you will want in-depth coverage of the framework and its associated functions. The eBooks listed below will prove invaluable in getting you up to the mark with the concepts.

MapReduce

If you are looking to get started with Hadoop and maximize your knowledge of Hadoop clusters, this book is the right fit. It is loaded with information on how to use the framework and the tools Hadoop provides to scale applications effectively. With step-by-step instructions, this eBook acquaints you with the intricacies of Hadoop and guides you from being a Hadoop newbie to running and tackling complex Hadoop applications efficiently across large clusters of machines.

Programming Pig

If you are looking for a reference on Apache Pig, the open-source engine for executing parallel data flows on Hadoop, Programming Pig is meant for you. It serves new users and also gives advanced users coverage of the most important features, such as the “Pig Latin” scripting language, the “Grunt” shell and user-defined functions for extending Pig even further. After reading this book, analyzing terabytes of data becomes a far less tedious task.

Professional Hadoop Solutions

This book covers a gamut of topics, such as how to store data with HBase and HDFS, process it with MapReduce and automate data processing with Oozie. It also covers Hadoop’s security features, how Hadoop works with Amazon Web Services, related best practices and how to automate Hadoop processes in real time. It provides in-depth code examples in XML and Java and describes recent additions to the Hadoop ecosystem. The eBook positions itself as a comprehensive resource, with API coverage and an exposition of the deeper intricacies that lets developers and architects better customize and leverage them.

Apache Sqoop cookbook

This guide shows how to use Apache Sqoop, with emphasis on the parameters available through the command-line interface for common use cases. The authors provide Oracle, MySQL and PostgreSQL database examples on GitHub that can easily be adapted for other relational systems such as Netezza, SQL Server and Teradata.

Hadoop MapReduce Cookbook

According to its preface, the book teaches readers how to process large and complex datasets. It starts simple but still gives detailed knowledge about Hadoop, and it claims to be a simple, one-stop guide to getting things done. It consists of 90 recipes presented in a simple and straightforward manner, with step-by-step instructions and real-world examples.

Hadoop: The Definitive Guide, 2nd Ed

If you want to know how to build and maintain distributed systems that are both scalable and reliable with Hadoop, this book is for you. It is intended for programmers who want to analyze datasets of any size and for administrators who want to set up and run Hadoop clusters. Topics such as Sqoop, Hive and Avro are new in the second edition. Case studies are also included that may help you with specific problems.

MapReduce Design Pattern

Going by the book’s preface, it is a blend of familiarity and uniqueness. The book is dedicated to design patterns, meaning general guides or templates for solving problems. It is more open-ended than a “cookbook” because the problems are not specified. You have to delve deeper into the subject matter than mere copying and pasting, but a pattern will get you about 90% of the way regardless of the challenge at hand.

Hadoop Operations

This book is necessary for those who maintain large, complex Hadoop clusters. It covers MapReduce, HDFS, Hadoop cluster planning, installation and configuration, identity, authentication and authorization, cluster maintenance and resource management.

Programming Hive

Programming Hive introduces Hive, which provides an SQL dialect for querying data stored in HDFS and is therefore an indispensable tool in the hands of Hadoop experts. Hive also integrates with other file systems and data stores associated with Hadoop, such as MapR-FS and Amazon S3, as well as Cassandra and HBase.

Hadoop Real World Solutions CookBook

The preface of this eBook explains its purpose: to help developers become proficient at solving problems in the Hadoop space. The reader will also get acquainted with a variety of Hadoop-related tools and the best practices for implementing them. The tools covered in this cookbook include Pig, Hive, MapReduce, Giraph, Mahout, Accumulo, HDFS, Ganglia and Redis. The book intends to teach readers what they need to know to apply Hadoop to their own problems.

 

So, happy reading!

 

Besides gaining knowledge from eBooks, it is vital to enroll in an excellent Big Data Hadoop certification course in Gurgaon. DexLab Analytics is here for you; it offers a gamut of high-end Big Data Hadoop training courses in Delhi that will surely hone your data skills.

 

Data Preparation using SAS

Before any data analysis can begin, there is a task that is critical to the success of the project: data preparation. You may have heard that data production has been expanding at an astonishing pace in recent years. Experts now point to a 4300% increase in annual data generation by 2020. This may be due to the switch from analog to digital technologies and the rapid increase in data generation by individuals and corporations alike. Most of the data generated in the last few years is unstructured.

In this context, it is important to turn your unstructured data into a structured dataset before attempting any meaningful analysis.

“Data preparation means manipulation of data into a form suitable for further analysis and processing”

“Data preparation techniques consist of Cleaning, Integration, Selection and Transformation”

We will discuss some of the data preparation techniques using SAS. An INFORMAT is used to read data that contains special characters. A FORMAT is used to display data with special characters.

 

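/* Assumes the libref DP has already been assigned with a LIBNAME statement. */
Data DP.Practice;
 length City $10;
 /* Informats are declared before INPUT so that list input uses them to read each value. */
 informat Salary dollar6. DOJ ddmmyy10. Profit dollar7.2;
 format Salary dollar6. DOJ ddmmyy10. Profit dollar7.2;
 input City $ ID $ Age Salary DOJ Profit;
 label DOJ = "Date of Joining";
 rename Salary = Salary_of_Employee;
 /* A period marks a missing Age so the remaining fields stay aligned. */
 datalines;
Bangalore T101 24 $2,000 12/12/2010 $300.50
Pune T102 29 $3,000 11/10/2006 $400.50
Hyderabad T103 . $5,000 12/10/2008 $500.70
Delhi T104 . $6,000 12/12/2009 $450.00
Pune T105 . $7,000 12/12/2009 $450.00
;
 run;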

 

In the above SAS code, we have used both INFORMAT and FORMAT to read and display data with special characters. The INFORMAT statement reads the salary as a numeric variable from a specific layout, e.g. $5,000, which is 6 characters wide including the $. The FORMAT statement displays the value in the same form as the input data. The RENAME and LABEL statements modify the variables’ metadata to make the dataset easier to understand.
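
The same distinction can be sketched on a single value with the INPUT and PUT functions, which apply an informat and a format respectively. This is a separate illustration and not part of the program above; the variable names are hypothetical.

```sas
data _null_;
  raw_text = '$5,000';
  salary   = input(raw_text, dollar6.);   /* informat: the text '$5,000' is read as the number 5000 */
  shown    = put(salary, dollar6.);       /* format: the number 5000 is written back as '$5,000' */
  put salary= shown=;                     /* writes both values to the SAS log */
run;
```

The stored value is a plain number in both cases; the informat only governs how the text is read and the format only governs how the number is shown.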

We will now apply a transformation to a dataset so that more advanced analytical techniques can be used on the data. The dataset holds various attributes of customers who have or have not subscribed to an edition. It contains a categorical variable, status, whose value is either “Subscribed” or “Not Subscribed”. We can transform this categorical variable into a dichotomous variable in order to run a logistic regression on the dataset.

 

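Data media01;
 set DP.media;
 /* status inherits its length from DP.media, so no LENGTH statement is needed here. */
 /* The stored values are "Subscribed" and "Not Subscribed"; the comparison must match that case. */
 If status = "Subscribed" then status = "0";
 else status = "1";
 run;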

 

In the above SAS code, we have applied a simple IF-ELSE statement to transform the dataset called media. Transforming a categorical variable into a dichotomous variable lets us apply the analytical techniques we want to run on the dataset. Once the transformation is done, the dataset is ready for the next stage, i.e. data analysis.
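
If the modelling procedure expects a numeric 0/1 variable rather than the characters "0" and "1", one possible variation is to derive a separate numeric indicator and leave status untouched. The sketch below assumes the same DP.media dataset; the output dataset media02 and the variable not_subscribed are hypothetical names. It keeps the coding used above, with 0 for subscribers and 1 for everyone else.

```sas
data media02;
  set DP.media;
  /* A comparison in SAS evaluates to 1 when true and 0 when false. */
  /* UPCASE and STRIP make the test insensitive to case and to leading or trailing blanks. */
  not_subscribed = (upcase(strip(status)) ne 'SUBSCRIBED');
run;
```

Keeping the original status column alongside the indicator makes it easy to cross-check the recoding, for example with a frequency table of the two variables.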

The more you torture your data, i.e. the more thoroughly you prepare it, the greater the success of the data analysis.

 

DexLab Analytics offers state-of-the-art SAS training courses. It is a premier SAS training institute that caters to the needs of its students round the clock.

 
