Modelling Archives - DexLab Analytics | Big Data Hadoop SAS R Analytics Predictive Modeling & Excel VBA

The Worst Techniques To Build A Predictive Model

Some of these techniques may be a little out of date, and most have evolved greatly over the past 10 years, making today's tools quite different and much more efficient to use. But here are a few bad techniques in predictive modelling that are still widely used in the industry:




1. Using traditional decision trees: very large decision trees are complex to handle and almost impossible to analyze, even for the most knowledgeable data scientist. They are also prone to over-fitting, which is why they are best avoided. Instead, we recommend combining multiple small decision trees rather than using a single large one, to avoid unnecessary complexity.
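The idea of combining several small trees can be illustrated with a minimal sketch. The post does not name a specific method, so this assumes bagging: each depth-1 tree (a "stump") is trained on a bootstrap sample and the ensemble predicts by majority vote. The function names and the toy data are hypothetical and only serve the illustration.

```python
import random
from collections import Counter

def fit_stump(rows, labels):
    """Fit a depth-1 tree: choose the (feature, threshold) split with the fewest errors."""
    fallback = Counter(labels).most_common(1)[0][0]
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            # Each side predicts its majority label (overall majority if the side is empty).
            l_pred = Counter(left).most_common(1)[0][0] if left else fallback
            r_pred = Counter(right).most_common(1)[0][0] if right else fallback
            errors = sum(y != l_pred for y in left) + sum(y != r_pred for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_pred, r_pred)
    _, f, t, l_pred, r_pred = best
    return lambda row: l_pred if row[f] <= t else r_pred

def fit_ensemble(rows, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample of the data."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        stumps.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return stumps

def predict(stumps, row):
    """Majority vote over all stumps."""
    return Counter(s(row) for s in stumps).most_common(1)[0][0]

# Hypothetical toy data: two numeric features, binary label.
rows = [(1.0, 5.0), (2.0, 4.0), (3.0, 7.0), (6.0, 5.0), (7.0, 6.0), (8.0, 4.0)]
labels = [0, 0, 0, 1, 1, 1]
stumps = fit_ensemble(rows, labels)
```

Each stump stays easy to inspect on its own, while the vote across many of them smooths out the quirks of any single bootstrap sample.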


Historians Make Use of Predictive Modeling


Predictive modeling tops the list of new techniques that researchers use to locate key archeological sites. The methodology is not that complex: it predicts the location of archeological sites based on the qualities common to the sites already known. And the best news is that it works like a charm. A group of archeologists at Logan Simpson, a company that operates out of Utah, discovered no fewer than 19 individual archeological sites containing many biface blades and stone points, in addition to other artifacts belonging to the Paleoarchaic Period, which ranges from 7,000 to 12,000 years ago.


The sites are located about 160 km (100 miles) from Las Vegas, Nevada. The researchers also came across lakes and streams that disappeared long ago. According to the archeologists, the sites were perhaps used by a number of groups of hunters and gatherers in ancient times. The sites are scarce and widely scattered, and could improve our understanding of the human activity that took place throughout the Great Basin as a warmer climate prevailed after the end of the Ice Age. Their remoteness ensured that they remained unfound when traditional methods were employed.


In Nevada’s Dry Lake Valley, Delamar Valley and Kane Springs, archeologists have discovered sites like Clovis, Lake Mojave and Silver Lake that contain stone tools made in styles prevalent as far back as 12,000 years ago. The project was funded by the Lincoln County Archeological Initiative from the Bureau of Land Management. It used GIS (geographic information system) technology to make predictions about activity in the Pleistocene-Holocene period.



The predictive modeling took into account the fact that the Great Basin was far wetter and cooler at the end of the Pleistocene than it is today, and in all probability had attracted hunters and gatherers for several centuries. Mapping with GIS and aerial pictures, among other sources, was followed by pinpointing and ranking the locations that held the most promise.
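The article does not describe the model the team actually used, so the following is only an illustrative sketch of the general idea: summarise the features that known sites have in common, then rank candidate locations by how closely they match that profile. The feature choices (distance to an ancient shoreline, slope), the values and the cell names are all hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical features per location: (distance to ancient shoreline in km, slope in degrees).
known_sites = [(0.4, 2.0), (0.8, 3.5), (0.6, 1.5), (1.0, 3.0)]
candidates = {"cell_A": (0.5, 2.5), "cell_B": (6.0, 12.0), "cell_C": (1.5, 4.0)}

def build_profile(sites):
    """Summarise known sites as a per-feature (mean, spread) pair."""
    columns = list(zip(*sites))
    # Guard against a zero spread so the score never divides by zero.
    return [(mean(c), pstdev(c) or 1.0) for c in columns]

def score(location, profile):
    """Lower is better: squared z-score distance from the known-site profile."""
    return sum(((x - m) / s) ** 2 for x, (m, s) in zip(location, profile))

def rank(cells, profile):
    """Return candidate names ordered from most to least promising."""
    return sorted(cells, key=lambda name: score(cells[name], profile))

profile = build_profile(known_sites)
ranking = rank(candidates, profile)
```

In a real GIS workflow the features would come from map layers rather than hand-typed tuples, and the scoring rule would be validated against sites held back from the profile.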


Apart from the Paleoarchaic era, artifacts belonging to more recent periods in history were also found, which bears out that the lakeside sites had been used over the course of several millennia.

But the most important discovery was the proof that predictive modeling on the basis of GIS works well and should be included in the arsenal of tools of archeologists trying to discover prehistoric sites.


Make predictive analytics your best friend for life and career with easy and comprehensive SAS training courses in Delhi by DexLab Analytics. For more information about this premier SAS training institute, log into their website.


Interested in a career as a Data Analyst?

To learn more about Data Analyst with Advanced Excel course – Enrol Now.
To learn more about Data Analyst with R Course – Enrol Now.
To learn more about Big Data Course – Enrol Now.

To learn more about Machine Learning Using Python and Spark – Enrol Now.
To learn more about Data Analyst with SAS Course – Enrol Now.
To learn more about Data Analyst with Apache Spark Course – Enrol Now.
To learn more about Data Analyst with Market Risk Analytics and Modelling Course – Enrol Now.

Things To Be Aware Of Regarding Hadoop Clusters

Hadoop is increasingly used by companies of diverse scope and size, and they are realizing that running Hadoop optimally is a tough call. As a matter of fact, it is not humanly possible to respond in real time to changing conditions, which may occur across several nodes, in order to fix dips in performance or bottlenecks. This performance degradation is exactly what needs to be remedied where Hadoop is deployed at large scale and is expected to deliver business-critical results on time. The following three signs signal the health of your Hadoop cluster.




  • The Out of Capacity Problem

The true test of your Hadoop infrastructure is whether you can run all of your jobs efficiently and complete them in adequate time. It is not rare to come across cases where you have seemingly run out of capacity, as you are unable to run additional applications, yet monitoring tools indicate that you are not making full use of processing capability or other resources. The primary challenge is then to find the root cause of the problem. Most often it turns out to be related to the YARN architecture that Hadoop uses. YARN is static in nature: once jobs have been scheduled, system and network resources are not adjusted. The solution lies in configuring YARN to deal with worst case scenarios.


Role of R In Business Intelligence

To put it simply, Business Intelligence is the act of extracting and deriving useful information from the available data. As might be evident, the process is a broad one, in which the quality and the source of the data vary. In technical terms, such transformations may be described as ETL (extract, transform and load), followed by the presentation of useful information.
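As a rough illustration of the extract, transform, load and present steps, here is a minimal sketch. It is written in Python with only the standard library (`csv`, `sqlite3`) rather than R, and the table name, column names and sample data are assumptions made up for the example.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; in practice this would be a file or a source database.
RAW_CSV = """region,units,unit_price
north,10,2.50
south,4,3.00
north,6,2.50
"""

def extract(text):
    """Extract: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: cast types and derive a revenue value per row."""
    return [(r["region"], int(r["units"]) * float(r["unit_price"])) for r in rows]

def load(records, conn):
    """Load: write the transformed records into a database table."""
    conn.execute("CREATE TABLE sales (region TEXT, revenue REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", records)
    conn.commit()

def present(conn):
    """Present: aggregate revenue per region, ready for a report or dashboard."""
    query = "SELECT region, SUM(revenue) FROM sales GROUP BY region ORDER BY region"
    return dict(conn.execute(query).fetchall())

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
summary = present(conn)
```

Keeping each stage in its own function mirrors the way the stages are discussed separately: any one of them can be swapped for a different tool without touching the others.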



R Programming in Business Intelligence

Some R programming experts hold that R is fully able to take on the role of engine for BI processes. Here we focus only on the BI function of R, i.e. to extract, transform, load and present information and data. The following packages correspond to the indicated processes in Business Intelligence.






Extraction:

  • RODBC
  • DBI
  • data.table’s fread

In addition to these, there are several other packages that support data in a variety of formats.

Transformation:

  • data.table
  • dplyr

Loading:

  • DBI






Presenting data is a wholly different ball game from the ETL process described earlier. Never fear: it may easily be outsourced to BI dashboard tools by populating the data structure that the particular tool expects. R can also create a web app dashboard directly from within itself through packages like:


  • shiny
  • httpuv
  • opencpu
  • Rook


These packages let you host interactive web apps. They can query the data interactively and generate interactive plots. All of them are built on an R session engine that can execute any R function and leverage the statistical capabilities of all R packages.






The above-mentioned packages serve as the core, whose use may be simplified through the packages listed below:


  • db.r
  • ETLUtils
  • sqldf
  • dplyr
  • shinyBI
  • dwtools



The following factors are critical when businesses adopt R:


  • Extraction / Loading
  • Performance and scalability
  • Presentation
  • Support and licensing


For more details on R Programming, get yourself enrolled in superior R programming courses in Pune. R programming certification in Pune by DexLab Analytics is extremely popular.



Will Spark Replace Hadoop?


I hope this post will help you answer some of the questions about Apache Spark in Big Data Analytics that may have been on your mind lately.


How Data Scientists Take Their Coffee Every Morning


To a data scientist we are all sources of data, from the very moment we wake up in the morning, visit our local Starbucks (or any other local café) to get our morning coffee, and swipe the screens of our tablets or smartphones to go through the big headlines for the day. With these few apparently simple daily routines we are actually giving data scientists more data, which in turn allows them to offer tailor-made news articles about things that interest us, and to have our favorite coffee blend ready for us to pick up every morning at the café.

The world of data science came to exist due to the growing need to draw valuable information from the data being collected every day around the world. But what is data science? Why is it necessary? A certified data scientist can best be described as one of a breed of experts who have in-depth knowledge of statistics, mathematics and computer science and use these skills to gather valuable insights from data. They often need innovative new solutions to address various data problems.


As per estimates from various job portals, around 3 million job positions will need to be filled by 2018 with individuals who have in-depth knowledge and expertise in data analytics and can handle big data. Those who have already boarded the data analytics train are finding exciting new career prospects in this field, with fast-paced growth opportunities. So more and more individuals are looking to enhance their employability by acquiring a data science certification from a reputable institution. Age-old programs are fast being replaced by newcomers in the field of data mining, with software like R, SAS etc. Although SAS has been around in the world of data science for almost 40 years now, it took time for it to really make a big splash in the industry. However, it is slowly emerging as one of the most in-demand programming languages these days.

What does a data science certification cover?


This course covers the topics that enable students to apply advanced analytics to big data. After completing the course, a student usually has an understanding of model deployment, machine learning, automation and analytical modeling. Moreover, a well-equipped course in data science helps students fine-tune their communication skills as well.


Things a data scientist must know:

All data scientists must have good skills in topics like linear algebra, multivariable calculus and Python. Those with strong backgrounds in linear algebra and multivariable calculus will find it easier to understand the probability, machine learning and statistics that the job requires.

More and more data-hungry professionals are seeking excellent Data Science training in Delhi. If you are one of them, kindly drop by DexLab Analytics: we are a pioneering Data Science training institute. Peruse our course details for a better future.

