sas training delhi Archives - DexLab Analytics | Big Data Hadoop SAS R Analytics Predictive Modeling & Excel VBA

How to Create a Self-similar Christmas Tree Using SAS

The season of celebration is here! Christmas is just around the corner. And here is a beautiful self-similar Christmas tree that was created using SAS with the help of two fascinating features – matrix computations and ODS statistical graphics.

Experiencing self-similarity in Kronecker products

You might have come across blog posts showing how repeated Kronecker products of a binary matrix produce self-similar structures. This time, we take special care to introduce self-similarity in Kronecker products: if M is a binary matrix, then the Kronecker product M@M is a block matrix in which each 1 in the original matrix is replaced by a copy of M and each 0 is replaced by a zero block. (In this blog, @ stands for the Kronecker product, or direct product, operator implemented in SAS/IML software.) Continue reading “How to Create a Self-similar Christmas Tree Using SAS”

Wake Up to a World of Data Possibilities: With SAS Certification

Despite the recent surge of cutting-edge technology tools, SAS remains one of the most popular, in-demand programming languages for advanced analytics. It has been around for more than two decades, yet it has not lost its importance in the data science market. This shows how flexible and adaptable this pioneering analytics tool is: it has stood the test of time and development.

 
Possess the Right SAS Skills, Be In Demand

Organizations are making full use of advanced analytics. They are realizing that big data analytics has not only secured a niche for itself but has also become an indispensable part of any organization that is on its way to success.

Continue reading “Wake Up to a World of Data Possibilities: With SAS Certification”

INTCK and INTNX: All about SAS Dates and Computing Intervals between Dates

The INTCK and INTNX functions in SAS help you compute the time between events. This technical blog is based on the timeline of living US presidents, sourced from a Wikipedia table. The table data shows the number of years and days between events.

So, let’s start.

Gaps between dates

To calculate the interval between two dates, you can use these two SAS functions:

The INTCK function returns the number of time units between two dates. The time unit can be years, months, weeks, days, or another supported interval.

The INTNX function returns a SAS date that is a particular number of time units away from a particular date. For example, you can use it to compute the date that is 308 days in the future from a specific date.

These two functions are complementary: one calculates the difference between two dates, and the other enables you to add time units to a specified date value. The INT part of both names denotes INTervals, and the names INTCK and INTNX stand for Interval Check and Interval Next, respectively.
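As a rough illustration of the two functions working together, here is a minimal sketch. The variable names are made up for this example, and the dates are the first two events from the presidential timeline used later in this post.

```sas
/* Minimal sketch: count intervals with INTCK, shift a date with INTNX. */
/* Variable names are illustrative only.                                */
data _null_;
   startDate = '30APR1789'd;                       /* Washington inauguration */
   endDate   = '04MAR1797'd;                       /* J Adams inauguration    */
   nDays   = intck('day',   startDate, endDate);   /* days between the two dates      */
   nMonths = intck('month', startDate, endDate);   /* month boundaries crossed        */
   later   = intnx('day', startDate, 308);         /* the date 308 days after start   */
   put nDays= nMonths= later= date9.;              /* write the results to the log    */
run;
```

Note that, by default, INTCK counts interval boundaries (for example, the number of times the first day of a month is crossed), not completed intervals.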

DexLab Analytics offers intensive SAS certification courses for candidates.

How to calculate anniversary dates

These two functions are useful for counting the number of anniversaries between two dates and for calculating a future anniversary date. Use the ‘CONTINUOUS’ option for the INTCK function and the ‘SAME’ option for the INTNX function in the following manner:

The ‘CONTINUOUS’ option in the INTCK function helps you count the number of anniversaries of one date that occur before a second date. For example, the statement

Years = intck('year', '30APR1789'd, '04MAR1797'd, 'continuous');

returns the value 7 because there are 7 full years (anniversaries of 30APR) between those two dates. Without the ‘CONTINUOUS’ option, the function returns 8 as 01JAN occurs 8 times between those dates.

The statement

Anniv = intnx('year', '30APR1789'd, 7, 'same');

returns the 7th anniversary of the date 30APR1789. In other words, it returns the date value for 30APR1796.

The most exciting part about these two functions is that they automatically handle leap years! Yes, you read that right. If you ask for the number of days between two dates, the INTCK function includes leap days in the result. If an event takes place on a leap day and you ask the INTNX function for the next anniversary date, it reports 28FEB of the next year.
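The leap-year behaviour can be checked with a short sketch like the one below. The leap-day date is a hypothetical event chosen only for illustration.

```sas
/* Minimal sketch of leap-year handling; the dates are hypothetical. */
data _null_;
   leapDay   = '29FEB2016'd;                               /* event on a leap day           */
   nextAnniv = intnx('year', leapDay, 1, 'same');          /* no 29FEB2017, so 28FEB2017    */
   nDays     = intck('day', '01FEB2016'd, '01MAR2016'd);   /* 29, because 29FEB2016 counts  */
   put nextAnniv= date9. nDays=;
run;
```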

An algorithm calculating years and days between events

Use the following algorithm to calculate the number of years and days between dates in SAS:

  • Use the INTCK function with the ‘CONTINUOUS’ option to calculate the number of completed years between two dates.
  • Use the INTNX function to find a third date, the anniversary date, which has the same month and day as the start date but falls less than a year before the end date.
  • Use the INTCK function to count the number of days between the anniversary date and the end date.

The following DATA step computes the time interval in years and days between the first few US presidential inaugurations and deaths.

data YearDays;
format Date prevDate anniv Date9.;
input @1  Date anydtdte12.
      @13 Event $26.;
prevDate = lag(Date);
if _N_=1 then do;                               /* when _N_=1, lag(Date)=. */
   Years=.; Days=.; return;            /* set years & days, go to next obs */
end;
Years = intck('year', prevDate, Date, 'continuous'); /* num complete years */
Anniv = intnx('year', prevDate, Years, 'same');      /* most recent anniv  */
Days = intck('day', anniv, Date);                    /* days since anniv   */
datalines;
Apr 30, 1789 Washington Inaug
Mar 4, 1797  J Adams Inaug
Dec 14, 1799 Washington Death
Mar 4, 1801  Jefferson Inaug
Mar 4, 1809  Madison Inaug
Mar 4, 1817  Monroe Inaug
Mar 4, 1825  JQ Adams Inaug
Jul 4, 1826  Jefferson Death
Jul 4, 1826  J Adams Death
run;
 
proc print data=YearDays;
var Event prevDate Date Anniv Years Days;
run;

 

 

In a nutshell, the INTCK and INTNX functions are essential for calculating intervals between dates. This blog discussed two lesser-known options for these functions, ‘CONTINUOUS’ and ‘SAME’. For more SAS training blogs, follow us at DexLab Analytics.

This post originally appeared on blogs.sas.com/content/iml/2017/05/15/intck-intnx-intervals-sas.html
 

Interested in a career as a Data Analyst?

To learn more about Machine Learning Using Python and Spark – click here.
To learn more about Data Analyst with Advanced Excel course – click here.
To learn more about Data Analyst with SAS Course – click here.
To learn more about Data Analyst with R Course – click here.
To learn more about Big Data Course – click here.

The ABC of Summary Statistics and T Tests in SAS

Getting introduced to statistics as part of your SAS training? Then you should know how to use summary statistics (such as sample size, mean, and standard deviation) to test hypotheses and to compute confidence intervals. In this blog, we will show you how to supply summary statistics (instead of raw data) to PROC TTEST in SAS, how to create a data set that contains summary statistics, and how to run PROC TTEST to calculate a two-sample or one-sample t test for the mean.

So, let’s start!

Running a two-sample t test for difference of means from summarized statistics

We will start by comparing the mean heights of 19 students, grouped by gender. The data are in the Sashelp.Class data set.

The following SAS statements sort the data by the grouping variable, call PROC MEANS, and print a subset of the statistics:

proc sort data=sashelp.class out=class; 
   by sex;                                /* sort by group variable */
run;
proc means data=class noprint;           /* compute summary statistics by group */
   by sex;                               /* group variable */
   var height;                           /* analysis variable */
   output out=SummaryStats;              /* write statistics to data set */
run;
proc print data=SummaryStats label noobs; 
   where _STAT_ in ("N", "MEAN", "STD");
   var Sex _STAT_ Height;
run;

The table shows the structure of the SummaryStats data set for two-sample tests. The two samples are distinguished by the levels of the Sex variable (‘F’ for females and ‘M’ for males). The _STAT_ column shows the name of each statistic. The Height column shows the value of each statistic for each group.

Get SAS certification Delhi from DexLab Analytics today!

The problem: The heights of sixth-grade students are normally distributed. Random samples of n1=9 females and n2=10 males are selected. The mean height of the female sample is m1=60.5889 with a standard deviation of s1=5.0183. The mean height of the male sample is m2=63.9100 with a standard deviation of s2=4.9379. Is there evidence that the mean height of sixth-grade students depends on gender?

You do not have to do anything special for PROC TTEST: when the procedure sees the special variable _STAT_ and its expected values (such as N, MEAN, and STD), it understands that the data set contains summarized statistics. The following call compares the mean heights of males and females:

proc ttest data=SummaryStats order=data
           alpha=0.05 test=diff sides=2; /* two-sided test of diff between group means */
   class sex;
   var height;
run;

Notice that the output includes 95% confidence intervals for the group means, as well as confidence intervals for the standard deviations.

In the second table, the ‘Pooled’ row assumes that the variances of the two groups are equal, which appears to be approximately true for these data. The value of the t statistic is t = -1.45 with a two-sided p-value of 0.1645.
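To see where the pooled t statistic comes from, the textbook pooled-variance formula can be applied to the summary statistics quoted in the problem statement. This is only a hand-calculation sketch with illustrative variable names, not what PROC TTEST does internally; the results should come out close to the t and p-value reported above.

```sas
/* Hand calculation of the pooled two-sample t test (illustrative sketch). */
data _null_;
   n1 = 9;  m1 = 60.5889; s1 = 5.0183;                /* female sample */
   n2 = 10; m2 = 63.9100; s2 = 4.9379;                /* male sample   */
   df = n1 + n2 - 2;                                  /* pooled degrees of freedom   */
   sp = sqrt(((n1-1)*s1**2 + (n2-1)*s2**2) / df);     /* pooled standard deviation   */
   t  = (m1 - m2) / (sp * sqrt(1/n1 + 1/n2));         /* t statistic for m1 - m2     */
   p  = 2 * (1 - probt(abs(t), df));                  /* two-sided p-value           */
   put df= t= 6.2 p= 6.4;
run;
```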

The syntax of the PROC TTEST statement allows you to change the type of hypothesis test and the significance level. For example, you can run a one-sided test for the alternative hypothesis μ1 < μ2 at the 0.10 significance level by using:

proc ttest ... alpha=0.10 test=diff sides=L;  /* Left-tailed test */

Running a one-sample t test of the mean from summarized statistics

In the above section, you learned how to create the summary statistics with PROC MEANS. However, you can also create the summary statistics manually if you do not have the original data.

The problem: A research study measured the pulse rates of 57 college men and found a mean pulse rate of 70.4211 beats per minute with a standard deviation of 9.9480 beats per minute. Researchers want to know if the mean pulse rate for all college men is different from the current standard of 72 beats per minute.

The following statements write the summary statistics to a data set and ask PROC TTEST to perform a one-sample test of the null hypothesis μ = 72 against a two-sided alternative hypothesis:

data SummaryStats;
  infile datalines dsd truncover;
  input _STAT_:$8. X;
datalines;
N, 57
MEAN, 70.4211
STD, 9.9480
;
 
proc ttest data=SummaryStats alpha=0.05 H0=72 sides=2; /* H0: mu=72 vs two-sided alternative */
   var X;
run;

The output includes a 95% confidence interval for the mean, which contains the value 72. The value of the t statistic is t = -1.20, which corresponds to a p-value of 0.2359. Therefore, the data fail to reject the null hypothesis at the 0.05 significance level.
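The one-sample result can be cross-checked by hand from the three summary statistics. The sketch below uses the standard one-sample t formulas and illustrative variable names; the values it writes to the log should be close to the t statistic and p-value reported above, and the interval should contain 72.

```sas
/* Hand calculation of the one-sample t test (illustrative sketch). */
data _null_;
   n = 57; xbar = 70.4211; s = 9.9480; mu0 = 72;
   df    = n - 1;                                /* degrees of freedom              */
   se    = s / sqrt(n);                          /* standard error of the mean      */
   t     = (xbar - mu0) / se;                    /* t statistic for H0: mu = 72     */
   p     = 2 * (1 - probt(abs(t), df));          /* two-sided p-value               */
   tcrit = tinv(0.975, df);                      /* t quantile for a 95% interval   */
   lower = xbar - tcrit*se;                      /* lower confidence limit          */
   upper = xbar + tcrit*se;                      /* upper confidence limit          */
   put t= 6.2 p= 6.4 lower= 8.4 upper= 8.4;
run;
```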

For more informative blogs and news about SAS courses, visit DexLab Analytics, our premier SAS predictive modeling training institute.

 
This post originally appeared on blogs.sas.com/content/iml/2017/07/03/summary-statistics-t-tests-sas.html
 


New and Improved Data Pane in SAS Visual Analytics Now Goes Painless

 

Some good news is waiting for you: preparing your data for effective reports is now easier with the 8.1 release of SAS Visual Analytics. In this technical blog, we will look at the structure of the Data pane, how it displays data from the active data source, and a handful of tasks that you might want to perform, such as viewing measure details, adjusting data item properties, and creating geographic data items, custom categories, and hierarchies.

Continue reading “New and Improved Data Pane in SAS Visual Analytics Now Goes Painless”

Celebrate Christmas in Data Analyst Style With SAS!

Christmas is at the end of this week, so we at team DexLab decided to help our dear readers who love some data wizardry with a little SAS magic! You can show off your extra SAS knowledge to your peers with the SAS program described below.

 


We are taking things a tad backwards by complicating, almost idiosyncratically, things that are otherwise simple. After all, some say that is the job of a data analyst! However stupid or unnecessary it may be, this is by far the coolest way to wish someone a Merry Christmas in data-analyst style.

Continue reading “Celebrate Christmas in Data Analyst Style With SAS!”

The Right Tool For Statistical Analysis SAS Vs. Stata

Both SPSS and SAS have been around in the world of statistical analysis for several years now, so the question of which is the better software for statistical analysis is an age-old one among data people.

 


 

To begin with, SAS is in its version 9+ and has greatly enhanced its visual appeal. But SPSS still comes with its popular “click and get results” interface. SPSS has also moved beyond version 15.0 and has begun adding different modules, like its competitor SAS. Continue reading “The Right Tool For Statistical Analysis SAS Vs. Stata”

Data Preparation using SAS

Before any data analysis, some tasks are critical to the success of the project. These tasks are collectively known as data preparation. You may have heard that data production has been expanding at an astonishing pace in recent years. Experts now point to a 4300% increase in annual data generation by 2020. This can be attributed to the switch from analog to digital technologies and the rapid increase in data generation by individuals and corporations alike. Most of the data generated in the last few years are unstructured.

In this context, it is important to turn your unstructured data into a structured dataset before attempting any meaningful analysis.

“Data preparation means manipulation of data into a form suitable for further analysis and processing”

“Data Preparation techniques consists of Cleaning, Integration, Selection and Transformation”

We will discuss some data preparation techniques using SAS. An INFORMAT is used to read data that contain special characters. A FORMAT is used to display data with special characters.

 

Data DP.Practice;                     /* assumes the libref DP is already assigned */
 length City $10;
 /* informats and formats are declared ahead of the INPUT statement */
 informat Salary dollar6. DOJ ddmmyy10. Profit dollar7.2;
 format Salary dollar6. DOJ ddmmyy10. Profit dollar7.2;
 input City $ ID $ Age Salary DOJ Profit;
 label DOJ = "Date of Joining";
 rename Salary = Salary_of_Employee;
 /* a period marks a missing Age so the remaining fields stay aligned */
 datalines;
 Bangalore T101 24 $2,000 12/12/2010 $300.50
 Pune T102 29 $3,000 11/10/2006 $400.50
 Hyderabad T103 . $5,000 12/10/2008 $500.70
 Delhi T104 . $6,000 12/12/2009 $450.00
 Pune T105 . $7,000 12/12/2009 $450.00
 ;
 run;

 

In the above SAS code, we used both INFORMAT and FORMAT to read and display data with special characters. The INFORMAT statement reads the salary as a numeric variable from text in a specific form, e.g. $5,000, which is 6 characters including the $. The FORMAT statement displays the value the same way it appears in the input data. The RENAME and LABEL statements modify the variables' metadata to make the dataset easier to understand.
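The difference between reading with an informat and displaying with a format can also be seen in isolation. The following minimal sketch uses the INPUT function with the same informats as above; the literal values are taken from the sample data, and the variable names are illustrative.

```sas
/* Minimal sketch: an informat converts text to a number, a format displays it. */
data _null_;
   salary = input('$5,000', dollar6.);         /* stored as the number 5000        */
   doj    = input('12/10/2008', ddmmyy10.);    /* day/month/year: 12 October 2008  */
   put salary= salary= dollar6. doj= date9.;   /* unformatted, then formatted      */
run;
```

The log should show the salary both as a plain number and with the dollar sign and comma restored, which makes it clear that the stored value itself is numeric.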

We will now apply some transformation techniques to a dataset so that we can use more advanced analytical techniques on the data. We have a dataset that contains various attributes of customers who have or have not subscribed to an edition. In our dataset, the categorical variable status holds either “Subscribed” or “Not Subscribed”. We can transform this categorical variable into a dichotomous variable in order to run a logistic regression on our dataset.

 

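Data media01;
 set DP.media;                        /* status keeps the length it has in DP.media */
 /* straight quotes are required; the comparison ignores case and extra blanks */
 if upcase(strip(status)) = "SUBSCRIBED" then status = "0";
 else status = "1";
 run;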

 

In the above SAS code, we applied simple IF-ELSE statements to transform our dataset called media. Transforming a categorical variable into a dichotomous variable lets us apply the analytical techniques that we want to run on our dataset. Once the transformation is done, the dataset is ready for the next stage, i.e. data analysis.

The more you torture your data, i.e. the more effort you put into data preparation, the more successful the outcome of the data analysis.
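An alternative worth considering is to leave status untouched and create a separate numeric 0/1 indicator, since a numeric response is often more convenient for modelling. The sketch below is self-contained and uses a small made-up data set (media_demo, custID) as a stand-in for DP.media. Note that it codes subscribed customers as 1, the reverse of the character coding above, so pick whichever coding matches the event you want to model.

```sas
/* Illustrative sketch: media_demo and custID are hypothetical stand-ins. */
data media_demo;
   infile datalines dsd truncover;            /* comma-delimited input lines */
   input custID status :$15.;
   /* a logical expression evaluates to 1 (true) or 0 (false) */
   subscribed = (upcase(strip(status)) = "SUBSCRIBED");
datalines;
1,Subscribed
2,Not Subscribed
3,subscribed
;

/* cross-tabulate the original and derived variables to verify the recoding */
proc freq data=media_demo;
   tables status*subscribed / norow nocol nopercent;
run;
```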

 

DexLab Analytics offers state-of-the-art SAS training courses. It is a premier SAS training institute that caters to the needs of its students round the clock.

 
