
Oil Price Crash – What Big Data Has To Do With It

It was presumed that the collapse in oil prices would give the U.S. economy a boost and encourage consumer spending. The economic data, however, appeared to show that consumers were more interested in saving the money instead. But Big Data from the JP Morgan Chase Institute suggests otherwise: it provides considerable evidence that consumers did in fact spend most of what they saved. The result arouses curiosity and demonstrates how changes in the availability of data and computing power can affect the kind of energy research that is possible.


The Questions That Arose

Until recently, conventional wisdom held that consumers were refraining from spending what they saved from the collapse in oil prices. This was initially reflected in the savings figures: the fall in oil prices was accompanied by a rise in the personal savings rate, suggesting that consumers were depositing the savings in bank accounts. Consumer survey data further confirmed the speculation. But most were still guessing at the reason. Was it due to the cold winter, or was it a reaction to the financial crisis? The question that mattered most, however, was whether consumers would eventually get around to spending the extra money.

The New Answer

As of now, however, the conclusion reached by the JPMC research team is that about 80% of the amount saved was spent by consumers. They used transaction records from over 25 million credit cards, a database that by itself provides a comprehensive window into consumption patterns. That, along with some fairly clever analysis that allowed them to distinguish the increase in spending driven by lower oil prices from normal spending, enabled them to complete the study. It would not have been possible without such detailed records and sheer computing power.

The Consequences

Though this is apparently good news, in that the economy is being stimulated by the decrease in gasoline prices, it makes other data harder to interpret. If consumers were piling up their savings, one might presume they would eventually get around to spending them as well; according to JPMC, though, they have largely spent them already. This also raises questions about how reliable consumer survey data really is, and the increase in the savings rate remains a puzzle for researchers.

Big Data

The research may also be seen as an example of what Big Data can make possible, especially when it comes to energy research. It bodes well for gaining far more insight into business and consumer decisions in general, and much, much more. So it pays to stay up to date with developments in this brave new world of Big Data.

Munich Re Bets its Big Data on SAS

Munich Re, one of the leading reinsurers in the world, has opted to deploy SAS in order to achieve the goals of its Big Data strategy. Business units and specialist departments across verticals are set to use the SAS platform to carry out critical functions such as forecasts, analyses, pattern recognition and simulations.

Quotes

The SAS software suite automates the whole process of acquiring and analyzing content derived from complex contracts and claim notifications. With access to a large pool of data, the company is better placed to innovate through Big Data analytics, which will let it put forward new and customized offers or proposals. Rolled out for access throughout the world, the SAS analytics platform draws on a considerable number of internal and external data sources. Its flagship in-memory technology makes it possible to analyze huge quantities of data interactively, so as to find new correlations that would be impossible to recognize without highly advanced analytics tools. The in-database processing model allows data models to be developed and managed directly in the database itself; in simple terms, this means the analyses run in the SAP HANA platform or its open-source counterpart, the Hadoop framework. These tools enable analysis of unstructured text data in massive quantities.

The factors that turned the decision for Munich Re in favor of SAS were the speed at which the analyses were carried out, the technology's upward trajectory, the overall performance of the SAS team, and the ability of the system to deliver and deploy results swiftly.

Munich Re's Torsten Jeworrek attributed the success of the company's data analysis to SAS and added that it contributes significantly to the value delivered to their customers. He also forecast that, with the adoption of these new technologies, Munich Re's ability to combine customer data and compare it with the company's expert knowledge and findings would be strengthened.

How Hadoop makes Optimum Use of Distributed Storage and Parallel Computing

Hadoop is a Java-based open source framework from the Apache Software Foundation. It works on the principle of distributed storage and parallel computing for large datasets on commodity hardware.

Let’s look at a few core concepts of Hadoop in detail:

Distributed Storage – In Hadoop we deal with files that are TBs or even PBs in size. We divide each file into parts and store them on multiple machines. Hadoop replicates each file 3 times by default (you can change the replication factor as per your requirement); keeping 3 copies of each file minimizes the risk of data loss in the Hadoop ecosystem, just as in real life you keep a spare car key at home to avoid problems in case your keys are lost.
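To make this concrete, here is a minimal, hypothetical Java sketch using the standard Hadoop FileSystem API (the class name and the /user/demo paths are invented for illustration). It copies a local file into HDFS and then adjusts its replication factor:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Default replication is 3; a client can override it before writing new files
        conf.set("dfs.replication", "2");

        FileSystem fs = FileSystem.get(conf);

        // Copy a local file into HDFS (paths are hypothetical)
        Path local = new Path("data.txt");
        Path inHdfs = new Path("/user/demo/data.txt");
        fs.copyFromLocalFile(local, inHdfs);

        // Change the replication factor of an existing HDFS file back to 3
        fs.setReplication(inHdfs, (short) 3);

        fs.close();
    }
}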


Parallel Processing – We have progressed a lot in terms of storage space and processor speed, but the seek time of hard disks has not improved significantly. Hadoop overcomes this issue as follows: reading a 1 TB file from a single disk would take a long time, but by storing the file across 10 machines in a cluster we can reduce the seek time by up to 10 times. HDFS also uses a large default block size of 64 MB to store large files in an optimized manner.
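For illustration only, the block size and replication factor can also be chosen per file when it is created through the same Hadoop FileSystem API; the 64 MB figure below simply mirrors the block size discussed here, and the path is made up:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockSizeSketch {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());

        long blockSize = 64L * 1024 * 1024;   // 64 MB blocks, as discussed above
        short replication = 3;                // keep the default 3 copies
        int bufferSize = 4096;                // client-side write buffer

        // Create the file with an explicit block size and replication factor
        FSDataOutputStream out = fs.create(new Path("/user/demo/large-file.txt"),
                true, bufferSize, replication, blockSize);
        out.writeBytes("sample content\n");
        out.close();
        fs.close();
    }
}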

Let me explain with some calculations:
Metric                    Traditional System (Windows)        Hadoop System (HDFS)
File size                 1 TB (1,000,000,000 KB)             1 TB (1,000,000,000 KB)
Block size                8 KB                                64 MB (64,000 KB)
No. of blocks             1,000,000,000 / 8 = 125,000,000     1,000,000,000 / 64,000 = 15,625
Avg seek time per block   4 ms                                4 ms
Total seek time           125,000,000 × 4 = 500,000,000 ms    15,625 × 4 = 62,500 ms
As you can see, thanks to the HDFS block size of 64 MB we save 499,937,500 ms (i.e. 99.98% of the seek time) when reading a 1 TB file, compared to the Windows system.

We could further reduce the seek time by dividing the file into n parts and storing them on n machines; the seek time for the 1 TB file would then be 62,500/n ms.
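The arithmetic above is easy to reproduce. The short Java sketch below just restates the table's assumptions (decimal units, a flat 4 ms seek per block, and n = 10 machines) and prints the resulting seek times:

public class SeekTimeMath {
    public static void main(String[] args) {
        long fileKb = 1_000_000_000L;   // 1 TB expressed in KB, as in the table
        long windowsBlockKb = 8;        // 8 KB blocks
        long hdfsBlockKb = 64_000;      // 64 MB blocks
        long seekMsPerBlock = 4;        // assumed average seek time
        int machines = 10;              // n machines holding parts of the file

        long windowsSeekMs = (fileKb / windowsBlockKb) * seekMsPerBlock; // 500,000,000 ms
        long hdfsSeekMs = (fileKb / hdfsBlockKb) * seekMsPerBlock;       // 62,500 ms
        long parallelSeekMs = hdfsSeekMs / machines;                     // 6,250 ms

        System.out.printf("Windows: %,d ms%n", windowsSeekMs);
        System.out.printf("HDFS (single machine): %,d ms%n", hdfsSeekMs);
        System.out.printf("HDFS across %d machines: %,d ms%n", machines, parallelSeekMs);
    }
}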

Here you can see one use of parallel processing, namely the parallel reading of a file across multiple machines in a cluster.
Parallel processing is also the concept on which the MapReduce paradigm works in Hadoop: it distributes a job into multiple tasks that are processed as a MapReduce job. More details will follow in an upcoming blog post on MapReduce.
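As a small preview of that post, here is a minimal, hypothetical mapper sketch in Java (a word-count style example with invented class names). Each map task runs this code independently on one block of the input file, which is what makes the processing parallel:

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// One map task is launched per input split (roughly, per HDFS block),
// so many copies of this mapper run in parallel across the cluster.
public class TokenCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        for (String token : line.toString().split("\\s+")) {
            if (!token.isEmpty()) {
                word.set(token);
                context.write(word, ONE);   // emit (word, 1) for the reducers to sum
            }
        }
    }
}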

Commodity Hardware – This is the ordinary hardware you use in your laptops and desktops, in place of highly reliable, high-availability machines such as IBM servers. The use of commodity hardware has helped businesses save a lot of infrastructure cost; commodity hardware is approximately 60% cheaper than a high-availability, reliable machine.

Call us to know more