Big Data News – 24 May 2016

Today's Infographic Link: Anatomy of Sun Storms & Solar Flares

Top Stories
Symbolic IO unveils what it describes as the first computational-defined storage system.

Market research firm IDC forecasts a 50% increase in revenues from the sale of big data and business analytics software, hardware, and services between 2015 and 2019. Services will account for the biggest chunk of revenue, with banking and manufacturing-led industries poised to spend the most.

Confluent, founded by the creators of Apache™ Kafka™, today announced the general availability of open source Confluent Platform 3.0. The new release introduces Kafka Streams, a powerful, lightweight solution to real-time processing designed to power a generation of highly responsive applications.

In this contributed article, Blair Linville, Chairman & Chief Executive Officer at Tectonic, wants to set the record straight, once and for all, by debunking the 5 big data analytics myths.

SAP NS2's president and CEO talks about the company's interesting backstory.

Selecting the right data integration product is critical to meeting the increasing demand in companies for data that can help drive more informed business decisions. The tool you choose to integrate and translate this data into information that can generate actionable business insights must fulfill your organization's requirements. Otherwise, it will become expensive, unused shelfware.

Big data has moved from market hype to a valid competitive strategy, yet many challenges often hinder a successful big data project.

Confluent Platform 3.0 ships, with a GA release of Kafka Streams. And Confluent Enterprise now includes a new commercial tool: Confluent Control Center.

News: Platform tracks where content has been used in different social media websites.

ID Ransomware is a free website launched by malware researchers where victims of ransomware attacks can get help dealing with their situation. The site makes victims' lives easier by letting them identify the variant of ransomware that has infected their computer. This makes it much simpler to figure out how to recover the infected files without paying a ransom.

News: Cray is combining supercomputing technology with connections to the Hortonworks Data Platform.

Since its inception in 2008, the global Hadoop market has grown at a tremendous pace. This market, valued at US$1.5 billion in 2012, is estimated to grow at a CAGR of 54.7% from 2012 to 2018. By the end of 2018, this market could be worth US$20.9 billion. With the massive amount of data generated every day across major industries, the global Hadoop market is anticipated to see significant growth in the future as well. Why Hadoop? Quite naturally, the mounting scale of unstructured data generated every single day by data-intensive industries such as telecommunication, banking and finance, social media, research, healthcare, and defence has led to the rising adoption of Hadoop solutions.
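
For readers who want to sanity-check the growth figures quoted above, here is a minimal sketch of the compound annual growth rate (CAGR) arithmetic. It assumes six annual compounding periods between 2012 and 2018, which the item does not state explicitly, and the function names are purely illustrative.

```python
def project_value(start_value: float, cagr: float, years: int) -> float:
    """Compound start_value at a constant annual growth rate for `years` years."""
    return start_value * (1.0 + cagr) ** years


def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the constant annual growth rate that links the two endpoints."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures quoted in the item above (US$ billions).
START_2012 = 1.5
QUOTED_2018 = 20.9
QUOTED_CAGR = 0.547
YEARS = 2018 - 2012  # assumption: six annual compounding periods

projected_2018 = project_value(START_2012, QUOTED_CAGR, YEARS)
implied_rate = implied_cagr(START_2012, QUOTED_2018, YEARS)

if __name__ == "__main__":
    print(f"Projected 2018 value: US${projected_2018:.2f} billion")
    print(f"CAGR implied by the quoted endpoints: {implied_rate:.1%}")
```

Under that assumption the compounded figure lands a little under US$21 billion, close to the quoted US$20.9 billion. The small gap is probably down to rounding or a slightly different base period in the original forecast.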

It's no secret that analytics is eating the enterprise world, but if there's anything in perpetually short supply, it's speed. Enter Cray, which on Tuesday unveiled a new supercomputing platform designed with that in mind. Dubbed Urika-GX, the new system is the first agile analytics platform to fuse supercomputing with an open, enterprise framework, Cray said. Due to be available in the third quarter, Urika-GX promises data scientists new levels of performance and the ability to find insight in massive data sets quickly. The system is tuned for highly iterative and interactive analytics, and integrated graph analytics offers rapid pattern matching.

Interview part two with Sumeet Singh – Senior Director, Cloud and Big Data Platforms @ Yahoo! Hopefully you enjoyed the first part of our interview with Sumeet. Here is part two, where we go into more detail about Yahoo's use of Hadoop, with lots of interesting topics coming up, including the splintering of the ecosystem, governance and much more.

"Your customer doesn't care how much you know until they know how much you care" – Damon Richards. Customer experience is more significant than ever; in fact, according to Aspect, more than 60% of marketers agree that the importance of consumer experience has risen. With the emergence of the Big Data phenomenon, we now have more information than ever to study the actions and habits of customers in order to enhance their experience. By the year 2020, we will have more than 1.7 MB of data created every second for every person on the planet.

Here's this week's news in Data Science and Big Data. Don't forget to subscribe if you find this useful! Interesting Data Science Articles and News. Building The Big Data Platform for Your Smart City: using emerging technologies like Big Data and the Internet of Things, governments aim to build smart cities that understand their citizens and make fast decisions. These Headphones Know Your Ears Better than You Do: Nura, an Australian startup, built headphones that use data from your ears to adapt to your unique hearing.

Company successfully gains business visibility with unified data and reporting using the ElegantJ BI tool, which enabled quick decision-making across functions like Sales, Finance, Manufacturing, Procurement and HR.

Despite the professed benefits, there remains a real uncertainty around how to leverage the technology and ecosystem to achieve business value from big data analytics. Many stakeholders understand the urgency around analytics for the strategic success and competitive edge of their organisation, but fail to understand how to extract significant value from data. We regularly hear how analytics investment is revealing customer needs and informing future product development and innovation.

While there has been much ado about interoperability, there are still no real solutions, same as last year and the year before that. The large EHR vendors who continue to dominate the market still maintain that interoperability is all but solved, yet they still can't connect EHRs across the continuum, causing frustration for providers and a disservice to patients. The ONC pays lip service to the problem, but that is about it. It is time for the healthcare industry to consider alternatives like middleware, which has been proven in other industries such as finance, retail and hospitality.

There has been a lot of activity recently around revenue attribution – marketers want to develop a better understanding of their customer acquisition funnel and be able to measure progress against it. Most of this attention has been focused on the B2C space. However, less work has been done measuring the performance of B2B marketing activities.

My personal view is NO. It needs DevOps for sure, but not DevOps Managers. There is a simple explanation and a more well-defined, elaborate explanation to this question, though. The simple explanation first. A manager's role needs to be associated with some outcome. A Project Manager is responsible for the outcome of a successful project, a Business Manager is responsible for the outcome of a growing and profitable business, a Service Manager is responsible for delivering improved service, etc. DevOps is not an outcome. It's a means to an end – that end being continuous delivery.

The analytics ecosystem today is evolving into a data fabric, where data is located in different places and processed in place with tools and technologies ideally suited to the specific type of analytics being performed, with all of this being transparent to the end users. But that is a recent development; for many years we had only one tool in the toolbox, the data warehouse. The data warehouse was often the only place in an organisation where data could come together, and where the production applications did not entirely consume the platform, enabling many users to access the warehouse and perform ad-hoc analytics, prototyping and analytic experiments.

Online reviews have already transformed the way people choose everything from restaurants to respiratory therapists, and now SaasGenius wants to do the same for enterprise software in the cloud. This week the company will launch a beta version of its service, and it invites participants to submit reviews of business software in 12 different categories. In the past, businesses looking for software relied primarily on word-of-mouth reviews, but SaasGenius aims to tap the model that's become so common on the consumer side. "We all now rely heavily on websites like Yelp and TripAdvisor in our personal lives — these sites feature trusted reviews to help us make quick, easy purchase decisions," said Tom Gorski, cofounder and CEO at the firm. "Even employees in large companies now expect a more customer-like online research and buying experience."

This is one of the first comprehensive machine learning, data science, statistical science, and computer science repositories, featuring many brand new scalable, big-data algorithms published in the last two years, such as automated cataloging, causation detection, or model-free tests of hypotheses, in addition to the classics. The original title for this project was Handbook of Data Science, but over time, it grew much bigger than a handbook.

In this special guest feature, Kylee Hall of Leadspace provides her top 5 tips for winning at demand generation with predictive applications.

The combination of Spark and Hadoop has supercharged big data analysis across many industries and use cases by lowering the barrier of entry to advanced analytics and thereby enabling data scientists to create data-driven products that weren't previously possible. But one area where Spark and Hadoop are having an especially strong impact is cancer research. Cancer killed about 590,000 Americans last year, according to the Centers for Disease Control. That makes it the second leading cause of death in the United States, behind only heart disease, which killed 615,000. Forty-five years after President Nixon signed the National Cancer Act of 1971, which effectively was a declaration of war on cancer, progress against the collection of diseases has more or less stalled.

Another U.S. university is adding a data science specialization to its curriculum, this one as part of an online Master of Science degree in engineering. The University of California at Riverside said the data science track was developed in collaboration with NASA's Jet Propulsion Laboratory (JPL) science staff. The partners said the data science program is aimed at engineers and scientists, along with medical and social media professionals, seeking to expand their training in data mining, data visualization, machine learning and statistical computing.

I spend a lot of time helping organizations to "think like a data scientist." My book "Big Data MBA: Driving Business Strategies with Data Science" has several chapters devoted to helping business leaders to embrace the power of data scientist thinking. My Big Data MBA class at the University of San Francisco School of Management focuses on teaching tomorrow's business executives the power of analytics and data science to optimize key business processes, uncover new monetization opportunities and create a more compelling, engaging customer and channel engagement.




Once a year, one of the smallest towns in Denmark becomes the fourth largest in the country during Europe's Roskilde Festival. Fast, scaled and optimized, data keeps all 130,000 guests safely fed, hydrated and informed, in a sustainable, efficient and well-organized way.

Proponents of zero rating may be right that the damage done is not onerous. Those against the practice may be right that it bestows unfair advantage.

Hardware-based encryption is a much stronger and more reliable option for protecting data than software-based encryption or, worse, no encryption at all.

For Cloudera, Apache HBase has grown into a stable, scalable, mature, and critical component of the Apache Hadoop stack. HBase adds the ability to do low-latency random read/write across your big data. While it is a key piece of the Apache Hadoop ecosystem, HBase itself has an ecosystem of projects and products that use it as a storage engine for systems such as time series databases (OpenTSDB), or SQL-style databases (Apache Phoenix, … The post Apache HBase is Everywhere appeared first on Cloudera Engineering Blog.

Last week saw cloud ERP vendor NetSuite hold its annual user conference in San Jose. I've been attending SuiteWorld as long as SuiteWorld has actually existed, and hence it is interesting to reflect on the differences over the years. (Disclosure: NetSuite covered my travel and expenses to attend the event.) Last year's SuiteWorld marked the first outing for the company's newly minted chief marketing officer, Fred Studer. Studer, an alumnus of Microsoft, had a very different perspective on the message that NetSuite should be articulating, and SuiteWorld 2015 was a flashy affair with an aspirational tone that arguably left its traditional customers feeling a little bewildered.

Unlike most other statistical software packages, R doesn't have a native data file format. You can certainly import and export data in any number of formats, but there's no native "R data file…

How does a Grand Slam tournament reinvent itself to retain and grow its fan base around the world? By undergoing a digital transformation. Learn how IBM serves up real-time analytics on all platforms for the French Open.

Data application developers, data scientists and analytics professionals are driving their organizations' efforts to bridge their data to the cloud.

At a time when the consumer packaged goods market is undergoing tremendous changes, traditional approaches to market segmentation are not enough.

Despite the obvious and potential benefits for public agencies to move their IT services to cloud-based infrastructure, many are understandably reluctant to be an early adopter, which some believe potentially puts data at risk. Despite that reluctance, see why many public agencies, including law enforcement, are considering several key benefits that can sway them away from this conservative view of transformation.

Over the last five years, electronic health records (EHRs) have been widely implemented in the United States, and health care systems now have access to vast amounts of data. While they are beginning to apply "big data" techniques to predict individual outcomes like post-operative complications and diabetes risk, big data remains largely a buzzword, not… The post Making Predictive Analytics a Routine Part of Patient Care appeared first on Predictive Analytics Times.

Logicworks has developed Cloud Patrol, an addition to its line of managed services for AWS, that makes sure IT security policies are consistently applied.

The world's top authorities on Apache Hadoop convene at Hadoop Summit San Jose, and one of the top questions to be answered concerns the future and direction of Hadoop. Sanjay Radia – Founder and Architect, Hortonworks – led the track, which selected 13 sessions around this topic. I asked Sanjay what he hoped would… The post The Future of Apache Hadoop appeared first on Hortonworks.

Avoid the productivity drain and customer frustration caused by content fragmentation. Keep the courtship period with customers alive by ensuring employees have unfettered access to enterprise content that can keep the customer experience harmonious. See how in a new IBM Enterprise Content Management ebook.

Sixty-four percent of IT and business pros say that self-service data analysis offers a significant competitive advantage over traditional methods.

By: Eric Siegel, Founder, Predictive Analytics World. In anticipation of his upcoming conference presentation, Data Driven Selling: Enabling a Direct Salesforce with Tools that Re-Enforce Predictive Selling Methods at Predictive Analytics World Chicago, June 20-23, 2016, we asked Lawrence Cowan, Partner at Cicero Group, a few questions about his work in predictive analytics. Q: In your work with predictive analytics, what behavior or outcome do your models predict? A: A good portion of the advanced analytics work we do at Cicero deals with consumer behavior across all stages of the customer lifecycle.

Hyperconverged platforms are gaining ground in the data center as an alternative to more complex storage area network (SAN) architectures.

The average lifespan atop any corporate leaderboard, whether it's the Fortune 500 or any other peer group listing, is getting shorter and shorter. Business cycles are growing increasingly brief and volatile. Yesterday's top companies are soon long forgotten, and today's leaders are already watching out for tomorrow's darlings in their rearview mirror. Success is unfortunately a poor teacher. We become so focused on perfecting all of the things that made us successful in the first place that we don't notice when those things become less relevant in a changing environment. How is it that some companies seem to defy the gravitational pull of these forces? How do some companies always find new ways to keep the growth engine going?

The Verizon DBIR has a lot to say about vulnerabilities. One of the more interesting topics is the large number of 2015 vulnerability exploits that were more than a year old. In a footnote the DBIR authors comment that "Those newly exploited CVEs, however, are mostly — and consistently — older than one year." The data show that more than 90% of exploited vulnerabilities in 2015 were more than one year old and nearly 20% were published more than 10 years ago. This data is consistent from year to year.

If you have an entrepreneurial streak, big data skills, and a startup dream, here's your chance to make it all happen. Go ahead, look at the sky and the stars and dream a big dream. Just be sure to also get your entry in by the deadline!

The Large Synoptic Survey Telescope (LSST), the largest digital camera in the world once completed in 2020, is slated to make the world's first motion picture of our universe. While some are fascinated and eager to see that "movie," it's the data that it will produce that will intrigue and occupy scientists for decades.

Dun & Bradstreet released a new report on how big data is actually being used within enterprises and the results show, well, that companies are essentially delusional. Seventy-three percent of those surveyed consider themselves "analytically driven," but fewer than half of them have more than 10 employees working with analytics.

Advera Health Analytics added coverage of multiple sclerosis drugs in its Evidex drug data and analytics platform. The new coverage of MS drugs renders on-demand, pooled analyses of MS drugs' clinical outcome measures, predictive identification of serious unknown risks, direct downstream medical cost calculations, and drug safety scorecards.

The abundance of data and the increased usage of data mean that every business now is a data-driven analytics business, and every user is an analytics user. For the first time, in the 2016 Magic Quadrant for BI and Analytics, Gartner points to a fundamental shift away from traditional legacy platforms in favor of more modern BI solutions that deliver greater ease of use, speed and agility.

New perils in health data reporting crop up regularly, but a recent HIPAA complaint against Myriad Genetics for withholding variant data from patients is an unexpected development. The company did not include in its data report to patients any data deemed to be benign or clinically insignificant – but, as it turns out, the patients wanted that data too. Heads up: this could signify a major change in companies' claims of data ownership.

Enterprises are forecast to have 58 percent of their compute and data storage in the cloud in 10 years, compared with 28 percent currently, according to a report covered on eweek.com. The adoption of the cloud has exceeded expectations, and along with it comes a host of wonderful benefits to any size of business. These benefits include:

Any application used for data analysis depends heavily on its ability to run queries fast. However, when working with larger or more complex datasets, as well as an increasing number of concurrent users, performance depends largely on the underlying analytical database, whether this is built into the application as part of a single-stack tool or implemented via a separate data warehouse layer.

Logi Analytics and 1010data announced the release of a connector which enables Logi Analytics' self-service analytics to run on 1010data's ready-to-use massive data sets.
