Big Data News – 25 Aug 2015

Top Stories
BlueData, a company that helps customers generate virtualized big data clusters, announced a $20 million Series C round and a partnership with Intel today.

The big data software market will grow nearly sixfold by 2019, according to a recent report from Ovum. The report notes that while big data software in 2015 is just a small part of the overall market for information management, it is set to increase at a compound annual growth rate (CAGR) of 50% through 2019, and play an increasingly important role that will position big data analytics as a core capability for many enterprises by 2019.

One of my favorite examples of why so many big data projects fail comes from a book that was written decades before “big data” was even conceived. In Douglas Adams’ The Hitchhiker’s Guide to the Galaxy, a race of creatures builds a supercomputer to calculate the meaning of “life, the universe, and everything.” After seven and a half million years of processing, the computer announces that the answer is “42.” When the beings protest, the computer calmly suggests that now they have the answer, they need to know what the actual question is, a task that requires a much bigger and more sophisticated computer.

Hortonworks today announced a definitive agreement to acquire Onyara, the company behind the data routing and streaming technology called Apache NiFi. The Hadoop heavyweight also announced NiFi will be the basis for its second major product line, Hortonworks DataFlow, which will underlie real-time streaming analytics and Internet of Things (IoT) applications. The National Security Agency created Apache NiFi eight years ago to address its real-time data collection needs.

Guest blog post by William Vorhies. Summary: Recent surveys suggest that Big Data and Hadoop may be stalled. Why? Just a week after a report from research firm Gartner Inc. found that investment in Hadoop-based Big Data technologies "remains tentative" with poor adoption rates, a survey from Software AG identified "a Big Data paralysis" that's keeping enterprises from realizing the promised benefits of the analytics craze. This finding might come as a surprise to those of us in the Big Web User World of Data Science.

This guest blog comes to us from Samantha R. at Udemy and is a cool infographic about AWS. The original can be viewed here. Originally posted on Data Science Central.

CodingStuff.org is an initiative to ignite kids' enthusiasm to learn how to code, to create apps, to design websites, and overall to become comfortable with technology. This article explores what teachers can do to ignite kids' enthusiasm for coding by using interesting and cool lessons to give them some pointers on how to code and then let the magic happen!

Analytics Innovation, Issue 1: The first issue of our Analytics Innovation Magazine, including how Big Data can help defeat ISIS, and Barack Obama's election strategy. Published: 2015-08-25 09:34:45 UTC. Tags: Analytics, Big Data, Data Science, Data Warehousing, Predictive Analytics.

Doctors are highly educated and (usually) very good at certain things: examining and talking to patients, and using the data they extract to diagnose and treat illness. What they are not necessarily so good at is what they do with that data afterwards. But increasingly, innovations in health care involving big data are producing solutions which are helping physicians to diagnose more accurately and treat more effectively.

By now it's very apparent that technology has had a major impact on shaping the general healthcare industry. With smart devices sweeping everyday life, patients and medical staff capitalize on the benefits of connected technology in a multitude of ways. In particular, mobile technology advancements in healthcare have put the integration of gamification at the forefront.

Companies without an integrated security strategy for Hadoop are at risk of exposing critical data or failing an audit, and the damage to reputation could be irreparable. Learn how an integrated security strategy can secure and simplify access to Hadoop environments, while leveraging your existing identity management infrastructure. The post Identity and Access Management for Hadoop: The Cornerstone for Big Data Security appeared first on Datanami.

"Hey, Andy, check out this data I have. What's the best chart to show it?" I get asked this question a lot. My answer is always the same: It depends. This is partly because it depends on the audience, the purpose and the type of data you have. What is most important is that your choice of chart is determined by the story within the data just as much as it is by anything else. Just because you have geographical data, it doesn't mean you should make a map. Just because you have a date, it doesn't mean a trend line is the best thing. And just because you want to see the relationship of a part to a whole, it doesn't mean you should use a pie chart. Actually, you should rarely use a pie chart, but that's another, well-documented story!

Zillow, the leading real estate and rental marketplace in the USA, uses R to estimate housing values. Zillow's signature product is the Zestimate, their estimated market value for individual homes,…

Guest blog post by Bernard Marr. We live in a data-driven world in which we are generating, storing and analyzing more information than ever before, and at an ever-increasing speed. Traditional "relational" databases which store information in neat hierarchies of rows and columns aren't suited for the big, messy datasets harvested from video, audio and even social media data streams that are needed for today's Big Data projects. This is where NoSQL databases become very handy.

So you've installed Hadoop and built a data lake to house all the bits and bytes that your organization previously discarded. So now what? If you follow the advice from industry experts, the next step on your analytics journey is to add Apache Spark to the mix. It's common for people to confuse Hadoop with analytics, says Rob Thomas, vice president of product development at IBM Analytics. "Hadoop itself doesn't do analytics," Thomas tells Datanami. "Hadoop is the data storage platform. Spark is the analytics platform."

Digium, maker of WebRTC platform Respoke, has introduced open-source SDKs for iOS and Android that aim to make it easier to add real-time audio and video communication support to mobile apps. Furthermore, the SDK includes support for instant messaging and uses push notification in order to work even when running offline or in the background.

Growing skepticism in reaction to the endless hype about the Internet of Things has spawned yet another IoT forecast that actually embraces the hoopla by concluding that the "hype may actually understate the full potential of the Internet of Things." Not usually given to hyperbole, the staid management consultant McKinsey & Co. attributes that assertion to several key findings centered on leveraging the massive amounts of data expected to be generated by the IoT: "Most of the IoT data collected today are not used at all, and data that are used are not fully exploited," concludes a McKinsey study on the IoT released earlier this summer. For example, the study by the McKinsey Global Institute found that "only 1 percent of data from an oil rig with 30,000 sensors is examined."

No less than traditional scientists, data scientists need a guiding strategy for solving problems. Such a methodology should directly address the problem at hand and should provide a framework for obtaining answers and results. Learn more about the Foundational Methodology for Data Science and how it leads from conception to deployment, feedback and refinement.

The traditional database, known as RDBMS (Relational Database Management System), is a marvel of engineering and computer science. In one piece of software, it stores your data safely, provides a unified SQL interface to your data, and even reasons about your data in certain contexts. Yet, it carries at a foundational level assumptions from the single-machine past that are preventing enterprises from growing the capabilities of their analytics. As Web companies like Google, Amazon, and others unbundle the database, the technologies at its core are experiencing a revival and are yielding even more value for their users than their original versions, especially for analysis.

When it comes to cutting-edge technology, the annual IFA consumer electronics trade show never ceases to amaze attendees from around the globe. The IFA 2015 event held in Berlin, Germany, is expected to expand broadly from last year's presentations and announcements of connected products. Check out several great reasons to attend this huge annual consumer electronics trade show that is open to the public.

There's an old axiom that goes something like this: If you offer someone your full support and financial backing to do something different and innovative, they'll end up doing what everyone else is doing.

The effectiveness and legitimacy of government today is best demonstrated by its ability to deliver public services and accurate information to citizens when they need it. This is truer than ever today, because we live in an age of unlimited information, when data anywhere can be captured, analyzed and transformed to personal insight in hours and minutes, with visual clarity. Today's new technologies open doors and opportunities for government to better serve the public good.

As I work with larger enterprise clients, a few Hadoop themes have emerged. A common one is that most companies seem to be trying to avoid the pain they experienced in the heyday of JavaEE, SOA, and .Net — as well as that terrible time when every department had to have its own portal. To this end, they're trying to centralize Hadoop, in the way that many companies attempt to do with RDBMS or storage. Although you wouldn't use Hadoop for the same stuff you'd use an RDBMS for, Hadoop has many advantages over the RDBMS in terms of manageability. The row-store RDBMS paradigm (that is, Oracle) has inherent scalability limits, so when you attempt to create one big instance or RAC cluster to serve all, you end up serving none. With Hadoop, you have more ability to pool compute resources and dish them out.

Public sector news the week of August 10, 2015 looked at how governments and organizations can use data and social media for public benefits. News highlights for the week of August 17, 2015, however, demonstrate the double-edged sword that is big data.

What does it take to really know your policyholders? Insurers who want to survive in the competitive marketplace need to turn to intelligent and agile insights provided by data analytics based on customer behavior, needs and preferences. With data-based insights, insurers can tackle key pain points.

Internet of Things data is transforming the world around us in a dramatic fashion, whether it affects smart buildings that optimize energy usage or farmers that use sensors to monitor animals. In the process, it's creating an avalanche of data. Managing that information and extracting meaningful insights from it is a huge task for businesses, especially the many retailers whose potential customers set foot in brick-and-mortar stores.

Data analytics are allowing disaster preparedness organizations to model emergency predictions worldwide.

Organizations that leverage advanced data analytics to innovate and be at the forefront of their industries require a database that can keep up with their needs and deliver a strong ROI. The solution that best fits this requirement is Teradata® Database 15.10, which takes performance and system efficiencies to new levels with: support for larger memory sizes while keeping the hottest, most frequently used data in memory for optimum use; enhanced abilities for Teradata…

Companies that have CDOs tend to perform better than those without, according to a report from Forrester.

Docker Inc. has announced the release of Docker 1.8, which brings with it some new and updated tools in addition to new engine features. Docker Toolbox provides a packaged system aiming to be 'the fastest way to get up and running with a Docker development environment'. The most significant change to Docker Engine is Docker Content Trust, which provides image signing and verification.

The Morality Of Machine Learning: Can data solve crime? Published: 2015-08-24 10:58:17 UTC. Tags: Analytics, Big Data, Data Science, Machine Learning, Predictive Analytics.

In this article, Vijay Alagarasan, a Principal Architect at Asurion, discusses how he and his teams have encountered microservices at various engagements and some lessons they have learned as a result. This has resulted in them building up a series of anti-patterns and some associated patterns, which Vijay believes are more widely applicable to all practitioners of microservices…

As the research shows, the importance of cloud data and analytics is continuing to grow. The importance of this topic makes me eager to discuss further the attitudes, requirements and future plans of organizations that use data and analytics in the cloud and to identify the best practices of those that are most proficient in it.

The founders of Sync are also the founders of Netfirms, a premier hosting service in Canada. Over time they grew concerned about the lack of full security in file sharing services like Dropbox, so after they sold Netfirms, they set out to build a file sharing system that is easy to use, multi-platform and fully secure: encrypted and unreadable by the hosting service. They have realized that with Sync.

Application building no longer means learning to code. Here are seven products and services that can help you develop apps without developing programming skills.

While real-time analytics is getting more affordable, it's still not right for everything. Here are 10 ways to get the most from real time, near real time, and batch use cases.

MapR is looking to use AWS to help get its Distribution 5.0 service into the hands of more customers through the cloud.

With OnHub, Google hopes to simplify home networking, to encourage interest in IoT devices.

Gartner's annual Hype Cycle is out, and IoT and autonomous cars are in this year. Big data, however, is losing some of its luster.

Tiny computers, real-time depth sensing, and breakthrough memory technology are among the innovations featured at this year's Intel Developer Forum. CEO Brian Krzanich detailed how connected devices will change the way we all do business.

By the end of the summer, Netflix will close its last data center and move its entire streaming service to the cloud with help from AWS. It's a lesson for companies large and small.

Whether raising a round of funding or creating shareholder wealth, companies increasingly need a well-articulated and demonstrable data and analytics strategy. Here are some things that can sway an investor's opinion, for good or bad.

Google brings its own Cloud Dataflow and Cloud Pub/Sub big data services to Compute Engine and App Engine.

IBM Watson is opening its big data analytics ecosystem to new ideas and businesses, including a new way to help you win your fantasy football league.

A new analytics suite from Broadbean called BDAS looks to use big data to help HR and recruiters hire the best talent for companies.

The next time you hear the term "visionary" applied to business leadership, think about what NASA has been doing for years. Images from NASA's satellites and spacecraft show technology at its most inspiring.

Windows 7 organizations are starting to look at Windows 10. What are the major points of comparison? Read on to see what the changes could mean for your company.

IBM is planning to acquire Merge Healthcare, a company that specializes in medical imaging software. The goal is to bring enhanced imaging capabilities to the Watson Health portfolio, essentially giving the supercomputer the ability to see.

With its focus on industrial-scale machine data, GE's Predix Cloud won't look much like the data centers of major cloud suppliers such as Amazon, Microsoft, and Google.

IBM and the US Energy Department are working toward creating the next generation of high-performance computing systems, which they call "data-centric supercomputing." This has business and national security implications.

Hortonworks, the Hadoop system pioneer, continues to invest in the management of unstructured big data. However, the company reported a second-quarter financial loss.

With a suite of services to help companies deploy and manage Apple computers, IBM is adapting to a shift in enterprise IT.

GE, the champion of industrial-age analytics, will become a cloud service provider early in 2016 for large-scale machine data. The company envisions a machine-data-based Internet that tracks and monitors civilization's engines and systems of all types.
