Big Data News – 10 Nov 2016

Featured Article
At this writing, it's Wednesday morning after the U.S. election. None of my friends is sober, probably including my editor. I had a different article scheduled originally, which assumed that I'd been wrong all along, because that's what everyone said. The first article in which I mentioned President Trump posted on Sept. 10, 2015, and covered data analytics in the marijuana industry. Shockingly, both Trump and marijuana won big.

Top Stories
Cloudera Enterprise 5.9 includes the latest release of Hue (3.11), the web UI that makes Apache Hadoop easier to use. As part of Cloudera's continuing investments in user experience and productivity, Cloudera Enterprise 5.9 includes a new release of Hue. Hue continues its focus on SQL and also now makes your interaction with the Cloud easier (Amazon S3 specifically in this first version). We'll provide a summary of the main improvements in the following part of this blog post. The post New in Cloudera Enterprise 5.9: S3 integration and SQL Editor Improvements appeared first on Cloudera Engineering Blog.

We polled InformationWeek readers to find out about your IT spending priorities for 2017. Here's what you told us.

SAP is trying to make it simpler for IT to deploy an analytics environment by providing the underlying PaaS layer with SAP BusinessObjects Analytics.

Modernizing and streamlining the data center is imperative for IT departments in order to deliver the agility and flexibility required by their businesses.

Cloud storage is an indispensable tool in today's hyper-connected world. But unlike the early days of cloud storage, when vendors regularly rolled out new capabilities and routinely bumped up storage limits, the market has matured in terms of capabilities and storage norms. Here are some key factors to consider before choosing a new cloud storage service.
Getting started with the cloud
One of the first things you should do when choosing between cloud services is compare storage options, features and costs. Free offerings might work if you need only the basics, but some of the most important or advanced capabilities are available only via paid plans. Some cloud services offer very limited storage space for free, and some offer none at all.

Enterprises considering adopting public clouds are concerned about where their data is located and how it's protected, according to a new survey by IDG. Companies will have about 60 percent of their IT environment in public, private, or hybrid clouds, according to a survey of about 1,000 IT decision makers. Of those considering public cloud deployments, the top concerns were where data is stored, at 43 percent of respondents, and security, with 41 percent of respondents. And with all the high-profile hacks of well-known online brand names, it's no surprise.

IBM's Watson artificial intelligence platform has joined forces with researchers at MIT and Harvard to study how thousands of cancers mutate to become resistant to drug treatments that initially worked to beat back the disease. By discovering how cancers adapt to overcome drug therapies, researchers at MIT's and Harvard's Broad Institute genomics research center hope to develop a new generation of therapies that cancers cannot circumvent. While a growing number of treatments can hold cancers in check for months or years, most cancers eventually recur, according to the Broad Institute researchers. This is in part because tumors acquire mutations that make them drug resistant.

Identifying and ranking community structures and relation patterns in social networks provides great knowledge about the network. Such knowledge can be utilized for target marketing or grouping similar, yet dist…

Compared with other device markets, the tablet market will be relatively quiet in 2017, reports Taiwan-based researcher TrendForce.

What were VIP Dez Blanchfield's top three takeaways from World of Watson this year? Find out!

The 802.11ad standard has been given the catchy moniker of WiGig, and enables data speeds as fast as 8 Gigabits per second (Gbps) over short range.

Most election prediction shops and public polls in recent days foresaw Republican Donald Trump losing the U.S. presidential race to Democrat Hillary Clinton. They got it wrong, bigly. And the failed predictions could cast doubts on some hot technology sectors, including big data and customer relationship management. Not so fast, say some data experts. The problem with the polls and with forecasters like FiveThirtyEight may have more to do with data collection than data crunching, they say. Data analysis worked well in the Moneyball model for the Oakland Athletics, but baseball stats are different than election polling, said CRM analyst Denis Pombriant, founder of Beagle Research Group. Statisticians have been collecting "highly reliable" baseball data for more than a century, while polling data is more squishy, he said.

At the Spark & Machine Learning Meetup in Brussels on October 27, 2016, Pierre Borckmans of Real Impact Analytics delivered a lightning talk called "Writing Spark applications, the easy way: How to focus on your data pipelines and forget about the rest."

The broad outlines of the transition from static, integrated data systems to distributed virtual environments seem clear.

Nick Pentreath of the Spark Technology Center teamed up with Jean-François Puget of IBM Analytics to deliver the main talk of the Spark & Machine Learning Meetup in Brussels, "Creating an end-to-end Recommender System with Apache Spark and Elasticsearch."

At the recent Spark & Machine Learning Meetup in Brussels, Sven Hafeneger of IBM delivered a lightning talk called "Hyperparameter Optimization: When scikit-learn meets PySpark."

At the recent Spark & Machine Learning Meetup in Brussels, Holden Karau of the Spark Technology Center delivered a lightning talk called "A very brief introduction to extending Spark ML for custom models."

As your IT organization and enterprise prepare technology budgets for 2017, what data and analytics topics are at the top of your priority list? Answer our Flash Poll and let us know — and find out the results so far.

Big data and design thinking share some common core principles for creating highly connected, meaningful business and customer user experiences. See why organizations worldwide are realizing the magic of combining big data with design thinking to generate value for powerful business use cases.

At the recent Spark & Machine Learning Meetup in Brussels, François Garillot of Skymind delivered a lightning talk called "DeepLearning4J and Spark: Successes and Challenges."

Microsoft is betting on artificial intelligence (AI) with the creation at the end of September of a new AI and Research Group. This newly formed group brings together Microsoft's research organization and more than 5,000 computer scientists and engineers focused on AI and is now the fourth major division in the company, on par with the Windows, Office and Cloud divisions. The job of the AI and Research Group will be to work on four overarching initiatives: harnessing AI through agents such as Cortana, the company's digital personal assistant; infusing AI into Skype, Office 365 and every other Microsoft application; making cognitive capabilities such as vision and speech and machine analytics available to external developers; and using Azure to build a powerful AI supercomputer in the cloud to provide "AI as a Service."

At the recent Spark & Machine Learning Meetup in Brussels, Koen Dejonghe of Eurocontrol delivered a lightning talk called "Simulation and processing of data streams in telecommunications."

While many smart cities are using IoT for transportation issues, there's a host of other initiatives these urban centers should start to address with the technology. Environmental and sustainability programs top a new list from Gartner.

The Upshot, FiveThirtyEight, Predictwise, etc.: their predictions for President varied over the campaign as you'd expect as new data came in, but consistently made Clinton a solid favorite, with a probability of a win topping 70% the day before election day. So what went wrong? As in any statistical forecast, there are three possibilities: The models were wrong. No model is perfect, but it seemed to me at least that the various forecasts, despite their differing methodologies, all captured the essential mechanisms of being elected President: the electoral college; the similar behaviours of some states; the influence of economic and demographic statistics; the relationship between polls and votes.

The framework for managing network virtualization software is now baked into the Nutanix platform.

Developers are finding a new way to create apps and programs – here's a quick look at how microservices work, and what it means for IT shops and their development operations.

The recent scandal at Wells Fargo could have been prevented if the right CRM system had been in place.

Modular and open source are now the watchwords for network infrastructure, whether you're delivering internet connections or VR cat videos. On Tuesday at the Structure 2016 conference in San Francisco, Facebook announced its most powerful modular data-center switch yet, and AT&T gave an update on its huge migration from dedicated servers to a software-based architecture. Once the same kind of hardware can do different things in a network, everyone gets more freedom to accomplish what needs to get done. That's true for Facebook, which built on its own switch innovations and software stack in the new Backpack switch, and for AT&T, which says enterprises can now order and turn on services in 90 seconds instead of 90 days. Agility is also the key selling point for cloud companies like Google, which hopes its customers can ignore hardware altogether in a few years.

Like Google, Microsoft has been differentiating its products by adding machine learning features. In the case of Cortana, those features are speech recognition and language parsing. In the case of Bing, speech recognition and language parsing are joined by image recognition. Google's underlying machine learning technology is TensorFlow. Microsoft's is the Cognitive Toolkit. (Insider Story)

The United Nations and Qlik have announced a partnership to use data analytics to improve the efficiency and efficacy of humanitarian work across the world.

It is likely that such a move will see consumers gravitate towards digital payments at a faster pace.

Earlier this year, Microsoft made a splash at its Ignite conference for IT professionals when it announced that it has been racking cards of programmable chips together with servers in its cloud data centers. The chips, called field-programmable gate arrays (FPGAs), can be reconfigured after being deployed to optimize them for particular applications such as networking and machine learning. Now, Microsoft is investing in tools that would allow customers to program the FPGAs, said Scott Guthrie, the executive vice president in charge of Microsoft's cloud and enterprise division, during a talk at the Structure conference in San Francisco.

IBM's Mac Devine spoke at the 2016 Structure Conference on how IoT, big data, and cognitive computing are changing the way that enterprises are approaching their infrastructure.

Tableau is moving into the data-wrangling business, announcing plans for visual data-preparation software code-named Project Maestro. The idea is to bring the same sort of "self-service" visualization to the prepping and cleaning of data as they've built for data analysis, Dan Jewett, Tableau vice president of product management, told Tableau's user conference this morning. "Maestro is going to make data preparation a breeze." The software is expected to be available "later next year." In a brief demo, Jewett showed visual ways of inspecting, joining and editing data. Results could then be piped into Tableau for analysis.

Google infrastructure czar Urs Hölzle is focused on a cloud future where customers don't think about the infrastructure underlying all of the workloads they're running. In his view, one of the key advantages of the cloud is that customers can get the benefits of new hardware without having to completely rework their software. "So that means you can have a million customers who move to that new hardware platform, not knowing they did," he said Tuesday at the Structure Conference in San Francisco. "Which means that you can really insert this new technology in a much faster cycle than you could if you did the same thing on-premises." That means companies can get quick, seamless improvements to performance, as opposed to an on-premises deployment. When operating their own data centers, companies must take the time to evaluate new hardware, and take the time to roll it out.

Artificial intelligence is as potent in the battle against malware as in other endeavors where data must be assessed, trends deduced, and plans of action put in play.

The Senseable City Lab and the transportation company Uber are launching a research initiative to explore how car- and ride-sharing networks could reshape the future of urban mobility. This initiative will explore new mobility paradigms for the 21st century, building on both parties' data and analytics strengths. "We all know how the 'sharing economy' has revolutionized many aspects of our lives. How will it challenge traditional notions of mobility and individual freedom after the advent of self-driving?" asks Carlo Ratti, professor of the practice of urban technologies and director of the Senseable City Lab. "In the United States, cars are idle 95 percent of the time, so they are an ideal candidate for the sharing economy."
