Big Data News – 3 Mar 2016

Today's Infographic Link: Glass Half Empty: The Coming Water Wars

Top Stories
AtScale, which offers software to enable traditional business intelligence tools to query the big data stored in Hadoop, has released a new SQL-to-Hadoop benchmark that shows which engines work best for which tasks.

Microsoft has updated the Data Science Virtual Machine, a data science toolkit-in-a-box that you can easily spin up on the Microsoft Azure cloud service. The virtual machine now comes pre-configured with Microsoft R Server Developer Edition (upgraded from Microsoft R Open), Anaconda Python, Jupyter notebooks for Python and R, Visual Studio Community Edition, Power BI desktop, and SQL Server Express edition.  For R users, this is a great way to harness some powerful hardware for heavy-duty computations with R. Microsoft R Server includes RevoScaleR, the R package that allows you to build statistical models on data larger than available RAM.

I recently connected with the CEO of a fascinating company based in Cork, Ireland, called Treemetrics. Its mission is to eliminate waste in the logging industry by moving beyond antiquated forest survey methods and other problems caused by a lack of good information. By applying data analytics technology to this sector, the company claims to be able to reduce wastage by up to 20%. To make this work, Treemetrics has to extract data from exotic sensing systems such as 3D laser mapping to obtain stem volume and taper information, which tells a lot about the size, type and health of tree groupings. It's pretty cool and leaps beyond where the forestry industry has been for literally hundreds of years in terms of harvesting efficiency.

New services make use of Kaspersky Lab security software to monitor threat levels and even predict when and where attacks will occur.

Learn more about how digital assistants including Amazon Alexa, Facebook M, Google Now, and Apple's Siri are rewriting the rules around data privacy and sharing.

Software stability is a key requirement for large companies, and Hortonworks is taking a big step in that direction with a new release cadence for its enterprise Hadoop software. Starting with the Hortonworks Data Platform 2.4, which is now available, the company is taking a two-pronged approach to updates. Specifically, it will update core Apache Hadoop components such as HDFS, MapReduce and YARN along with Apache Zookeeper annually, while extended services that run on top of those components — including Spark, Hive, HBase and Ambari — will be updated continuously throughout the year.

Facebook VP held in Brazil by police, who say WhatsApp failed to follow a court order. The judge's demand told Facebook to release information about drug-probe suspects. Facebook, owners of WhatsApp, says it doesn't have the information that the police want. And it thunders that the arrest of Diego Dzodan is "disproportionate." In IT Blogwatch, bloggers break out the popcorn. Not to mention: Dodge & Fuski… Your humble blogwatcher curated these bloggy bits for your entertainment. [Developing story: Updated 2:11 am PT with more comment]

News: Software will offer integration with Google, PayPal and UPS.

[Cross-posted from blog.davemdavis.net] Yesterday Microsoft announced the availability of the HoloLens Developer Kits.  The kits are available to developers who registered and were approved to get them. They will be shipped in waves with wave 1 shipping on March 30th.  These kits are going for $3000 and are only available in the US and Canada.

Facebook VP arrested in Brazil, for allegedly failing to comply with a court order. The order commands WhatsApp to hand over messages relating to drug investigation suspects. WhatsApp, owned by Facebook, says it doesn't keep copies of messages that its users send and receive. And Facebook says the arrest of Diego Dzodan is "disproportionate." In IT Blogwatch, bloggers break out the popcorn. Not to mention: Dodge & Fuski… Your humble blogwatcher curated these bloggy bits for your entertainment.

A new AAA survey shows just how little people trust autonomous vehicles, and a Google self-driving car just had an accident.

Security vendors have released a multitude of new and innovative products to help organizations better defend high-value assets from cyber threats.

The bi-modal term was coined by Gartner about five years ago to describe the way in which the enterprise should guide IT spend over the coming decade.

What can you do when simply being "digital" isn't enough? Find out how you can make use of existing resources to create a place for yourself at the head of the pack, differentiating yourself digitally to customers in your market.

Big Data has plenty of hype, but when it gets to the details, sometimes data scientists are ignored.

Keeping up with millennials' shifting social media use is a daunting task for consumer product companies. Instead, modern marketers may need to emphasize values that are important to this cohort across digital channels.

SAN DIEGO — Cisco this week is throwing its hat into the hyperconvergence and software-defined storage ring with a system co-developed with software company SpringPath. Cisco is also rolling out at its Cisco Partner Summit here a new generation of Nexus 9000 data center switches featuring 25G/50G Ethernet based on custom ASICs. The new products dovetail with Cisco's acquisition today of CliQr, a maker of "application-defined" hybrid cloud orchestration software for deploying and managing applications across bare metal, virtualized and container environments.

When looking to improve customer satisfaction, most telecommunications providers immediately think of streamlining the call center, reducing outages and increasing technician success. Billing is typically an afterthought and not linked to customer satisfaction, but organizations that don't improve the billing process are missing a significant opportunity for happier customers.

For the transportation industry, predicting operational failures before they occur cuts down on the high costs that come with lost productivity and limited performance. Consider the ways technology is making businesses better able to take a proactive role before breakdowns disrupt critical infrastructure.

HPE Security is looking to improve the security of mobile devices and the enterprise overall with two new security offerings announced in conjunction with the RSA Conference. HPE also released its Cyber Risk Report 2016.

I recently had a chat with Benjamin Bengfort, a data scientist finishing his PhD at the University of Maryland, and Jenny Kim, a software engineer at Cloudera, about their forthcoming O'Reilly Media book (now in Early Access), Data Analytics with Hadoop: An Introduction for Data Scientists. Why did you decide to write this book? Ben: The content was originally part of a class that Jenny and I were teaching together. The post Meet the Authors: "Data Analytics with Hadoop" from O'Reilly Media appeared first on Cloudera Engineering Blog.

Microsoft and Raspberry Pi want people to build businesses and start Kickstarter campaigns around cool devices making use of the new Raspberry Pi 3 computer. The companies are teaming up to provide an entire package needed to build Internet of Things and smart devices, including hardware, OS and cloud services. The goal is to help Raspberry Pi 3 users take their envisioned devices from concepts to the end market. Microsoft is previewing a new edition of its Windows 10 IoT Core operating system for the Raspberry Pi 3. With the OS update, Microsoft is making it easier to customize the OS to a specific device made using Pi 3.

By Verena Haunschmid. Since I have a cat tracker, I wanted to do some analysis of the behavior of my cats. I have shown how to do some of these things here. Data Collection: The data was collected using the Tractive GPS Pet Tracker over a period of about one year from January 2014 to November 2014 (with breaks). From March to November I additionally took notes in an Excel sheet on which cat was carrying the tracker.

News: Does the industry need to assess the risk from tech players beating them at their own game?

Does data extortion or ransom that targets the healthcare industry require different approaches to data breaches and protection than other industries? Despite opinions to the contrary, no cybersecurity crises are specific to the healthcare industry. Its organizations can use the same data protection practices used in other industries, such as the Cybersecurity Framework from the National Institute of Standards and Technology and Continuous Diagnostics and Mitigation from the US Department of Homeland Security.

The human mind can't keep up with algorithms that tap vast data sets. The latest example is Google's convolutional neural network called PlaNet that can identify where photos were taken based on the pixels in the image.

Backing up and archiving data on the Oracle public cloud would cost roughly one-tenth of a penny per gigabyte per month using Oracle software.

Apprenda has existed for many years. The vendor offers a platform as a service (PaaS) geared toward large enterprises' software development needs. When it was first created, Apprenda was a pure-play .NET PaaS and told a story of the huge number of enterprises that were built upon .NET and needed a way to make their software development process faster and more efficient. For whatever reason, that story didn't quite jell, and a few years ago Apprenda had a major shift from the ".NET pure play" story and became a "double pure-play vendor." Its platform was suddenly both a .NET specialist and a Java specialist resource.

Cisco this week announced its intention to acquire CliQr Technologies, a privately-held developer of "application-defined" cloud orchestration software for $260 million. CliQr's software allows cloud operators to deploy and manage applications across bare metal, virtualized and container environments. Cisco says the acquisition will help its customers configure and operate private, public and hybrid cloud deployments. CliQr is already integrated with a number of Cisco's data center switching and cloud offerings, including Application Centric Infrastructure and Unified Computing System. Moving forward, Cisco says it will continue to integrate CliQr across its data center portfolio.

Increased usage of encryption as a default mechanism for securing data in a mobile world is more than likely about to become a new norm for all.

What's the secret of those organizations that have gotten sales force transformation right?

Ever since Edward Snowden revealed the extent of surveillance that occurs over the Internet, there has been much attention given to encryption. Indeed, the current storm over the FBI's desire to gain access to private individuals' mobile devices would be rendered irrelevant if widespread encryption was the norm. Encryption already tends to be more common in the enterprise, but with encryption comes some questions as to what should happen with the keys that allow those files to be decrypted. Egnyte wants to answer all permutations of that question today with a deep encryption offering.

Last decade, Apple took the smartphone market from BlackBerry. This month it looks like the FBI may be inadvertently helping BlackBerry take it back.

If you're looking to implement a big data project, you're probably deciding whether to go with Apache Spark SQL or Apache Drill. This article can help you decide which query tool you should use for the kinds of projects you're working on.

NTT Communications is adopting OpenStack in its public cloud, and introducing a bare-metal option to its hosted private cloud offering. It is also expanding connectivity and management tools for hybrid cloud environments. Japan will see the new Enterprise Cloud features first, followed later this year by the U.S., Germany, the U.K., Australia, Singapore and Hong Kong, the company said Tuesday. NTT Communications is the international networking and IT services subsidiary of Japanese telecommunications giant NTT, and Enterprise Cloud is the brand it uses for a variety of hosted services.

Microsoft has lifted the lid on a new security service designed to identify and respond to "advanced" attacks on companies' networks. With Windows Defender Advanced Threat Protection, Microsoft is looking to further bolster the security credentials of Windows 10 — its latest operating system and one that the company has long touted for its protection against online attacks.

News: Government looks to improve citizens' lives through data sharing.

News: IoT & big data analytics expected to add £322bn to the UK economy by 2020.

Update/Correction: This post was updated at 5:01 a.m. Pacific 3/1/2016 to reflect current search data. The headline of an earlier version of this post was corrected to remove suggestion that Google and Bing showed similar predictions for Clinton. Today is Super Tuesday, when 12 states hold their primary elections and caucuses in the U.S. presidential campaign. It is arguably the most important day for candidates in the battle for the Democratic and Republican nominations, a day when the most states and the most delegates are up for grabs. It's also a day when the wisdom of the crowds, a hallmark of American democracy, is put to the test. And while entering a search query is quite different from casting a vote, a look at search trends on Bing Search Wave and Google Trends offers a powerful indicator of people's support for the candidates. Call it the curiosity of the crowds. As of this writing (on the morning of Super Tuesday), both Bing and Google agree: Based on search volume, Trump will win all 11 Republican contests. However, they disagree on the Democratic contests, with Bing showing Sanders ahead 6 states to Clinton's 5 states, and Google showing Sanders ahead in 9 of the 11 contested states. (The discrepancy between 11 contests across 12 states is due to Alaska holding its Democratic contest on March 26 and Alaska holding its Republican contest on March 29.) How the Republican candidates rank on Bing Search Wave: How the Democratic candidates rank on Bing Search Wave: How the Republican candidates rank on Google Trends: For the Democratic candidates, Google Trends is ranking them on a state-by-state basis. For example, Texas: For updated information on candidate search queries, visit Bing Search Wave and Google Trends throughout Super Tuesday.

Deep learning has usually been accessible to only the largest organizations, but that's starting to change. On Monday, an AI startup called Nervana launched a cloud offering for what it calls deep learning on demand. Nervana Cloud is a hosted platform designed to give organizations of all sizes the ability to quickly build and deploy deep-learning tools without having to invest in infrastructure equipment or a large team of experts. Based on neon, Nervana's open-source deep-learning framework, the full-stack offering is optimized to handle complex machine-learning problems at scale.

Caleb Barlow, vice president of IBM Security, says Resilient Systems extends IBM's security portfolio to cover protecting and detecting threats.




This vendor-written tech primer has been edited by Network World to eliminate product promotion, but readers should note it will likely favor the submitter's approach. Cloud storage revenue is forecast to grow more than 28% annually to reach $65 billion in 2020.  The driving force is the substantial economies of scale that enable cloud-based solutions to deliver more cost-effective primary and backup storage than on-premises systems can ever hope to achieve. Most IT departments quickly discover, however, that there are significant challenges involved in migrating and synchronizing many thousands or even millions of files from on-premise storage systems to what Gartner characterizes as Enterprise File Synchronization and Sharing (EFSS) services in the cloud. According to Gartner, "by 2019 75% of enterprises will have deployed multiple EFSS capabilities, and over 50% … will struggle with problems of data migration, up from 10% today."

Should we share farmers' big data security concerns about capture, ownership, and usage? To answer that question, learn why farmers are worried.

Disaster Recovery as a Service is one option organizations should consider, especially if full-time, on-site DR staff is not an option.

With millions sold worldwide, Raspberry Pi has fueled the maker culture and has become a symbol for creativity and innovation for the Internet of Things (IoT). The IBM Watson IoT Platform is the world's most powerful Cognitive IoT Platform, and it supports the newest Raspberry Pi out of the box.

One benefit of integrating R with PowerBI is access to a rich array of data visualizations not present in the standard PowerBI loadout. R is practically unlimited in the types of graphics it can create (although the amount of programming required can vary from a few lines using an existing R package, to large custom functions for truly bespoke graphics). Some of the visualizations you can create with R include population pyramids, small multiples, annotated time series, calendar heat maps, rank plots and even emoji charts.

Strategic information analysis is one of the most important activities that your company can perform. The fruits of this labor, especially when consolidated throughout the organization, inform everything from marketing and innovation, to risk management activities. To achieve this level of performance, you'll need more than a simple monitoring or keyword recognition tool; you need software that reads and understands like an analyst.

Don't sound the death knell for brick and mortar retail just yet. Brands get a second life by teasing consumers with unique in-store experiences that marry digital, mobile and virtual components.

Information governance is meant to help mitigate this risk, but the industry is not going to pause while a company puts its strategy in place.

Expect cars to become more attractive targets for hackers, just like every other IoT device.

The idea of workforces engaging via graphic simulation is unnerving, and fraught with security, privacy, reliability and common courtesy issues.

Three years ago Hortonworks led a chorus of open source Kumbaya as it sought to differentiate itself in the rapidly growing Hadoop market. Today, Hortonworks has significantly changed its tune, embracing proprietary software as a way to improve its financials. The only shocker here is that it took so long. Last month I declared that "there's no money in open source." While open source has become essential for every business, selling open-source software is an exercise in self-hurt and business model gymnastics. Hence, the most successful open source companies aren't open source companies at all: They're companies like Facebook that liberally embrace and contribute open-source code, but don't sell it.

Esri released Drone2Map for ArcGIS, an app that extracts data from still images taken by drones to create 2D and 3D visualizations for land analysis, infrastructure inspection, and event monitoring of natural disasters and environmental changes.

EY, formerly known as Ernst & Young, released a new sensor data survey focused on insurers. They found the insurance sector vulnerable to disruption since it still primarily relies on customer self-reporting and is lagging behind in using new data sources, such as from wearables, telematics and other sensors. While insurers who move first to using new data sources can, according to EY, become the industry disruptors, they'll have to be innovative about how they get the data because most "can't build a fully functional data ecosystem on their own."

President Obama's Precision Medicine Initiative is an effort to increase collaboration between precision medicine researchers, academia, government and industry leaders. Now Cloudera has joined that effort.

The job board uses AI analytics to match job seekers with jobs in data science, analytics, tech and "everything big data." The difference between this and traditional job boards is that there is no need for applicants to search for jobs using keywords. Instead, the applicant fills out a profile and the AI instantly identifies job opportunities based on skill set, experience and preference filters.

But the digitization of airports did not end with increased security and customer service systems and analysis. It also includes the digitalizing of operations and processes as well as advanced geographic information systems aimed at maximizing physical plant improvements, air traffic flow and sustainability.

In what is perhaps the most contentious U.S. presidential election, when all previous campaign models appear broken, data-driven journalism becomes even more crucial to correctly interpreting, assessing and analyzing current events for the voting public. To aid in the endeavor, Reuters and SAP have announced a partnership to supercharge Reuters' data journalism for U.S. presidential election coverage.

The only way to effectively sort through and prioritize potential threats is to enable as much collaboration as possible among security analysts.

Microsoft today announced that it's starting to send developers invitations to preorder the $3,000 HoloLens Development Edition version of its HoloLens augmented reality headset. The kits will ship to people in the U.S. and Canada starting on March 30. Microsoft had said pre-orders would begin last month. But still, the preorder milestone is coming just over a year after Microsoft first unveiled the HoloLens during an event at company headquarters. Microsoft has provided early models to people at NASA, Case Western Reserve University, and Cleveland Clinic.

White box switching company Pica8 this week enhanced its operating system software to overcome limitations in OpenFlow switching. Pica8 is adding Table Type Patterns (TTP) to PicOS so it can scale to 2 million flows with Cavium's XPliant switch ASIC, and to 256,000 flows with Broadcom's StrataXGS Tomahawk switch ASIC. This will enable larger data center build-outs, Pica8 says, because typical TCAM flow capacity in the top-of-rack installed base today is between 1,000 and 2,000 flows.

We sat down with VMware CEO Pat Gelsinger during the 2016 Mobile World Congress to learn more about the company's strategic partnership with IBM. Gelsinger also opened up about how the Dell-EMC deal has been affecting VMware's business, and shared an update on partner relationships.

While MapReduce has been the mainstay of Hadoop processing, Apache Spark is now taking the throne as the way to handle distributed computation. The reasons are obvious: Spark is very fast due to its use of Resilient Distributed Datasets, or RDDs, and it has a clean programming model.

Although every purchase that a business makes has consequences, the majority of such procurements are refundable and therefore reversible to a point if the negative outcomes outweigh the positives. Staplers, copy machines, and computer hardware equipment can all be taken back to the store with little fear of leftover penalties. Unfortunately, some modern technologies, especially the newer, less tested ones like the Internet of Things (IoT) and data analytics tools, come with far more risks than the typical coffee machine upgrade for the break room.

Nearly 20 years after its launch, Internet2 is quietly humming along on university campuses across the country, doing its R&D work and connecting researchers who might otherwise not be able to share information so readily.

News: Agreement between web giant and the UK's fourth biggest grocery is signed while disputed contract with tech platform supplier Ocado is amended
