Big Data News – 25 Apr 2016

Top Stories
GE is investing heavily in analytics to better manage the lifecycle of industrial equipment.

In March 2000, Avanade was created as a joint venture of Microsoft and Accenture to help companies build the client-server architectures that have powered IT for years. While the Seattle-based company still handles bread-and-butter infrastructure integration, today it is squarely focused on helping clients move to the cloud and engineer their digital transformation initiatives.

Jare.io, touted as a free Content Delivery Network (CDN), is essentially a wrapper over Amazon's CloudFront. By Hrishikesh Barua

Microsoft Power BI provides data insights for executives, line-of-business workers and data analysts alike. Here's an overview of what it can do for each type of user.

News: Microsoft Azure IoT Suite and Cortana Intelligence Suite will be integrated into Rolls-Royce's service solutions.

Big data has challenges but it's clear that the pros of utilizing it now far outweigh the cons. Here are just a few ways that big data already helps solve social challenges.

In the sense that all science is a liberal art, yes. This week I read a great piece by Adam Weinberg, the president of Denison University, on his school's initiatives to apply a liberal arts approach to data science. This is shocking to most, but why? Let's diagram this phrase "data science" (English skills!). Data is information; let's leave it at that for now. But what about science?

Above the Trend Line: machine learning industry rumor central, is a recurring feature of insideBIGDATA. In this column, we present a variety of short time-critical news items such as people movements, funding news, financial results, industry alignments, rumors and general scuttlebutt floating around the big data, data science and machine learning industries including behind-the-scenes anecdotes and curious buzz.

In this special guest feature, Gary Baum, Vice President of Marketing at MyScript, talks about how handwriting recognition is enhancing machine (and human) learning. As an input method, handwriting recognition teaches machines to adapt to the user, adding in another layer to their evolving skill set. Those users can program systems simply by jotting down notes and in turn, build platforms most reflective of the human experience.

Fortinet is embarking on a mission for its firewalls and other products and those of third-party vendors to work together to boost security across core networks, remote devices and the cloud.

This week marks the annual North American OpenStack Summit, this year being held back where it all began, in Austin, Texas. When OpenStack began, half a dozen or so years ago, it was focused on giving service providers and large enterprises the ability to compete with Amazon Web Services (AWS). Using the open source OpenStack operating system, organizations could build themselves an "Amazon-like" cloud of their own.

When choosing performance evaluation software, many HR organizations overlook the cultural element.

Don't start with the data; do start with a good question. Don't think one person can do it all; do build a well-rounded team. Don't only use one tool… [ED: A short but sweet advice list: Definitely click through… this is brilliant!]

A report by the Cloud Security Alliance finds that identity management tools and processes are key to ameliorating the threat of breaches. The report reveals which tools are most popular, and which are underutilized.

Naomi Robbins, author of Creating More Effective Graphs and Forbes contributor, has teamed up with daughter Dr Joyce Robbins to present a new webinar this Thursday, April 28: Creating Effective Graphs with Microsoft R Open.

The Data Lab has partnered with New York's globally renowned The Data Incubator to develop the three-week Data Boot Camp as part of a drive to plug the nation's data skills gap. Brian Hills, Head of Data at The Data Lab, said: "The skills needed to exploit the data opportunity are in great demand and short supply. With the average cost of recruiting a data scientist around £10,000, we've come up with a pioneering solution with The Data Incubator that will upskill existing talent and allow organisations to benefit quickly. Boot Camps get results.

People with big data and data science skills are some of the most sought-after professionals because demand is outstripping supply. Here are 10 books that can help you learn everything about the emerging field and the tools you will need to conquer it.

Marketing technology is undergoing a revolution, largely driven by data. Because the growth of data and its ensuing complexity are challenging today's businesses trying to make sense of it, the MarTech community is living in very exciting times. But to really understand why marketing technology is important, I think you need to have the full context. Across all industries, every business is feeling the impact of data. Because of its explosion and the relatively new accessibility to capture, process, analyze and act on it, today's cutting-edge businesses are able to do things once thought impossible.

Wi-Fi hardware vendors are coming out with new cloud-based solutions, primarily to ease the remote management of wireless networks. However, they typically only support their own hardware. Here we take a look at three cloud-based solutions that support wireless routers and access points from multiple vendors.

Our roundup of intriguing new products from companies such as Ctera and Ipswitch.

If you're buying a home or looking for an apartment, most likely Zillow.com will come to mind first, which is a branding triumph for a website that launched just 10 years ago. Today the Zillow Group is a public company with $645 million in revenue that also operates websites for mortgage and real estate professionals, and completed the acquisition of its nearest competitor, Trulia, last year. From the start, Zillow offered the "Zestimate," its value-forecasting feature for homes in locations across the United States.

The latest bi-annual survey data of OpenStack users shows a continuing march of the open source cloud software into the mainstream of enterprises, but also the project's continued challenges related to ease of deployment and management. One thing that's clear is that interest in OpenStack continues to grow rapidly.

OpenStack has cemented itself as the dominant open source IaaS platform. But at the same time, more proprietary offerings from vendors like Amazon Web Services, Microsoft Azure and VMware reign in the market. OpenStack Foundation Director Jonathan Bryce makes the case for why open source cloud should be the foundation of your data center.

Their need to increase their visibility, which was once a simple problem solved by basic marketing tools, had to transform into an invasive game of imposing their information on the market. While it may seem that there is no such thing as bad advertising, using the same marketing techniques over and over is a big turn-off for consumers. The result of intrusive publicity is a wave of antipathy from the target client, and in the best-case scenario, they will treat your content like another 'terms and conditions' page that they need to scroll through and accept.

The Apache Hive community has published its largest release and announced the availability of Apache Hive 2.0.0. It brings significant improvements in new functionality, performance, optimizations, security and usability. Let us explore the features in detail below. HBase to store Hive metadata: The current metastore implementation is slow when tables have thousands or more partitions. With Tez and Spark engines we are pushing Hive to a point where queries only take a few seconds to run.

The book Achieving Impact through Engagement by Si Alhir and Peter L. Simon explores two models of employee and customer engagement: The Ownership Pyramid (TOP) and Artful Agility or Actions-Intentions-Results (AIR). Together these models can be used to achieve impact in organizations based on increasing engagement.

Andrea Magnorsky discusses active patterns, computation expressions, parsers, using type providers and more. These language features help make code simpler and easier to maintain. By Andrea Magnorsky

A "Port to Android" pull request that has been recently merged into the official Swift repository master branch makes it possible to create simple programs for Android. The pull request added an Android target for Swift stdlib and allows developers to use a Linux environment to cross-compile for Android on the ARMv7 processor.

Monica Beckwith talks about G1 pause (young and mixed) composition, G1's remembered sets and collection set and G1's concurrent marking algorithm. She also covers performance tuning advice for "taming" mixed collections and evacuation failures. By Monica Beckwith

Mathieu Bastian explores the mechanics of testing large, complex data workflows and tries to identify the most common challenges developers face. He looks at good practices for developing unit, integration, data and performance tests for data workflows. In terms of tools, he looks at what exists today for Hadoop, Pig and Spark with code examples. By Mathieu Bastian

Thanks to today's data management platforms (DMPs), brands are becoming more sophisticated in how they collect, store and analyze data in order to improve their customer experience. From Adobe AudienceManager and Core Audience to Oracle-acquired BlueKai and X+1, a growing number of DMP software options with a range of specializations are now available. Wal-Mart was one of the first companies to experiment with data warehousing of transactional data before the aforementioned tools existed. Since then, what companies are doing with the help of well-integrated data and data management platforms has become increasingly innovative.

Last year, readers thoroughly enjoyed our run-down of the top big data books for the year. As you prepare for summer vacation, you'll probably want to pack a bag with the best big data books, so that when you return you aren't just refreshed — you're stocked up on the latest information on your career field and newly rejuvenated in your effort to leverage the latest technologies to push your business to new heights. Here are the top books to beef up on big data in 2016.

Zoomdata, developers of the high-performance visual analytics solution for big data, launched the Zoomdata Developer Network (ZDN). ZDN is designed for developers who want to embed Zoomdata's advanced data visualization capabilities into their applications to meet the need for data-driven solutions.

The Core Protocols are a set of ideas identified by Jim and Michelle McCarthy. Richard Kasperowski will open the second day of the Agile Games Conference with an explanation of how to use these protocols to help a team transform to greatness. He spoke to InfoQ about how this happens and how they relate to other team formation models. By Stephane Wojewoda

BlueData, provider of a leading infrastructure software platform for Big Data, announced the spring release for the BlueData EPIC software platform. With this new product release, available immediately, BlueData introduces several new enhancements to provide enterprise-class security and quality of service (QoS) for multi-tenant Big Data deployments.

At SAS Global Forum, the company introduced Viya, a platform that it said is designed to stitch together all types of data, wherever it resides. We've got that and other announcements from the forum in our Big Data Roundup for the week ending April 24, 2016.

Martin Thompson, co-founder of LMAX, keynoted at QCon Sao Paulo 2016, outlining the top 10 performance related mistakes that he has encountered in production.

Our friends over at Springboard just released a compelling new infographic that highlights the different roles within data science along with the different skill sets required for them.

Informatica, a leading independent software provider focused on delivering transformative innovation for the future of all things data, announced a new end-to-end solution to turn big data into trusted data assets for faster and more sustainable business value.

A few years from now, your credit cards, bus pass, train tickets and loyalty cards for high street coffee shops will be gone due to digital transformation, and you will carry only your phone. Welcome to Near…

Mark Price explores the life cycle of Java code, and how the JVM evolves the runtime representation of code during program execution. From bytecode to assembly and back again (via some uncommon traps), he covers practical tips on ensuring that Java code is running fast when it matters most. By Mark Price

Axel Fontaine looks at what Immutable Infrastructure is and how it affects scaling, logging, sessions, configuration, service discovery and more. He also looks at how containers and machine images compare and why some things people took for granted may not be necessary anymore. By Axel Fontaine

Sarah Lake Hagan talks about Nordstrom Technology's new culture code: NorDNA. Hagan starts off by explaining what NorDNA is and how they introduced this concept to their teams, and shows concrete examples of how their employees have embraced this culture to create some inspiring products. By Sarah Lake Hagan

On April 19, 2016, Amazon announced changes to its Elastic Beanstalk service. In this update, Amazon is providing customers with the ability to automatically install platform updates.

Adam Miskiewicz goes beyond the React Native docs and talks about how to build performant and production-ready React Native applications. Miskiewicz discusses using React (web) best practices in a native environment (with Redux, Relay, and GraphQL); why you shouldn't be scared of bridging native code; how to plan ahead for easy, cross-platform React Native development; and much more. By Adam Miskiewicz

Long before the Internet of Things (IoT) became trendy, analytics leader SAS was probing data from sensors and other devices. SAS® Analytics for IoT is a new package of proven software products that applies SAS' core expertise of analyzing massive amounts of data to IoT connected sensors and devices.

Ben Snively is a Solutions Architect with AWS. Jon Fritz, a Senior Product Manager for Amazon EMR, co-authored this post. With today's launch of Amazon EMR release 4.6, you can now quickly and easily provision a cluster with Apache HBase 1.2. Apache HBase is a massively scalable, distributed big data store in the Apache Hadoop ecosystem. It is an open-source, non-relational, versioned database which runs on top of the Hadoop Distributed Filesystem (HDFS), and it is built for random, strictly consistent realtime access for tables with billions of rows and millions of columns.

Join and RSVP! AWS speaker: Guy Ernest, business development manager for machine learning services in AWS. Talk: "No Dr., or How I Learned to Stop Debugging and Love the Robot". In this talk, Guy will discuss what developers must know to explore the power of machine learning services in the cloud. Using data to build machine learning models is a powerful alternative to heuristic or handwritten rules.
