Big Data News – 26 Aug 2015

Top Stories
Hortonworks' newest acquisition is a prelude to creating an open-source-based data flow management product.

Intel is leading a $20 million investment in BlueData as a way to speed deployments of big data systems among more users.

Automation will create jobs and destroy them, transforming work along the way. Unfortunately for humans, research firm Forrester anticipates more jobs being lost than being created in the next decade.

The potential for data analytics to disrupt healthcare delivery is large, and getting larger by the day. But in many cases, the need to hammer data into a structured format creates a barrier to productivity. Now a hospital chain in New York City is hoping to change that by adopting a Hadoop-based semantic data lake. Located in the Bronx, Montefiore Health System is the first hospital to implement a semantic data lake as part of the New York City Clinical Data Research Network (NYC-CDRN), an association of seven hospitals in the NYC area that are sharing data. As the pioneer, Montefiore is working with several technology providers, including Intel and Franz, to test a big data system capable of delivering precision medicine.

It would be difficult to come up with a better illustration of the profound effect data can have on people's lives than the Ashley Madison hack, which has not only sparked numerous lawsuits but also been associated with several suicides. On Tuesday, many of the world's experts in computer science and mathematics spent an afternoon at the Heidelberg Laureate Forum in Germany trying to figure out how the widespread collection of data about consumers can be prevented from causing more harm in the future. "In the U.S., there are now states where jail sentencing guidelines are being set by data," said Jeremy Gillula, a staff technologist with the Electronic Frontier Foundation. "Data has a huge impact on people's lives, and that's only going to increase."

This article describes lessons learned at XING on scaling mobile development so that as many teams as needed can contribute to the mobile apps (on both iOS and Android) while keeping the apps consistent, stable and shiny. It summarizes the key decisions and structural changes XING made to scale mobile development from 2 to 10 teams. By Alexey Krivitsky

Data security, compliance with financial regulations, and daily routines are hardly among the most exciting parts of an employee's day. Many acknowledge at least a passing understanding of their importance, but they nevertheless fall short of expectations at times. The people at the highest levels of administration know the critical need to execute these processes correctly, but it's the rank and file worker who creates the firm's greatest exposure. There are several key steps to making sure you've done everything possible to secure your assets.

This is unequivocally the era of big data, but it is not necessarily the era of big data security. The terrifyingly massive breaches of major corporate databanks, including Home Depot, Target, Neiman Marcus, and most recently Ashley Madison, reveal that most big data collectors are not doing nearly enough to protect their precious information from prying eyes. Without serious changes in security protocols across the board, from consumers to producers to vendors, big data will only increase in its appeal as a target for malicious hackers.

Optimization is a frequently encountered problem in real life.  We need to make a decision to achieve something within a set of constraints, and we need to maximize or minimize our objective…

When the NFL season kicks off next month, fans will have new information at their disposal thanks to the league's Next Gen Stats program, which will provide previously-undisclosed data to fans like players' speed and acceleration. However, that information will be reaching them in part through Amazon's cloud, despite Microsoft's technology partnership with the American professional football league. According to Matt Swensson, the NFL's senior director of emerging products and technology, the league is happy with its Microsoft partnership and uses Microsoft technology for a number of things, but decided on Amazon Web Services for this project. "We have used AWS, I would say, more out of comfort for some of the engineers that we have employed," he said in an interview.

A large Australian Government department was recently looking to set up its internal analytics capabilities. I was invited to present at the kick-off meeting, where the IT team proposed first bedding down a reporting infrastructure and only afterwards moving into complex data analysis; the phrase used was "we need to walk before we can run". Based on my experience in successfully setting up analytics in many organisations, I suggested a different approach. This approach takes into account the highly unstructured nature of analytics projects and the possibility that not all analytics ideas will be successful. It enables quick testing of analytics opportunities with the ability to move rapidly into production if needed. The approach speedily searches the data landscape to conceive, test and deploy analytics ideas, and supports the business in fact-based decision making.

Amazon has been grabbing headlines this month, from its recent earnings report, which demonstrated a major revenue lift despite an increased discounting strategy, to an exposé on the company's work culture that caused ripples across the Internet and around the water cooler. While many people may have bristled reading about working at a company that has been described as "bruising," if the company's recent revenue numbers mean anything, working with Amazon Web Services has become almost unavoidable.

Organizations don't have to be helpless in the face of emergencies and disaster events. Using the power of data-driven impact analytics, organizations can anticipate and simulate emergency scenarios to improve response decisions and reduce economic and human costs.

Hortonworks is looking to create a bigger presence for itself in the Hadoop marketplace with the acquisition of startup Onyara, which specializes in NiFi development for Apache.

The Hadoop security project called Ranger supposedly was named in tribute to Chuck Norris in his "Walker, Texas Ranger" role. The project has its roots in XA Secure, which was acquired by Hortonworks, then renamed to Argus before settling in at the Apache Software Foundation as Ranger. When Hadoop started, it was a set of loosely coupled parts primarily used in the back end of the big Internet companies like Yahoo. These parts were wrapped into distributions and marketed as Hadoop by the likes of MapR, Cloudera, and Hortonworks.

Open source Hadoop distribution specialist Hortonworks wants to close the loop on predictive analytics, allowing it to turn what it calls the "Internet of Anything" into actionable insights. To get there, it announced today that it has signed a definitive agreement to acquire Onyara, creator and key contributor to the top-level Apache NiFi open source project. NiFi was born eight years ago as Niagarafiles, a National Security Agency (NSA) project for automating data flows among multiple computer networks, even when data formats and protocols differ. The agency released NiFi to open source via the Apache Software Foundation late last year as part of the NSA Technology Transfer Program. Onyara was founded last December by the engineers who were the key contributors to Niagarafiles, and opened for business in March of 2015. NiFi became a top-level Apache project in July 2015.

Mobility, cloud and big data all promise to help enterprises increase efficiency and productivity, improve decision-making and lower costs. The laudable goal is to make your business more competitive, but for your IT, legal and compliance teams, these new technologies often lead to increased complexity, loss of control and even increased costs as massive amounts of data now move to an ever-increasing number of endpoints, including mobile devices and third-party hosting services. These challenges can be overcome with a new approach to standardizing information metadata.

Business can learn a lot from sports in terms of using analytics to optimize talent decisions in a predictive and strategic way. From Billy Beane of the Oakland A's to Daryl Morey of the Houston Rockets, there's a new wave of analytics-consuming innovators across sports who appreciate the value of using data to gain and […] The post Winning Roles: Moneyball 2.0, for your Hiring and Succession Planning Processes appeared first on Predictive Analytics Times.

Not a week goes by without a major story about vulnerable data in the hands of the wrong people. Whether it's a data breach at a government agency or global retailer exposing financial information, cloud service providers like Microsoft, Google and Amazon battling government agencies over access to data, or potential HIPAA violations resulting from compromised health data, we are constantly reminded that, unfortunately, people are after our data and they've gotten pretty good at getting it. Despite this, cloud adoption is exploding, with businesses increasingly trusting third parties with their sensitive data.

In yesteryear there were only a few unicorns in the world of startups. This week, though, the Wall Street Journal and Dow Jones VentureSource identified 115 companies with valuations north of $1 billion, which are referred to as unicorns. Below are 15 of the highest valued enterprise software companies that have received venture funding but have not yet been sold or gone public.

The Industry 4.0 and Industrial Internet of Things journey is about business model transformation. What's more, the journey is just beginning and the future is exciting.

Solutions around today's analytic ecosystem are too technically driven without focusing on business values. The buzzwords seem to overcompensate for the reality of implementation and cost of ownership. I challenge you to view your analytic architecture using pluralism and secularity. Without such a view of this world your resume will fill out nicely but your business values will suffer. In my previous role, prior to joining Teradata, I was given the task of trying to move "all" of our organization's BI data to Hadoop.

Any cloud provider that believes in data gravity is trying to make it easier to collect and store data in their facilities. To make data movement between cloud and on-premises endpoints easier, Microsoft recently announced the general availability of Azure Data Factory (ADF). By Richard Seroter

by Ari Lamstein, consultant specializing in software engineering and data analysis and author of the free email course Learn to Map Census Data in R. One of my favorite things about R is that it…

A preview release of the Apache Spark open source in-memory processing framework incorporates major performance upgrades, according to Databricks Inc., the big data processing company founded by Spark's creators. Databricks said Aug. 18 it expects to release Apache Spark 1.5 in a "few weeks," adding the preview release would allow the Spark community to conduct quality assurance testing. Databricks said Spark 1.5 would focus on "under-the-hood changes to improve Spark's performance, usability and operational stability."

There's never been a more exciting time to be a marketer than right now. On the one hand, our jobs are growing increasingly complex, thanks to the seemingly endless streams of rich, relevant customer data, the proliferation of digital channels and platforms, and the constant evolution of marketing technologies. But on the other, it's the combination of those things that makes what we do so effective and rewarding. Make no mistake about it: Today's marketers have unprecedented capabilities to deliver individualized interactions and compelling customer experiences, and to me, that's absolutely thrilling. Still, amidst all this transformation, I'm finding that many are losing sight of one of the most fundamental mandates of our industry.

Salesforce is delivering a cleaner, more elegant look for its desktop CRM applications by using its mobile platform's Lightning components.

The advantages of wind power, advancements in turbine technologies, declining renewable energy costs and the corporate sector's newfound interest in this power source will provide the impetus to push wind energy into all 50 states by 2050, when wind could power as many as 100 million homes, according to the U.S. Department of Energy.

Brick-and-mortar shops, like their online counterparts, now have access to a wealth of data on consumer tastes and shopping habits inside and outside the walls of their stores. Through a holistic analysis of these data points, retail managers can optimize operations to prosper with a leaner staff that better predicts and reacts to the flow of foot traffic.

The future of retail marketing is brick-and-mortar stores, but not as they exist today. Instead, think about stores that bring the e-commerce experience into the physical realm and obliterate the barriers between online and in-store shopping. Think about "2025: A Retail Space Odyssey."

Although we are now in the 21st century, you would not know it by the look of most offices. Worse still, the idea of a smart office is often ridiculed as utopian. But, in fact, everything necessary for a smart office is available. You can buy everything you need. You can buy it online or offline. You can buy the best hardware and the latest software without a problem. You can decorate with enthusiasm from a wide variety of furniture or art.

All businesses that own/control websites, blogs and web and/or mobile applications know the importance of having web analytics tracking code. Nonetheless, the implementation of said systems is more complex than most imagine it to be, leading to major and minor flaws in the instrumentation of the sites and applications as a result.

Most people (including myself) are drawn to Julia by its lofty goals. Speed of C, statistical packages of R, and ease of Python? It sounds too good to be true. However, I haven't seen anyone who has looked into it say the developers behind the language aren't on track to accomplish these goals. Having only been around since 2012, Julia's greatest disadvantage is a lack of community support.
