Big Data News – 08 Jul 2016

Today's Infographic Link: Choose Your Weapon: The Global Arm Trade

Featured Article
Take one major trend spanning the business and technology worlds, add countless vendors and consultants hoping to cash in, and what do you get? A whole lot of buzzwords with unclear definitions. In the world of big data, the surrounding hype has spawned a brand-new lingo. Need a little clarity? Read on for a glossary of sorts highlighting some of the main data types you should understand. 1. Fast data The shining star in this constellation of terms is "fast data," which is popping up with increasing frequency. It refers to "data whose utility is going to decline over time," said Tony Baer, a principal analyst at Ovum who says he coined the term back in 2012.

Top Stories
At first glance, enterprise resource planning (ERP) software and big data analytics don't appear to have much in common. But it turns out that big old ERP systems have a thing or two to learn from their newer, nimbler brethren. One of the ERP companies exploring how to incorporate big data analytics into ERP software is Infor, which is the world's third or fourth largest ERP vendor, behind the likes of Oracle, SAP, and (sometimes) Sage. Two years ago, Infor co-president Duncan Angove set up the Dynamic Science Labs (DSL) with the goal of using data science techniques to solve a certain class of business problems for its customers.

I'm joined by Chris Stucchio this week to discuss how deliberate or uninformed statistical practitioners can derive spurious and arbitrary results via multiple comparisons. We discuss p-hacking and a variety of other important lessons and tips for proper analysis. You can enjoy Chris's writing on his blog at chrisstucchio.com and you may also like his recent talk Multiple Comparisons: Make Your Boss Happy with False Positives, Guaranteed.
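
To make the multiple-comparisons problem concrete, here is a minimal sketch (not taken from the episode) of how the chance of at least one false positive grows with the number of tests. It assumes the tests are independent and that each uses the same significance level; the function names are illustrative only.

```python
def familywise_error_rate(alpha: float, num_tests: int) -> float:
    """Chance of at least one false positive across independent tests.

    Assumes every null hypothesis is true and the tests are independent.
    """
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(alpha: float, num_tests: int) -> float:
    """Per-test significance level under the Bonferroni correction."""
    return alpha / num_tests


if __name__ == "__main__":
    alpha = 0.05
    for num_tests in (1, 5, 20, 100):
        rate = familywise_error_rate(alpha, num_tests)
        corrected = bonferroni_threshold(alpha, num_tests)
        print(f"{num_tests:>3} tests: P(any false positive) = {rate:.3f}, "
              f"Bonferroni per-test level = {corrected:.5f}")
```

With 20 independent tests at the 0.05 level, the chance of at least one spurious "significant" result is roughly 64 percent, which is why uncorrected multiple comparisons so reliably produce false positives.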

In case you missed them, here are some articles from June of particular interest to R users. A preview of the tutorials presented at the useR! 2016 conference. An "advanced beginner's" guide to R…

As organizations shift towards IT-as-a-service models, the need for managing and protecting data residing across physical, virtual, and now cloud environments grows with it. Commvault can ensure protection, access and E-Discovery of your data, whether in a private cloud, a service provider-delivered public cloud, or a hybrid cloud environment, across the heterogeneous enterprise. In his general session at 18th Cloud Expo, Randy De Meno, Chief Technologist – Windows Products and Microsoft Partnerships at Commvault, discussed how to cut costs, scale easily, and unleash insight with Commvault software, the only singular data and information management solution for cloud data protection and beyond.

Wherever you turn, businesses are putting analytics into action. Retailer American Eagle Outfitters, for example, uses an algorithm to figure out how best to fulfill online orders with products shipped from physical stores. Insurance company Allstate calculates premiums using an algorithm that weights different risk factors. Even beverage maker Minute Maid is applying algorithms to its orange juice, taking into account not just consumer preferences but its supply chain.

Cloud computing continues to grab headlines as one of the biggest technology trends gaining momentum. Businesses and individuals have come to realize the benefits of the cloud, including convenience and cost savings, so they increasingly choose cloud services to stay competitive.

The jury is still out on whether mobile device screen real estate is large enough for comprehensive BI dashboards. But other devices like tablets are undoubtedly rising to meet the challenge. Let's examine some key factors driving mobile adoption in the analytics marketplace.

In 2015, identity thieves harmed eighteen million people in the USA alone, with the figure varying by type of identity theft. That's an approximately twofold increase compared to 2014.

Nucleus Research conducted in-depth interviews with some 40 SAP customers to prepare a report which found that six out of 10 would not buy SAP again. Here's a closer look at what it means.

In this special technology white paper, Transforming Business Decision Making with a Cloud-Native Data Warehouse, you'll learn about how organizations are increasingly dependent on different types of data to make successful business decisions.

SAP SE (NYSE: SAP) unveiled SAP® Geographical Enablement Framework, powered by SAP HANA®, which helps organizations enrich business applications with geographic data from geographic information systems (GIS), such as Esri ArcGIS.

European Union officials are set to give final approval to a new EU-U.S. data transfer agreement early next week, after member states gave their approval to an updated text on Friday. Privacy Shield is intended to replace the Safe Harbor Agreement as a means to legalize the transfer of EU citizens' personal information to the U.S. while still respecting EU privacy laws. A new deal is needed because the Court of Justice of the EU invalidated the Safe Harbor Agreement last October, concerned that it provided Europeans with insufficient protection from state surveillance when companies exported their personal data to the U.S. for processing. The first draft of the Privacy Shield agreement, presented by the European Commission in January, lacked key assurances from U.S. officials on the same matters that had concerned the CJEU about Safe Harbor.

Google has acquired Anvato, the developer of a video software platform, to shore up its cloud offering in the area of video delivery. Anvato's Media Content Platform, used by many large media and entertainment companies including NBCUniversal, Univision and Fox Sports, will complement the efforts of Google Cloud Platform to offer scalable media processing and workflows in the cloud, Belwadi Srikanth, senior product manager for Google Cloud Platform, wrote in a blog post. The Mountain View, California, company offers software that automates the encoding, editing, publishing and secure distribution of video content across a variety of platforms.

Britain's vote to exit the European Union will depress global IT spending this year, as companies cut back spending over uncertainty about what the future holds, market research firm Gartner said. "We're looking at a 2-5 percent reduction in IT spending in the UK," compared to Gartner's previous forecast of a 1.7 percent decline, analyst John-David Lovelock said in an interview. "That's going to be enough to tip the worldwide IT spending negative this year," he said. The "Brexit" vote has led to massive uncertainty about what the future holds economically for Britain and the entire European Union, which means that companies have scaled back on their strategic plans in the fifth-largest IT market in the world.

A journey into predictive analytics and how data will drive up airline ticket prices. As airlines and frequent flyer programs gather more intelligence on your day-to-day lifestyle, flying and financial position, they begin to build a data profile on your interests, goals, psychometric assessment, your motivations to engage with a brand at any given point throughout the day, what has driven you to purchase in the past and, most importantly, where your thresholds are. To illustrate how data is playing a growing role in today's flight booking engines, I've broken down, play by play, how each individual piece of data collected about you can be used, analysed and overlaid with other data sets to paint a picture of who you are and what motivates and drives you to purchase a specific product.

What does a data scientist really do?

When digital laggards finally recognize the degree of change digital technologies will force upon their businesses, and desperately try to outrun the inescapable Darwinian effect of their slow start, they will be faced with not one, but three ages of digital transformation to navigate and survive. Understanding these ages, and what is unique about each one, is critical for business strategy, prioritizing, planning, sequencing and budgeting.

Most organizations prioritize data security only after their data has already been compromised. Proactive prevention is important, but how can you accomplish that on a small budget? Learn how the cloud, combined with a defense-in-depth approach, creates efficiencies by transferring and assigning risk. Security requires a multi-defense approach, and an in-house team may only be able to cherry-pick from the essential components. In his session at 19th Cloud Expo, Vlad Friedman, CEO/Founder of Edge Hosting, will discuss what questions to ask and the technologies to look for from your cloud service provider to ensure your applications stay online and secure.

With 15% of enterprises adopting a hybrid IT strategy, you need to set a plan to integrate hybrid cloud throughout your infrastructure. In his session at 18th Cloud Expo, Steven Dreher, Director of Solutions Architecture at Green House Data, discussed how to plan for shifting resource requirements, overcome challenges, and implement hybrid IT alongside your existing data center assets. Highlights included anticipating workload, cost and resource calculations, integrating services on both sides of the firewall, self-service, monitoring, and workload prioritization.

This report from Gartner offers a view into the next generation of application development technologies, showing how agile practices, RAD and state-of-the-art user experiences are high-profile challenges for today's application development leaders. It showcases a number of vendors in these new spaces, including test company Rainforest.

"We provide DevOps solutions. We also partner with some key players in the DevOps space and we use the technology that we partner with to engineer custom solutions for different organizations," stated Himanshu Chhetri, CTO of Addteq, in this SYS-CON.tv interview at DevOps at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York City, NY.

Monday newsletter published by Data Science Central.

Ask someone to architect an Internet of Things (IoT) solution and you are guaranteed to see a reference to the cloud. This would lead you to believe that IoT requires the cloud to exist. However, there are many IoT use cases where the cloud is not feasible or desirable. In his session at @ThingsExpo, Dave McCarthy, Director of Products at Bsquare Corporation, will discuss the strategies that exist to extend intelligence directly to IoT devices and sensors, freeing them from the constraints of the cloud. This enables companies to realize their business objectives by creating an architecture that is distributed, fast and cost-effective.

Cloudera Director is a manifestation of Cloudera's commitment to provide a simple and reliable way to deploy, scale, and manage Apache Hadoop in the cloud of your choice. Cloudera Director enables you to deploy production-ready clusters for big data applications and successfully run workloads in the cloud. With Cloudera Director 2.1, we are extending our capabilities in supporting transient workloads and the ability to deploy into multiple cloud environments. The key themes of Cloudera Director 2.1 are: usage-based billing for Cloudera services; the ability to deploy Cloudera Manager and CDH clusters on Microsoft Azure; and the ability to deploy clusters across cloud providers or regions. The post What's New in Cloudera Director 2.1? appeared first on Cloudera Engineering Blog.

There has been so much emphasis on how big data is being used in industries like banking, security, and especially marketing. Less is said about how it is being used in the manufacturing industry. Yet big data's impact here can be seen from procurement of raw materials to final delivery of the finished product.

Whether your IoT service is connecting cars, homes, appliances, wearables, cameras or other devices, one question hangs in the balance: how do you actually make money from this service? Turning your IoT service into profit requires a monetization strategy that is flexible, scalable and working for you in real time. It must be a transparent, smoothly implemented strategy that all stakeholders, from customers to the board, will be able to understand.

In less than three minutes, this demo offers another way of understanding your customers and managing their portfolios. When clients want to receive personalized advice and tailored portfolio recommendations, how can banks use data to not only enhance the customer experience but also boost customer profitability? The IBM Customer Insight for Banking solution provides client segmentation based on behavioral profiles while also using unstructured combined with structured data to predict life events, helping banks deliver customized advice. Learn more about IBM Customer Insight for Banking: https://www.ibm.com/marketplace/cloud/banking-customer-insights/us/en-us

Hadoop and its related projects have given the database industry a wave of innovation unlike anything we've seen since Relational Database Management Systems (RDBMS) emerged in the late 1970s and early 1980s.  I expect RDBMS will continue to dominate the world of transactional systems, where the relational model / Third Normal Form schemas are perfectly suited for the nature of transaction processing.  Analytical systems, on the other hand, can clearly benefit from Hadoop with its ability to scale endlessly, which accommodates the ever-expanding volume of data to be analyzed, and its schema-on-read paradigm, which accommodates the unstructured nature of sentiment data along with the inevitable changes to what data is collected and how it's analyzed. 
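
The schema-on-read idea mentioned above can be sketched in a few lines of plain Python. This is an illustration only, not Hadoop's API: the sample records, field names, and the `read_with_schema` helper are all hypothetical. Raw records are stored untouched, and each reader applies its own schema when the data is read, so new or missing fields do not break earlier data.

```python
import json

# Raw records are stored exactly as they arrived; no schema is enforced on write.
RAW_EVENTS = [
    '{"user": "a", "text": "love it", "stars": 5}',
    '{"user": "b", "text": "meh"}',
    '{"user": "c", "text": "broken", "stars": "1", "device": "phone"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time.

    `schema` maps a field name to a (caster, default) pair. Fields missing
    from a record get the default; fields not in the schema are ignored.
    """
    for line in raw_lines:
        record = json.loads(line)
        yield {
            field: caster(record[field]) if field in record else default
            for field, (caster, default) in schema.items()
        }


# Two different analyses read the same raw data through different schemas.
ratings_schema = {"user": (str, None), "stars": (int, None)}
device_schema = {"user": (str, None), "device": (str, "unknown")}

ratings = list(read_with_schema(RAW_EVENTS, ratings_schema))
devices = list(read_with_schema(RAW_EVENTS, device_schema))

if __name__ == "__main__":
    print(ratings)
    print(devices)
```

The design point is that changing what is analyzed only means writing a new schema, not migrating the stored data.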

One quarter of U.S. smartphone owners use payment apps at least once monthly and more than 3 million retailers use Apple Pay and Android Pay.

Increasingly, IT administrators are integrating data center-grade storage systems with Hadoop — ones that come with the required data protection, security and governance built-in.

Open Source Storage unveils a 3U storage appliance capable of providing access to up to 16TB of storage using open source software such as the Zettabyte File System.

Don't miss your chance to get acquainted with the dark web in this live panel discussion hosted by IBM's Bob Stasio. When you do, you'll hear security experts' latest thoughts about everything from private-sector entities' place in the dark web to shutting down the dark web entirely.

Companies are quickly ramping up their investments in fast data analytics and real-time stream processing frameworks and lowering spending on batch technologies in an attempt to get on top of growing data volumes and velocities, a new survey says. According to OpsClarity's 2016 State of Fast Data & Streaming Applications Survey, 89 percent of more than 4,000 survey respondents say they're currently using batch analytics, compared to 65 percent who say they're using "near real-time pipelines." Looking forward, 92 percent of survey respondents indicate they plan on increasing their investments in streaming data applications over the next year, while nearly 79 percent say they plan on reducing or eliminating investments in batch processing.

Deep data science is a branch of data science that has little if any overlap with closely related fields such as machine learning, computer science, operations research, mathematics, or statistics. Even classical machine learning and statistical techniques such as clustering, density estimation,  or tests of hypotheses, have model-free, data-driven, robust versions designed for automated processing (as in machine-to-machine communications), and thus these techniques also belong to deep data science.

Until now, enterprise developers have faced challenges in achieving scale and maintaining high performance with graph databases. Enterprise development teams could not guarantee the availability of graph data stores as a reliable cloud service, particularly as an always-on distributed property graph. Available now in beta, IBM Graph was designed to help overcome these challenges and bring advanced graph database capabilities to the enterprise.

Learn how Velan Inc. achieved significant operational efficiencies and measurable cost savings from finance transformation by leveraging IBM Cognos Controller and IBM Cognos TM1.

Organizational variable compensation teams are looking for solutions to efficiently design and manage complex compensation programs for sales commissions, bonuses, managed business objectives and noncash rewards. Take a look at an IBM-commissioned Forrester Consulting Total Economic Impact study that examines the value organizations may realize by deploying the IBM Incentive Compensation Management solution.

Gartner's updated IT spending forecast for 2016 is slightly improved, but the research firm warned that it doesn't account for the recent UK vote to leave the European Union. The Brexit vote will dampen IT spending worldwide.

One could be forgiven for experiencing a bit of hopeful skepticism in response to the Obama Administration's statement in May regarding re-energizing the "War Against Cancer." The war against cancer is a many-decades-old effort with mixed results: great progress in many areas but matched with disappointment in others. Winning the war still seems rather far off. The new effort, to be led by Vice President Biden, is part of Obama's Precision Medicine Initiative. Precision Medicine is a powerful idea, mostly marshaling insights from varying genomics technologies and research (writ large) to "personalize" and improve the efficacy of therapy for very many illnesses.

Glassdoor has identified the top 25 cities for software engineer salaries by calculating the real adjusted salary for each city.

The innovation manager at Transport for London laid out his approach to emerging technologies yesterday, from IoT to VR, and what he plans to do with the increasing amount of data the organization is collecting to make commuters' lives better.

Redis is not only the fastest database, but it is the most popular among the new wave of databases running in containers. Redis speeds up just about every data interaction between your users or operational systems. In his session at 19th Cloud Expo, Dave Nielsen, Developer Advocate, Redis Labs, will share the functions and data structures used to solve everyday use cases that are driving Redis' popularity.

Summer is here, but fall is right around the corner. Plan ahead now and save on these Predictive Analytics World conferences this October. Early bird savings are still available, and DSC readers save an additional $150 on all PAW conference passes with code DSCPAW150. Predictive Analytics World for Government, October 17-20, Washington, D.C.: See how government agencies are applying predictive analytics to improve policing, manage risk, detect fraud, and optimize resource distribution.

Among many decisions you'll have to make when building a predictive model is whether your business problem is a classification or approximation task. It's important as…

Power BI Embedded and Publish to Web moving to GA, Cortana Intelligence with Bing Predicts entering preview, and Microsoft Data Science Summit — co-located in Atlanta with Ignite — announced.

Do you know what p-hacking is? John Oliver – of HBO's "Last Week Tonight" and formerly of "The Daily Show with Jon Stewart" – does. It's a tricky, potent analytical pitfall that's gaining increased, deserved attention – across fields of science and even within Predictive Analytics Times articles and Predictive Analytics World sessions. "An orange… The post HBO Teaches You How to Avoid Bad Science appeared first on Predictive Analytics Times.

I mentioned a few of the useR tutorials that I had the opportunity to attend. Here are the links to the slides and code for all but one of the tutorials: Regression…

I sat in the waiting chair for 40 minutes while he finished with another client. "Hey Dave, how've ya been?" "Nothing to complain about." I sat in the barber's chair and without hesitation he said…

IBM (NYSE: IBM) announced a new version of its industry-leading database software, IBM DB2 V11.1, to help developers easily bridge on-premises applications to the cloud and enable a hybrid data architecture.

IT organizations will be looking for a centralized console to manage public and private cloud computing deployments.

Data scientists are in high demand, but in such scarce supply that some companies outsource their data for analysis. DataScience Inc. CEO Ian Swanson explains how it works.

Testing is an often overlooked yet critical component of any software system. In some ways this is more true of models than traditional software. The reason is that computational systems must function correctly at both the system level and the model level. This article provides some guidelines and tips to increase the certainty around the correctness of your models. Guiding Principles One of my mantras is that a good tool extends our ability and never gets in our way. I avoid many libraries and applications because the tool gets in my way more than it helps me. I look at tests the same way. If it takes too long to test, then the relative utility of the test is lower than the utility of my time. When that happens I stop writing tests.

More than one-third of collected data is considered useless, and if the trend of indiscriminately storing data continues, it will cost businesses $3.3 trillion by 2020.

Storing and managing such a humongous volume of data is no easy task and how Google handles this data is a lesson for anybody who deals with cloud and big data.

As both digital and more traditional companies become more and more dependent on data to compete in today's information economy, data is starting to have an irrefutable impact on companies' valuation and reputation. The decisions companies make about how to use data can have an enormous impact on the success of modern enterprises, as well as on their image, their public perception, their competitors, and regulators. According to recent research, companies must recognize this new reality in which corporate reputations may be negatively impacted by decisions they make concerning data within their control. As companies incur significant costs to capitalize on the enormous amounts of data (so-called Big Data) constantly generated by the Internet of Things (IoT), social media platforms, websites, and other sources, they must appreciate that their use, misuse and governance of data can have a direct impact on their goodwill and ultimate valuation.

It's midyear already. Halftime. Two quarters down and two to go for 2016. For many financial institutions, it's also the start of the 2017 strategic planning season. That time when boards and senior leadership teams sit down to hash out next year's budget; working together to set aside silo politics and internecine battles to focus on what's best for the customers, and the overall enterprise as a result. Haha– just kidding!
