Big Data News – 6 Oct 2015

Top Stories
IBM lead researcher Shu-Jen Han explains the significance of finding an effective connector for 7-nanometer nanotube transistors.

A Gartner analyst uses the vision of a workplace without associates as a great jumping-off point to talk about how to integrate technology into the brick-and-mortar store of tomorrow.

Gartner's CIO Symposium Welcome Keynote advised CIOs that we're entering the age of the algorithm.

Clusterpoint, a NoSQL database vendor providing Database-as-a-Service (DBaaS), announced the availability of Clusterpoint 4, an evolution of its database software into a computing engine. The platform combines an instantly scalable document-oriented database and computing infrastructure so businesses can process and analyze large amounts of data in real time at a reasonable cost.

With these words, the European Court of Justice (ECJ) has just ruled the "Safe Harbor" agreement between the US and EU…

One of the NoSQL world's more venerable products delivers its 4.0 release to general availability today. The new version provides secondary indexes, geo-spatial indexes, workload isolation, and a new storage engine.

SharePoint is often the main storage area for documents and forms, but it doesn't contain data from various business units where information is likely siloed within other systems. The modern company needs applications that can pull data from many different sources and then integrate it into a more usable form within SharePoint.

Traditional BI is far too slow to be of use to machines. Matt Asay explains.

At its .conf2015 users conference in Las Vegas yesterday, operational intelligence specialist Splunk took the wraps off a new version of its Splunk Enterprise platform and a new premium offering, Splunk IT Service Intelligence. Splunk Enterprise 6.3 — designed for on-premises, cloud or hybrid deployment — is focused on enhancements to performance and total cost of ownership as well as high-volume event collection for DevOps and Internet of Things (IoT) devices. In many cases, says Clint Sharp, Splunk director of product management, Big Data & Operational Intelligence, the hardware cost of a Splunk Enterprise 6.3 deployment can be cut in half compared with Splunk Enterprise 6.0.

Two of the hottest segments in IT are big data and analytics. This video shows five companies that presented products in this space at DEMO Traction, 2015.

IBM will acquire object-based storage vendor Cleversafe in a move to bolster its cloud business unit with more flexibility and simplified management options in the hybrid cloud. Founded in 2004, Chicago-based Cleversafe offers large-scale content repository, backup, archive, collaboration and storage as a service. The acquisition aims in particular to help companies tackle growing volumes of unstructured data such as audio, video and images, IBM said Monday. Object storage such as what Cleversafe provides can offer more efficient storage of massive amounts of data while also meeting the demands of data-intensive workloads delivered via the cloud.

Three companies focused on enterprise products present at DEMO Traction, September 2015.

Today a study will come out saying that Spark is eating Hadoop, really! That's like saying SQL is eating RDBMSes or HEMIs are eating trucks. Spark is one more execution engine on an overall platform built of various tools and parts. So, dear pedants, if it makes you feel better, when I say "Hadoop," read "Hadoop and Spark" (and Storm and Tez and Flink and Drill and Avro and Apex and …). The major Hadoop vendors say Hadoop is not an enterprise data warehouse (EDW) solution, nor does it replace EDW solutions. That's because Hadoop providers want to co-sell with Teradata and IBM Netezza, despite hawking products that are increasingly eating into the market established by the big incumbents.

Financial institutions in Europe may face tougher rules governing their use of big data thanks to a new investigation by financial regulators. Focusing on the "opportunities and challenges" associated with big data, the new investigation aims to determine whether new regulatory or supervisory measures are needed, according to a joint statement published Monday by the European Securities and Markets Authority, the European Banking Authority and the European Insurance and Occupational Pensions Authority.

Three companies leverage big data, predictive analytics and social signals to help recruiting organizations find, qualify and engage with in-demand talent.

The fundamental data technologies driving businesses have changed over the past decade. What capabilities will they open up over the next decade? We find out.

Added by Divya Parmar on September 28, 2015

At SAP TechEd Barcelona, November 10-12, there are two special pre-conference events that you will want to add to your calendar.
1. #DataGenius Jurgen Faisst Half-Day Workshop: November 9, 9:00 to 13:00
2. #DataGenius Viz a Thon with Sustainable Amazonia Foundation: November 9, 13:00 to 17:00
1: #DataGenius Jurgen Faisst Half-Day Workshop Do…

I keep what could only be described as a very busy work schedule. In between my blogging commitments, customer meetings, admin duties and designing new architectures, I try to find a few hours each week to continue my research towards a PhD in analytics. Needless to say, my desk usually has a few empty coffee cups, half-empty cans of Red Bull and a monitor stand made out of empty pizza boxes… Maybe there could still be a few slices left in them? So a few weeks ago management here at Teradata asked me if I wanted to attend our Teradata PARTNERS event from 18-22 October in Anaheim, CA. PARTNERS is where a lot of the marketing hype is cut out and replaced by presentations delivered by our customers to our customers.

If you've ever used Uber, you're aware of how ridiculously simple the process is. You press a button, a car shows up, you go for a ride, and you press another button to pay the driver. But there's a lot more going on behind the scenes, and much of that infrastructure increasingly runs on Hadoop and Spark, as the Uber data team recently shared. Uber has the enviable position of sitting at the junction of the digital and physical worlds. It commands an army of more than 100,000 drivers who are tasked with moving people and their stuff within a city or a town. That's a relatively simple problem.

For more than six years, the New York Times has been using the R language to develop and implement much of the fantastic data journalism on the website and in the newspaper. A few months ago graphics editor Amanda Cox was interviewed for the Data Stories podcast, where she described the process for creating the interactive data visualizations at the Times. Some highlights of the podcast include Amanda describing R as "The greatest software on Earth", and the background behind the visualization below which tells a unique story based on where you live.

Author and consultant Dale Neef talks about how the shortage of big data skills is keeping small and mid-sized businesses from cashing in on analytics.

Second-guessing over the relative accuracy of the U.S. weather forecasting model compared to its European counterparts resumed last week as early predictions that Hurricane Joaquin would strike the U.S. east coast proved inaccurate. With the exception of the sodden residents of some southeastern coastal states, others were mostly spared after the hurricane drifted out into the Atlantic Ocean after pounding the Bahamas. Early models from the U.S. National Weather Service had the Category 4 storm hitting the east coast anywhere from North Carolina to New England. Late last week, the U.S. model had the Mid-Atlantic region in the storm's crosshairs. The storm's behavior hewed closer to models and forecasts from the European Center for Medium-Range Weather Forecasts, which unlike the American model consistently forecast Joaquin would turn away from the east coast and drift out to sea.

Joseph George from HP presented this talk at the recent HPC User Forum. "This paper describes the HP Big Data Reference Architecture (BDRA) solution and outlines how a modern architectural approach to Hadoop provides the basis for consolidating multiple big data projects while, at the same time, enhancing price/performance, density, and agility. HP BDRA is a modern, flexible architecture for the deployment of big data solutions; it is designed to improve access to big data, rapidly deploy big data solutions, and provide the flexibility needed to optimize the infrastructure in response to the ever-changing requirements in a Hadoop ecosystem."

No matter how big the data actually gets, the consumption of that data, much like politics, is always going to be local.

Don't find yourself falling prey to pitfalls when you implement a predictive analytics solution for your organization. Discover what mistakes to avoid when planning your implementation and setting strategy.

Many experts and economists argue that governments employing data and analytics wield a powerful advantage for improving their nations and cities.

OnCorps today unveiled its Adaptive Decision Analytics platform that intelligently engages users and nudges them to make better decisions.

Cyber attacks are becoming more common not only in the news, but in our personal lives as well. But are there ways to identify the perpetrators or deter these attacks before they occur? Fortunately, the emergence of powerful new analytics-based technologies can help cyber defenders expose those who intend online disruption.

From the West Coast, at the GE Minds + Machine Conference in San Francisco, comes news that Hortonworks and Neustar have partnered on registering, securing and managing Internet of Things devices, and analyzing IoT data. Specifically, the two companies said their collaboration "relates to device registry, security, policy management and enforcement." They said some of the results will be "shared with the open-source Apache community in order to accelerate mass adoption of IoT devices and technology."

Data-driven marketing is on its way to becoming the standard in all industries. A 2015 study by the Global Direct Marketing Association and Winterberry Group found that 92% of surveyed marketing professionals believe that data will be an important factor in the future of marketing.

Ovum principal analyst Tony Baer provides a summary and analysis of what he learned at Strata + Hadoop World in New York last week.

The term IoT (Internet of Things) has become a popular topic in the data analytics world. Nowadays almost everything can be equipped with a sensor, a processor, and a network connection, collecting and broadcasting ever more data into the world. While much of the discussion about IoT centers around retail-based smart systems (e.g., thermostats, alarms, household appliances, cars), one of the areas where IoT may have the biggest impact is manufacturing. A recent Intel Blueprint entitled "Internet of Things (IoT) Delivers Business Value to Manufacturing" demonstrates how Big Data analytics can be used to optimize manufacturing processes, resulting in improved quality, increased throughput, better insights into shop floor issues, and reduced downtime. The advantage of big data analytics over more conventional relational database (RDBMS) approaches lies in the ability to manage and analyze unrelated data sources from disparate locations. For example, many different types of data are available from a typical production process.

Amazon's DynamoDB is outpacing every other major database in the popularity rankings. Will it win your cash next?

So you have gathered your data and completed your exploration and cleansing. You labored countless hours transforming the data and created a strong model that can revolutionize the way your company sees its clients, makes decisions and competes in your market. Yet, for all of your efforts, the model roll out is stalled. The numbers…

One of the hardest things for any industry to accept is that it can and will be disrupted. The tendency, especially in industries that have existed in much the same way for decades, is to think that processes have been perfected over time and that business will continue as usual. Even when new technologies arise loaded with data collection capabilities and analytics, many see them only as a means to add efficiencies rather than as a mode of change. And so it was that the message of impending disruption and how to adapt came as a surprise to some at the NY Academy of Sciences Mobile Health conference, and as welcome information to others.

As the global business environment changes thanks to heightened connectivity and interoperability, the product development process can no longer remain the same. Rather, companies must adopt engineering practices designed to fit a connected world by applying analytics to operational data.

Mendix CTO Johan den Haan says that with the release of Mendix 6, both citizen and professional developers can easily construct mobile applications.

At Strata, Objectivity announced it will enable users to push the fusion workflow from its recently announced Information Fusion platform, called ThingSpan, into Intel's Trusted Analytics Platform. TAP is Intel's open-source project designed to help data scientists deploy big data analytics more easily and quickly.

Although cramming — a company fraudulently charging someone's phone bill — technically is the fault of a third party, consumers assume that the charge is due to their carriers' incorrect billing management. Therefore, they typically direct their frustration and mistrust toward their telecommunications company.

The latest release of SnapLogic's Elastic Integration Platform as a Service, unveiled at Strata, includes new capabilities for big data integration including Spark data processing, a new Snap for Cassandra and support for Microsoft Cortana Analytics.

In an earlier installment of The QbD Column titled A QbD factorial experiment, we described a case study where the focus was on modeling the effect of three process factors on one response, viscosity. Here, we expand on that case study to show how to optimize process parameters of a product by considering eight responses and considering both their target values and the variability in achieving these targets.

Audiences can be exquisitely fickle. What they respond to at any given moment is, if not entirely unpredictable, highly resistant to tidy formulas. Predictive audience analysis can be vital for personalizing, targeting and engaging customers. Discover how data-scientific tools for performing predictive audience analysis can be applied in many sectors to personalize and contextualize audience engagement strategies in real time, helping to drive the insight economy.

Leonard Sacks, associate director of clinical methodology at the U.S. Food and Drug Administration, gave a very interesting presentation at the N.Y. Academy of Sciences' "Mobile Health: The Power of Wearables, Sensors and Apps to Transform Clinical Trials" event. The big (and most welcome) surprise to me was his assertion that "there are no regulatory restrictions to using these technologies." Here's what else he had to say regarding the collection of patient data through new devices and apps:

Up to a matter of months ago, the sole aim of a retailer's mobile strategy was to drive traffic to their website. Now look where we are — real, live, mCommerce — with brands taking full advantage of the commercial opportunities that data analytics and mobile apps can offer. Things like personalised in-store offers (based on a customer's mobile profile and delivered, personally, at the transaction stage), branded wallets (that allow customers to manage their store accounts and check loyalty points), plus a-hundred-and-one other data-driven, in-store, mobile, and social innovations.

Bernard Munos gave a very insightful presentation on creative disruption in clinical trial research and the societal impact of mobile biosensor technologies on human health at the NY Academy of Sciences Mobile Health conference. The highlights of his speech are below.

Published Date: 2015-10-05 13:01:33 UTC
Tags: Analytics, Big Data, Chief Analytics Officer, Data Science, Predictive Analytics, Sports Analytics
Title: Associations and Correlations – The Essential Elements
Subtitle: A Holistic Strategy for Discovering the Story of Your Data

Tinder uses the behavioral analytics solution Interana to understand and explore how its tens of millions of users interact with the app. Generating massive volumes of event data from over 1.8 billion "swipes" per day and 26 million matches per day, Tinder is one of the most used social applications in the world. By gaining deep insights from its event data with Interana, Tinder aims to provide the best social discovery application to its users.

In this special guest feature, Luca Scagliarini, CEO of Expert System USA, discusses the importance of using semantic technology for the effective management of unstructured information.

A common saying in sports is that once an athlete is in shape, winning is 90% mental and 10% ability. It takes mental toughness and determination to keep playing hard, especially if your team falls behind and your game strategy isn't working. Technical problem solving can also require a "mind over matter" approach. As Teradata 14 Certified Master Carrie Ballinger says, staying positive and maintaining a willingness to keep trying after a failure can help you persevere against challenges and get the answers you need. Ballinger practices meditation to quiet her mind, stay focused and be more accepting when she doesn't get the results she wants and needs to start over. Read her latest Tech Support column to see some of the technical solutions she has offered up for individuals who asked for her help in finding answers. Brett Martin, Editor-in-Chief, Teradata Magazine

As data volumes grow, Big Data analysis continues to trend as a market topic and analytics continues to be a top priority for organizations around the globe. By providing students access to analytics solutions, SAP strives to support the growth of new ideas and new ways businesses can turn Big Data into big value and…

InfoQ speaks with Brett Slatkin, senior staff software engineer at Google and author of Effective Python.

While many HR software vendors talk the talk of predicting "at risk employees," how many can prove they walk the walk, and that their predictions actually work? How can you ensure a vendor's claim to predict employee retention risks is valid? What should you look for?

Don't plan to fish in your personal data lake. Perhaps the biggest mess in all of IT is the management of individual consumers' data. Our electronic data is thoroughly scattered. Most…

Machine learning is gaining momentum thanks to bigger, more complex data sets. How does it work? Kimberly Nevala from SAS Best Practices explains what it is by focusing on what it isn't.

Tin Kadoic expands the definition of responsive beyond RWD and encourages the audience to rethink how to design and develop responsive software. Tin shows examples of how intelligent design can improve industry-leading applications and how you can incorporate a Smart First approach into your UX and UI processes.

Evan Krall talks about PaaSTA, Yelp's platform for running services, built on Docker, Mesos, Marathon, SmartStack, Git, and Jenkins.

Todd Montgomery challenges some of the common myths and misconceptions about high performance streaming data, and takes a look at what is really possible today.

Razorsight, a provider of cloud-based predictive analytics solutions, is using the MapR Distribution including Hadoop, along with Apache Spark, to take advantage of big data storage and compute. Razorsight evolved its technology stack with Hadoop from MapR to scale cost effectively and to generate valuable insights using data science and predictive analytics for communications service providers.

Natalia Chechina outlines features of the actor and functional programming models, and the reasons these models attract so much interest in the world of parallel, concurrent, and scalable systems. She talks about fault tolerance, its importance in large-scale systems, and the approaches to implementing it.

The 2015 Standish Group Chaos Report has been released, showing some improvement and plenty of room for more in the software development industry. Jennifer Lynch spoke to InfoQ about the findings and their implications for software development. A significant change in the survey approach this year is the expansion of the definition of success to explore outcomes.

Dana Pylayeva talks about the Agile game she designed combining Scrum, Lego and Chocolate. The game helps participants (in particular non-technical types) understand the difficulties and bottlenecks in application delivery and how DevOps and Continuous Delivery practices can help.

Joseph Blomstedt presents ongoing work to build a new set of high-performance data structures for Erlang, including both single-process data structures and various concurrent data structures.

Steve Green introduces SOLID principles with coding examples tailored for novice and intermediate developers.

Published Date: 2015-10-03 21:27:34 UTC
Tags: Analytics, Big Data, Chief Analytics Officer, Chief Data Officer, Chief Digital Officer, Chief Innovation Officer, Chief Strategy Officer, CIO, Data Science, Deep Learning, Hadoop, Machine Learning, Predictive Analytics
Title: Top Five Misconceptions About Predictive Analytics
Subtitle: and how to overcome them

Jonathan Mills presents how to automate building tasks for JavaScript projects with Gulp.

David Harrison presents the API and culture journey at freelancer.com.
