Big Data News – 13 Jun 2016

Featured Article
The way Joe Biden tells it, the fight against cancer is in great measure a big data problem. In remarks at an oncology convention in Chicago this week, the vice president delivered a message of open data and interdisciplinary collaboration as keys in the search for better cancer diagnosis and treatment. Biden took the occasion to announce the public availability of the Genomic Data Commons, a repository of the anonymized genomic and clinical data of some 12,000 cancer patients that will open the door for researchers to analyze a broad collection of tumor genome sequences.

Top Stories
I have something of a checkered past with HPE, the enterprise business that was created when Hewlett-Packard split itself into two units a year or so ago. I first engaged with the business formerly known as HP half a dozen years ago. Back then the company was very much still in the business of selling hardware — be it servers, printers or laptops — and had not yet focused on its transformation into a business that could compete in the modern world. I spent a few years in the HP orbit, attending a number of their global user conferences and generally saying things as I saw them.

A bill to give email and other documents stored in the cloud new protections from government searches may be dead in the U.S. Senate over a proposed amendment to expand the FBI's surveillance powers. The Electronic Communications Privacy Act Amendments Act would require law enforcement agencies to get court-ordered warrants to search email and other data stored with third parties for longer than six months. Under U.S. law, police need warrants to get their hands on paper files in a suspect's home or office and on electronic files stored on his computer or in the cloud for less than 180 days.
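
The 180-day distinction described above can be sketched as a small rule. This is only an illustration of the blurb's wording, not legal guidance: the function name, the `bill_passed` flag, and the treatment of data that is exactly 180 days old are all assumptions made for the example.

```python
from datetime import date

# Threshold taken from the blurb: data stored "for less than 180 days".
THRESHOLD_DAYS = 180


def warrant_required(stored_on: date, today: date, bill_passed: bool = False) -> bool:
    """Return True if, per the blurb's summary, police would need a warrant.

    Hypothetical helper: `bill_passed` models the proposed amendments, which
    would require a warrant regardless of how long the data has been stored.
    Data exactly 180 days old is treated as "not less than 180 days" (an
    assumption; the blurb does not say).
    """
    if bill_passed:
        return True
    age_days = (today - stored_on).days
    return age_days < THRESHOLD_DAYS


if __name__ == "__main__":
    today = date(2016, 6, 13)
    print(warrant_required(date(2016, 1, 1), today))        # recent email
    print(warrant_required(date(2015, 6, 13), today))       # year-old email
    print(warrant_required(date(2015, 6, 13), today, True)) # under the bill
```

Under this reading, the only thing the bill changes is the outcome for older data.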

Last month, the US and German SAP user communities released a white paper on Digital Transformation: The SAP User Community Perspective.

FormationOne enables IT organizations to identify and redeploy storage stranded in a virtual server or hyperconverged system.

I'm speaking at DAMA in Bloomfield, CT. It's free and open to the public.
Title: Enabling Self-Service BI and Analytics
When: Friday June 17, 1:30
Where: CIGNA University Learning Facility, 1350 Hall Boulevard/Rte 218, Bloomfield, CT 06002
Preregistration: Just email your first and last name and Fri 6/17 Hartford to damabostonprereg@yahoo.com by 9pm the day before the meeting. If it turns out you can't make the meeting, there's no need to cancel your preregistration.

Author and leadership consultant Dana W. White talks about her experiences as a woman working in male-dominated industries.

Computer scientists from Google and two universities have found a way to make quantum computing more practical than previously possible using a system from Canadian company D-Wave.

In this special guest feature, James Pita, PhD, Armorway's Co-Founder and Chief Evangelist, discusses cognitive analytics and its use for threat detection and deterrence.

Above the Trend Line, machine learning industry rumor central, is a recurring feature of insideBIGDATA. In this column, we present a variety of short, time-critical news items such as people movements, funding news, financial results, industry alignments, rumors and general scuttlebutt floating around the big data, data science and machine learning industries, including behind-the-scenes anecdotes and curious buzz.

For robots to navigate the world, they need to be able to make reasonable assumptions about their surroundings and what might happen during a sequence of events. One way that humans come to learn these things is through sound. For infants, poking and prodding objects is not just fun; some studies suggest that it's actually how they develop an intuitive theory of physics. Could it be that we can get machines to learn the same way?

It was only 15 years ago that Microsoft's then-CEO likened open source software licenses to "a cancer." But today, Redmond is joining other tech giants in donating intellectual property to the hive mind of open source. Here's why it matters.

It has always had trouble getting customers to buy into its cloud, but the scope of the problem may have been badly underestimated.

Find out a little bit more about this open source big data tool.

Our roundup of intriguing new products from companies such as Intel and Webscale.

Hadoop, named after a toy elephant that belonged to the child of one of its inventors, is an open-source software framework. It is capable of storing colossal amounts of data and handling a virtually endless number of large applications and jobs. Hadoop's capabilities make it one of the most sought-after data platforms for successful businesses all over the world.

Hadoop Benefits
Because it can store and quickly process any type of data, Hadoop is lightyears ahead of the game in the open-source world.

Data visualisation is the process of displaying data in visual formats such as charts, graphs or maps. This method is commonly used to draw more meaning out of a snapshot of data, which with other approaches might require sorting through piles of spreadsheets and large numbers of reports. With the amount of data growing rapidly, it is more important than ever to interpret all of this information correctly and quickly to make well-informed business decisions. Graph visualisation takes a fairly similar approach, but with more diverse and complex sets of data.

The use of Big Data in today's world has become a necessity due to the massive number of recently developed technologies that keep providing us with data, such as sensors, surveillance systems and even smart…

Talend, a global leader in big data and cloud integration solutions, announced that it has achieved status as an Advanced Technology Partner in the Amazon Web Services (AWS) Partner Network (APN).

Microsoft, Databricks, and Intel were among those lining up in support of Spark at this week's Spark Summit. We've got the full story in our Big Data Roundup for the week ending June 12, 2016.

Couchbase, provider of the database for the Digital Economy, announced the general availability of a new Couchbase Spark Connector. This integration joins two of the most scalable and best performing big data technologies to make analytics easier, faster and more productive.

Today the question is no longer whether you need to go for Digital Transformation, but when and how you are doing it. All new programs or initiatives should now take a 'Digital First' approach, with the…

Datos IO, a provider of next-generation data protection solutions, announced the general availability of RecoverX, scale-out data protection software for third platform applications and distributed and cloud databases.

SQL Server databases have multiple users accessing and viewing data, which gives administrators an extra concern: maintaining security.

Splice Machine announced that its relational database management system, the first dual-engine RDBMS powered by Hadoop and Spark, is moving to open source.

In this special guest feature, Mika Javanainen, Senior Director of Product Management at M-Files Corporation, observes that with the plethora of new data-gathering tools that are continually coming out, more and more data is being collected – and we're reaching a point where a significant amount of it is going unused. This so-called "Dark Data" isn't being analyzed, and potentially useful trends are being missed.

Although vendor-written, this contributed piece does not promote a product or service and has been edited and approved by Network World editors. The As-a-Service model promises to reshape business service delivery to provide companies with plug-in, scalable, consumption-based services, but the opportunities are vastly untapped.

Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week.

NYU Stern Masters in Business Analytics
Business Analytics is the intersection of business and data science, offering new opportunities for a competitive advantage.

This week's news includes big numbers for LTE, congressional bills and flying cars.

Is your data driving your daily customer interactions? Don't ignore what analytics can be doing for you in the now. Instead, discover how arming yourself with data from the outset can help you meet customers where they are.

Organizational inertia and seemingly endless processes and procedures can be barriers to innovation and the timely decisions that help businesses progress. Lisa Bodell, award-winning author, innovation expert and CEO of futurethink, offers a provocative manifesto for a change revolution leveraging the power of analytics to not just provide answers, but to also help people ask questions they hadn't thought of before.

Who remembers "WarGames", the 1983 movie starring Matthew Broderick? It was one of the first movies that piqued my interest in computer security. In those days we used acoustic couplers to dial a remote computer, and being able to actually communicate with a remote PC fascinated me. Before long came dial-up modems, attached to my PC via a serial cable, with which I could dial literally anywhere I wanted. And yes, with such power at my fingertips I did perform some ethical hacking, both to learn the inner workings of computer security and because I thought I was Matthew Broderick too!

One of the consequences of the hype and exaggeration that surrounds Big Data is that the labels and definitions that we use to describe the field quickly become overloaded. One of the Big Data concepts that we presently risk overloading to the point of complete abstraction is the "Data Lake". Data Lake discussions are everywhere right now; to read some of these commentaries, the Data Lake is almost the prototypical use case for the Hadoop technology stack.

Hybrid clouds provide flexibility and control, but they introduce more complexity and integration challenges than all-public or all-private solutions.

Intelligent, often autonomous, systems are making headway in the datacenter as developers seek to offload manual processes and move beyond traditional approaches like prescriptive IT automation as they struggle to keep up. That's the conclusion of a vendor-backed survey of IT executives about the adoption of intelligent machines and systems based on new AI approaches and other expert systems.

We asked five social influencers how data scientists can collaborate to build better business applications. See what they had to say.

With a fail-safe network in place, employees can always access the assets they need and you can rest easy knowing your network is robust.

Analyzing online activities can provide clues as to a person's chances of having cancer, Microsoft researchers showed in a paper published this week. Specifically, the researchers demonstrated that by analyzing web query logs they were able to identify internet users who had pancreatic cancer even before they'd been diagnosed. The study suggests that "low-cost, high-coverage surveillance systems" can be created to passively observe search behavior and to provide early warning for pancreatic cancer and, with extension of the methodology, for other challenging cancers, the researchers concluded.

At Hadoop Summit San Jose we are excited to be joined by experts in the Energy Industry. Here are just a few of the sessions focussed on Energy, be that Oil and Gas or Nuclear, but you need to register to attend Hadoop Summit. A Data Lake and a Data Lab to optimize operations and safety within… The post Thought Leaders Secrets from the Energy Industry appeared first on Hortonworks.

Encryption is a key security feature in Cloudera-powered enterprise data hubs (EDHs). This post explains some best practices for deployment of Cloudera Navigator Encrypt for that purpose. For those unfamiliar with the product, Cloudera Navigator Encrypt provides scalable, high-performance encryption for critical Apache Hadoop data. It utilizes industry-standard AES-256 encryption and provides a transparent layer between the application and filesystem. Cloudera Navigator Encrypt also includes process-based access controls, allowing authorized processes to access encrypted data while simultaneously preventing admins or super-users like root from accessing data that they don't need to see. The post Best Practices for Enterprise Data Hub Encryption appeared first on Cloudera Engineering Blog.

Randy George, an expert in web map applications, has been fascinated with computer graphics (especially maps) since the early '80s. For much of that time, he says, the technology for mapping has been…

Enterprise-scale database technology needs to be on its game when it comes to thwarting security threats and protecting data from loss because of disaster or system failure. Take a look at advanced security features built into IBM DB2 for Linux, Unix and Windows Version 11.1 that can help ensure data is well protected from both internal and external security threats.

More and more businesses are waking up to the threat of poor data quality. We're gradually seeing the risk being taken more seriously as the shock waves of poor management are felt. Yet for many businesses, data quality is seen as an abstract concept: difficult to understand, and impossible to value. When the business formulates its budgets for the year, data quality is often skipped over, because nobody really knows what's wrong. Sure, they can see that emails are bouncing and that their customers are drifting away to competitors, but the root cause hasn't been fully determined.

Sweeping technology upgrades for fans and employees have the professional golf group building Windows 10 applications and migrating email and other computing tasks to the public cloud.

Heroin addiction is fueling the crime rate in Manchester, New Hampshire, and its police department needed real-time data on where crimes were likely to occur so that it could proactively prevent them. Take a pictorial journey to see how the Manchester police force was able to achieve remarkable reductions in robberies, burglaries and motor vehicle thefts through the power of IBM Analytics.

What makes Apache Atlas different from other metadata solutions is that it is designed to ship with the platform where the data is stored. It is, in fact, a core component of the data platform.

Users worried about being caught up in the recent leak of more than 32 million Twitter login credentials should already know if they've been hacked. Twitter confirmed on Friday that it was notifying users whose valid login credentials were recently being passed around on the so-called 'dark web.' The account credential leak became public after LeakedSource published the collection on Wednesday. LeakedSource maintains a database of nearly two billion online account credential leaks. Twitter said in a blog post that it had obtained the leaked data and matched it against its records.

Join us for a look at what's on the horizon in data analytics, discovering how a broad array of tools aims to change the way we do, and think about, data science.

The next time you run in the early hours of the day and then look up at your Nike-Fuelband, congratulate yourself. You are now a part of the digital ecosystem, where every second of human life is recorded, stored and analyzed. What about big data and the fascination with wearables?
