Big Data News – 7 Oct 2015

Top Stories
You have surely heard the term "dark fiber" and wondered what it means. The term refers to optical fiber infrastructure that hasn't been lit yet, meaning it is installed but not in use. Fiber optic cables are made of thin strands of glass or plastic through which data moves as light pulses, generated by LED transmitters in multi-mode fiber or by a much more sophisticated technology in single-mode fiber. Dark fiber is so called because it isn't active and no light pulses are traveling through it.

Not all data is created equal. One of the first questions enterprises face when deciding on their big data strategy is how and where they will store this data. The rush to this decision often underscores the need to understand what should be stored or thrown away and how the data will be processed to generate value.

Later this year IBM will complete its acquisition of Cleversafe to add an object-based storage system for use on-premises and in the cloud.

As organizations incorporate data scientists' insights into their corporate strategies, they must harness these findings within a sustainable and practical approach to governance, and make sure data scientists are working within this system of governance.

When you're a hammer, everything looks like a nail. When you're a networking company, like Cisco, everything looks like it should be connected.

Big data has been a common phrase in the tech industry as companies of all types collect staggering amounts of customer data. Data is also a high-profile topic because of recent security breaches that leaked sensitive personally identifiable information to the public. Aside from the security issues of protecting all that data, companies are quickly finding that, while they can collect massive amounts of information, it's another thing to actually organize and analyze it. Data isn't relevant only to IT departments, either.

The European Court of Justice has invalidated the Safe Harbor Framework as a way to comply with EU data laws.

IBM announced it has entered into a definitive agreement to acquire Cleversafe, an object-based cloud storage company distinguished by its unique cloud storage algorithms designed to efficiently compress massive amounts of data.

With the increasing sophistication of fraud attacks, businesses that don't take fraud seriously do so at their peril. Adopting four best practices can help organizations combat cyber attacks and protect their assets.

Couchbase's new SQL-compatible query language bridges the gap to JSON. Couchbase executives say this enables developers who are familiar with SQL and tools built for SQL, such as Excel, to work with NoSQL.

Applications built on Penn Medicine's modern big data initiative alert doctors to at-risk patients. Penn plans to take the platform open source to other healthcare providers next year.

Avik Partners is at the front end of a major shift toward relying on machine learning algorithms to reduce the complexity of managing IT environments.

This is the first in a series of articles showing how to build peer feedback loops, an effective means of encouraging a culture of continuous improvement. It starts with a problem statement and some background on feedback, explains why metrics and meetings are not enough, and then describes the first three methods for designing and facilitating peer feedback sessions.

Version 4.0 of Couchbase Server, a NoSQL document store that competes with the likes of MongoDB and Cassandra, is now available. Its main mission is to make working with unstructured data (namely JSON) as easy as querying more conventional row-and-column databases. Couchbase is following the success of projects like Spark to provide easy-to-program tools that obtain quick, actionable results. In Couchbase's eyes, SQL is still one of the best ways to accomplish the task. Andrew C. Oliver answers the question on everyone's mind: Which freaking database should I use?

Information governance practices must be updated as laws, technologies, and business models change. Here are seven ways to make sure you're governing your data effectively.

I'm very excited about Teradata's latest cloud announcement: Teradata Database, the market's leading data warehousing and analytic solution, is being made available for cloud deployment on Amazon Web Services (AWS) to support production workloads. Teradata Database on AWS will be offered on a variety of multi-terabyte EC2 (Elastic Compute Cloud) instances in supported AWS regions via a listing in the AWS Marketplace. I believe the news is significant because it illustrates a fundamentally new deployment option for Teradata Database, which has long been the industry's most respected engine for production analytics.

Couchbase today announced general availability of Couchbase Server 4.0, a new release that enables developers to build a much broader variety of Web, mobile and IoT applications on Couchbase. The release delivers new levels of developer agility, enterprise application scalability and performance, and business insight from data stored in Couchbase.

Couchbase and MongoDB have been going head to head in the world of document-oriented NoSQL databases for some time. With today's launch of Couchbase Server 4.0, the company has caught up to and even surpassed some of MongoDB's capabilities, according to Couchbase CEO Bob Wiederhold. According to Wiederhold, Couchbase has built a solid reputation for relatively simple data serving needs, such as profile and user session management in Web apps or product catalogs, which are perhaps more closely associated with a basic key-value store.

It's been exciting to kick off our first Big Data & Brews East Coast at Strata + Hadoop World New York! Not only that, but I had the opportunity to meet with my good friend Tony Baer,…

Pivotal Software has turned over its SQL-on-Hadoop engine along with its MADlib machine-learning tool to the open source community as it seeks to extend the reach of its interactive SQL engine deeper into the Hadoop ecosystem. Pivotal, which announced in April it was teaming with Hortonworks by combining its big data suite with its partner's Hadoop platform, said recently it is contributing its HAWQ engine and MADlib framework to the Apache Software Foundation, giving each "incubation" status within the open source group.

Cloudera, a leader in enterprise analytic data management powered by Apache Hadoop™, announced the public beta release of Kudu, a new columnar store for Hadoop that enables the powerful combination of fast analytics on fast data.

Today's Reality: Recently my GRC colleagues at SAP and I were discussing the Three Lines of Defense (TLoD). Many of us come from professional or consulting backgrounds and have implemented initiatives that looked on the surface very much like what the Three Lines of Defense model is advocating today. Suggesting that operating management, risk…

I recently had the pleasure of visiting with Arvind Battula, Sr. Data Scientist at Schlumberger. We discussed his background as a chemical and mechanical engineer and his move onto the Data and Analytics team as a data scientist. The following is a transcript of my conversation with Arvind. We discussed his background, his interesting focus areas for data science in oil and gas, and technologies that he believes will help transform the industry.
Kohlleffel: Arvind, you entered the data science world recently on the Schlumberger Data and Analytics team and have a very interesting background coming from both chemical engineering and mechanical engineering disciplines. Tell me about your experience and engineering background.
Battula: Certainly, my background is diverse.

Google has announced that it has started to roll out Android 6.0, codenamed Marshmallow, to Nexus devices. It is not yet entirely clear, though, when Marshmallow will become available on other devices.

User-friendly analytics tools with graphical presentation tools, plus big data solutions and the ability to more easily search and track fiscal anomalies, give financial crime fighters a much stronger arsenal in their ongoing battle against white collar crime. These offenses cost taxpayers up to $600 billion annually, but the complexities of these cases mean many financial scammers elude justice for years, if not forever. That could (and should) be changing, however, as investigators leverage powerful databases, big data, and analytics to find and expose schemes, then use this information to empower juries to follow the money trail. The opposite, however, appears to be true.

It sounds strange now, but it was at least a couple of years from the moment I released the first open source version of Redis, to the moment I started to hear "enterprise" in the context of Redis. I first realized how important Redis had become in the context of big companies when I was in Cologne at the NoSQL Matters conference in 2012. Within hours I was approached by a number of important companies, including ones operating in the financial market, that told me stories about how they were using Redis, sometimes without full approval from their superiors since it was lacking commercial support and the boss was afraid to provide the green light. At some point one engineer said, "Even though it hadn't been approved, Redis was the only way we had to deal with the increasing traffic in this service, so we continued to use it." It's fair to say that Redis found its way inside the enterprise without any need from my side to push its adoption.

Leveraging the rise of JSON, Couchbase is in a contest with other providers of document databases.

If you're still saying, "Big data isn't relevant to my company," you're missing the boat. I firmly believe that big data and its implications will affect every single business, from Fortune 500 enterprises to mom-and-pop companies, and change how we do business, inside and out.

by Jens Carl Streibig, Professor Emeritus at University of Copenhagen. Editor's introduction: for background on the miniCRAN package, see our previous blog posts "Introducing miniCRAN" and "Using R in Myanmar". MiniCRAN saves my neck when out in regions where seamless running internet is an exception rather than the rule. R is definitely the programme to offer universities and research institutions in agriculture because it is open source, no money involved, and the help, although sometimes a bit nerdy, is easy to access.

In the quest to create an ever-smarter iPhone, Apple has acquired Perceptio, which could make Siri faster and more effective.

Is The World Ready To Embrace Deep Learning? Is it sensible or perilous? (Published 2015-10-06 14:51:21 UTC. Tags: Analytics, Big Data, Data Science, Deep Learning, Technology.)

Tomorrow Forrester will host our Geneva-based clients for a breakfast meeting and discussion on "Powering Innovation Strategies with Insights." My colleague, Luca Paderni, will kick off the morning…

The safety of personally identifiable information (PII) is becoming more precarious as criminals find ways to profit from scams. However, insurance agencies are fighting back by using big data more effectively to uncover nefarious schemes.

Today's CRM and Master Data Management (MDM) technologies don't enable complete customer knowledge. In fact, they unwittingly turn customer focus into customer tunnel vision. We need an epistemic graph database – a context-aware master graph – to make possible richer, fuller customer stories and expanded 360-degree views for total awareness.

Making the most of the Internet of Things requires the smooth integration of DevOps and continuous engineering practices. Companies that adopt this strategy will unlock unique IoT possibilities for their organizations.

Many of us are familiar with Apache Hadoop MapReduce, but what do we know about Apache Spark? More importantly, why should your enterprise learn more about the Spark execution framework? …

My phone buzzed, as it had all day, with tornado warnings and natural disaster alerts from the weather station serving towns over an hour away. I almost ignored it, but I was glad I glanced down. A twister was headed right toward my home. The house was unscathed, but others in the area were not so lucky. The New York Times reported that this particular tornado outbreak in North Carolina killed 23 people that day, injured 130 and damaged or leveled hundreds of homes.

Amazon has published the AWS Well-Architected Framework (PDF), a guide for architecting solutions for AWS, with design principles that apply to systems running on AWS or other clouds.

Intelligence on mass media audiences was founded on representative statistical samples, analysed by statisticians at the market departments of media corporations. The techniques for aggregating user data in the age of pervasive and ubiquitous personal media (e.g. laptops, smartphones, credit cards/swipe cards and radio-frequency identification) build on large aggregates of information (Big Data) analysed by algorithms that transform data into commodities.
