Big Data News – 14 Sep 2015

Top Stories
The Agile 2015 conference was held in August 2015 in Washington, DC. Rebecca Parsons and Phil Brock talk about the conference and the other programs the Agile Alliance has underway.

Small and medium-sized companies in Greece have adopted the agile way of working, and although there are few examples of agile in larger organizations, interest in agile from the local industry is growing. Among the topics discussed in agile meetups are whether companies should implement Scrum or Kanban, Scrum for startups, dealing with fixed price and scope contracts, productivity, and happiness in teams.

At WWDC 2015, Apple introduced iOS 9. Although the new SDK does not introduce as many new or enhanced features as iOS 8, which included more than 4,000 new APIs, it does still provide a wealth of new functionality and enhancements. In this article, the first in a series focusing on iOS 9, we are going to review a number of new frameworks that Apple has included with its new mobile OS.

Version 3 of WebSharper, the F# framework for developing web applications, hits RTM this year. We decided to catch up with Adam Granicz, CEO of IntelliFactory, to learn what new features and improvements WebSharper 3 brings.

MongoDB isn't the only company I reached out to recently for an update. Another is DataStax. I chatted mainly with Patrick McFadin, somebody with whom I've had strong consulting…

John Housego describes how W. L. Gore & Associates manages to maintain a global corporation without hierarchies while keeping bureaucracy as small as possible.

Sharad Murthy & Tony Ng present Pulsar, a real-time streaming system which can scale to millions of events per second with high availability and 4GL language support.

Matthew Renze introduces the R programming language and demonstrates how R can be used for exploratory data analysis.

Big Data and text analytics are all the rage in the field of law today. However, there is a dearth of precise definitions of these technologies and of the understanding of what they do. This is the void that the present article will try to fill.

This post talks about how banks have implemented Agile.

In this article, the author discusses how to design an Internet-connected garage door opener ("IoT opener") to be secure. He talks about cloud service authentication and security improvements offered by networked openers, like two-factor authentication (2FA). He also discusses security infrastructure for IoT devices, which includes user authentication and access policy creation and enforcement.

James Governor from RedMonk has written about how immutable infrastructure approaches are applicable to microservices. In his view, all microservices must be immutable and developers will observe the same benefits which others are already seeing in lower layers of the software stack.

Joshua Seckel, Bill Krebs and Manjit Singh are three agile coaches who have gone through the ICAgile assessment and been certified as Experts in Facilitation and Coaching. They discuss the process, what it feels like to be reviewed and assessed by a panel of their peers and how the assessment differs from other certifications.

Maurice Naftalin explains uses for lambdas in Java, how streams work in Java 8, parallel streams and threading, side effects, and much more.

Wesley Reisz explores Android Wear, providing practical ways to introduce wearables into your mobile strategy and exercising the Android Wear API through a demo.

Nathan Peterson introduces Behavior Driven Development, showcasing its adoption by his team along with successes and failures using it.

Simon Thompson shows how Wrangler can help with making systems run on multi-core hardware, including three Wrangler refactoring techniques for retrofitting concurrency to Erlang applications.

Many companies investing in advanced talent analytics are seeing the payoff, according to High-Impact Talent Analytics: Building a World-Class HR Measurement and Analytics Function, a Bersin by Deloitte study. Based on a survey of 436 North American companies, the study reveals that advanced talent analytics is helping achieve better talent outcomes in terms of leadership…

Do you remember Captain James Kirk using his wrist watch to communicate with the crew of the Starship Enterprise back in 1966? Today, after almost 50 years, it has finally become a reality! Digital…

The Omni Parker House in downtown Boston has been home to some great stories. Charles Dickens gave his first American reading of A Christmas Carol there in 1867, and Jack Kennedy proposed to Jackie…

Authors: David L. Cox and Jon Neiditz Much already has been written about the August 24, 2015 ruling by the […]

The problem with searching for a needle in a haystack is that the process, by nature, is inefficient. So why has it become a popular analogy for analytics efforts within the enterprise? Because today's analytics attempts — particularly for unstructured human data — are typically a mess. Today's analytics are often ad hoc and rely on incomplete or skewed sample sets.

"Make data matter." This is the mission, both internal and external, of Pinsight Media+. It's an ongoing reality that many companies are finding themselves in possession of a wealth of data. Pinsight…

Email Compliance: How Analytics Helps Stave Off Violations. Live webinar: Wednesday, September 23, 2015, 10:00 AM PDT/1:00 PM EDT. Duration: 1 hour. With over 100 billion business emails in circulation daily, the criticality of keeping your messages compliant can't be overstated. To avoid litigation, steep financial penalties, and human resource issues, and to keep the integrity of your brand intact, it's important for organizations to ensure their communications abide by established rules and protocols.

Teradata 2015 PARTNERS: Where Smart People and Technology Create Business Value. The Teradata 2015 PARTNERS Conference and Expo is the preeminent global conference dedicated to data. Marking its 30th anniversary this year, it's the one conference for the entire organization, with more than 200 educational sessions for every role from business to marketing to IT. Come to PARTNERS to network and learn best practices to help you improve your business landscape with new data strategies. Customer Led, Customer Focused Event: At PARTNERS you'll be learning from top data practitioners at leading organizations such as Target, Southwest Airlines, Wells Fargo, Nationwide Insurance, and eBay.

The Internet of Things offers great potential in many fields, but healthcare can benefit in ways that affect everyone. An upcoming webcast describes how IoT-related technologies and streaming analytics can help healthcare providers reduce costs and improve patient outcomes.

Published Date: 2015-09-11 14:14:13 UTC Tags: Analytics, Applications, Big Data Title: Fuzzy Data matching Subtitle: Entity resolution using Apache Spark and Machine Learning

Despite being the go-to tool for budgets, forecasts and planning, spreadsheets are unfortunately often prone to errors, primarily because of the volume of manual processes the tool requires. Three noteworthy examples from the past decade illustrate how damaging spreadsheet errors can be. By learning from these examples, organizations can find ways to overcome the shortcomings of spreadsheet approaches used in enterprises for budgeting, forecasting and planning.

In June, Attorney General Loretta Lynch announced "the largest criminal health care fraud takedown in the history of the Department of Justice." More than 240 people were arrested and charged with stealing $712 million from Medicare. The suspects included 46 doctors, nurses, pharmacy owners and other medical professionals. While this is a huge blow to…

by Andrie de Vries Next week the 2nd EARL (Effective Applications of the R Language) conference starts in London, from Monday 14th to Wednesday 16th of September. Last year's inaugural event was a…

In the coming decade, massive amounts of data from the Internet of Things will generate huge business potential as well. Capitalizing on this opportunity requires the effective, efficient technological solutions being discussed and showcased at the upcoming IoT Solutions World Congress.

Bluewolf consulting has released its fourth customer feedback report on Salesforce. It finds that the use of multiple clouds is increasing.

Through the Internet of Things, car insurers now have the ability to track drivers' behaviors in real time. While this method of determining premiums has great potential, there are still outstanding issues that must be addressed before it can be adopted throughout the industry.

So many data scientists select an analytic technique in hopes of achieving a magical solution, but in the end, the solution simply may not even be possible due to other limiting factors. It is important for organizations working with analytic capabilities to understand the various constraints of implementation most real-world applications will encounter.

Digital marketers are relying on big data and Internet of things technology to gain better clarity into customers and to power customer-facing apps.

The arts and acts of counterfeiting have many manifestations, each with its peculiarities, motivations and modes. There is the well-known counterfeiting of money, precious metals and art, and the lesser appreciated counterfeiting of government bonds, documents, certification, medication and consumer goods. There is also a hugely under-appreciated side-line in the production, distribution and certification of counterfeit data, bogus information and knowledge-free knowledge.

Rackspace and Intel have joined forces to build the new OpenStack Innovation Center located on the campus of Rackspace's San Antonio headquarters. The idea to build an OpenStack Innovation Center had been in the works for quite some time.

IBM's acquisition streak continues with the buyout of StrongLoop, which boosts Big Blue's Node.js portfolio of services and support.

Hadoop has never been in more desperate need of a lift than now. Though Hadoop has been synonymous with MapReduce for years, even ardent backers like Cloudera are abandoning MapReduce for its sexier, cooler cousin, Apache Spark.

Like most of their enterprise colleagues, the sales department wants faster access to data. But a new survey of how sales teams increasingly rely on data finds that more than half are unable to access it in real time. The survey of sales operations released Thursday (Sept. 10) by Domo, the business management platform vendor, found that two-thirds of those sales executives polled said they lack real-time access to data. Roughly the same percentage said it takes too long to generate sales leads and other analytical insights from their data.

The feasibility of running transactional workloads side by side with analytical workloads on Hadoop got a little clearer this summer when Hewlett-Packard spun out Trafodion, a "webscale SQL-on-Hadoop" solution. The new company behind Trafodion, called Esgyn, has a simple yet bold goal: unite analytic and transactional workloads onto a single SQL-loving Hadoop platform. Transactional and analytical workloads have different characteristics, by definition.

Published Date: 2015-09-10 20:07:25 UTC Tags: Innovation, Mobile, Predictive Analytics, Technology Title: How to stop a hurricane Subtitle: AKA prepare for your travels and be safe

We find ourselves at a seminal moment in the evolution of information technology: The stuff around us, even attached physically to us, is getting its own brain, and voice. Thanks to the ever-steady progression of Moore's Law, as applied not only to microprocessor power but also to memory, storage and bandwidth – specifically, wireless bandwidth – plus other key commercial factors, we're seeing almost anything and everything now capable of telling its own story. Generally, this is being lumped together as the Internet of Things (IoT), and we're clearly in love with it. Couple it with cloud computing, big data analytics and a few other already ubiquitous buzzwords, and you're able to imagine the day when your healthcare provider conspires with your refrigerator and treadmill to adjust insurance premiums based upon your lifestyle. And don't forget the coming of self-driving cars, all the rage.

The significant shifts in the retail industry over the last several years are certainly no secret. Industry dynamics have given rise to disruptive innovation while knocking established brands asunder. Through lift analytics, real-time product affinity analyses that identify products often sold together give today's retailers and marketers a powerful tool for understanding customer demand.

In the healthcare industry, "big data analytics" is a term that can encompass nearly everything that is done to a piece of information once it begins its digital life. From flagging drug interactions to predicting sepsis, modeling emergency department use to triggering an automated phone call for a mammogram reminder, healthcare providers are leveraging patient…

by Joseph Rickert We are very pleased to announce that Microsoft will not only continue the Revolution Analytics' tradition of supporting R user groups worldwide, but is expanding the scope of the…

IBM today opened its new IBM Watson Health global headquarters in Cambridge, Mass. It also announced new major partnerships with healthcare organizations like Boston Children's Hospital. IBM also announced the expansion of its Watson solution portfolio with IBM Watson Health Cloud for Life Sciences Compliance and IBM Watson Care Manager. At the event, Big Blue named Deborah DiSanzo, formerly CEO of Philips Healthcare, as the IBM Watson Health business unit's first general manager. DiSanzo will be based in the new Cambridge headquarters and will report to Michael Rhodin, senior vice president, IBM Watson Group.

In the 21st century, big data has become the new science, and businesses are trying to make heads or tails of it all. A new and innovative startup, Dataiku, develops software to help data scientists and analysts create impactful value from all that raw data, quickly. Big Data analytics is so popular as a concept right now that some Fortune 500 companies are bombarding the airwaves with commercials touting their capabilities. But for most, hiring the services of IBM Watson to bring sanity and strategic value to their data is beyond their means.

Have you ever contemplated the potential business value of data? Do you wonder where your most valuable data might be? Do you speculate about the data that will help you sleep at night and keep the wolves from the door? If you have answered yes to any of these questions, then my advice to you is to follow the money. We can learn a lot from data that follows the money, and we can use money-centric data not only to control our operations and govern our tactics, but also to understand ourselves and others, to help us put the important aspects into a realistic business context, and to formulate coherent, cohesive and realizable business strategies as a result.

With a growing number of long-time employees retiring, the "brain drain" of facilities management knowledge has become a critical concern for many businesses. An integrated workplace management system not only helps organizations keep this vital information, but offers real business benefits to the bottom line as well.

You're probably familiar with some impressive predictive analytics success stories. But how does predictive analytics actually work? This primer explains how you can use predictive analytics to boost the power and value of your most important data.

If you've followed the news recently, I don't need to tell you that cyber security is a topic of major importance today. It seems that every week there is another revelation of a security breach at an organization thought by many to be a leader in data and network security.

Cloud security is such a burning issue because data is the gold and the oil of any modern organization. Enterprises large and small need to think about how they can prevent hackers from compromising their Cloud services. Yet not all attacks occur from outside of an organization.

One pleasure in talking with my clients at MongoDB is that few things are NDA. So let's start with some numbers: >2,000 named customers, the vast majority of which are unique organizations…

Vint Cerf, recognized as one of the fathers of the Internet, is using social media to generate new ideas about how the Web should evolve.

Apache Spark, the in-memory data processing framework nominally associated with Hadoop, has hit version 1.5. This time around, improvements include the addition of data-processing features and a speed boost, as well as changes designed to remove bottlenecks to Spark's performance stemming from its dependencies on the JVM. With one major Hadoop vendor preparing to ditch MapReduce for Spark, the pressure's on to speed up both Spark's native performance and its development.

Imagine an agribusiness that never had access to a modern commodities market and had never used any real information technology. Suddenly, it had the opportunity to leap into high tech and potentially enjoy whole new economies of scale — a pure greenfield proposition. What would happen? With new international and state legalization, as well as a current moratorium on federal enforcement, marijuana is that agribusiness.

As discussed in the previous sections, there are several standards and interoperability frameworks available. Most of them are infrastructure related. The standards and frameworks can generally be clustered into 3 groups. The first group is the "Standards" group, which consists of OCCI and the DMTF standards. The second group is the "Middleware" group.

From a database standpoint, transactional applications primarily involve inserting and updating data. As a result, the data structure needs to be designed to eliminate, or at least minimize, redundant records; the goal of which is to ensure that inserts and modifications are processed only once, which can help boost performance and also avoid data inconsistencies.

While Go 1.5 is still relatively new on the block, the Go team is already at work on improving its new, low-pause, concurrent garbage collector, which aims to make Go better suited for new application fields, Google engineers Austin Clements and Rick Hudson say.

Analytics and the cloud increasingly go hand in hand, and OpenText has provided the latest example with its Big Data Analytics service announced on Wednesday. Early this year, the Canadian company acquired analytics-focused Actuate and pledged to embed the company's technology within its own offerings. Now, promising high-performance data storage, prebuilt algorithms and tailored professional services, OpenText's new cloud service aims to provide an all-in-one analytics tool to help business users access, blend, explore and analyze their big data without having to rely on IT for help. Key features include a built-in, high-speed analytics columnar database that provides performance as much as a thousand times faster than that of traditional relational databases, according to OpenText.

As with many national companies, scale is important for Yellow Pages, or YP as the company is now known. So when the Georgia-based local marketing firm set out to find a suitable SQL engine to deliver real-time analytics atop its 1,000-node Hadoop cluster, performance at scale was a prime directive. It wasn't long ago that the yellow pages were the go-to directory for Americans. You could find them in every household and in public phone booths (remember those?). Today, most directory services have migrated to the Web, and YP Holdings, the parent company of Yellow Pages, has kept up with the times. Every month YP Holdings attracts about 70 million unique users to its website or its mobile apps.

Ah, the good old days. The world used to be simple. ETL vendors provided data integration functionality, DBMS vendors data warehouse platforms and BI vendors concentrated on reporting, analysis and…

As the National Football League gears up for its 96th season, it is following in the footsteps of other major sports leagues by embracing emerging data technologies to provide on-field player tracking. The NFL's take on big data involves taking high-speed data on each player's position on the field and converting it into real-time statistics. Those stats are increasingly being used by entrants in fantasy football leagues, which themselves are becoming an industry segment (ads for fantasy football and other leagues seemingly appear non-stop on sports cable networks).

C-suite executives in mid-market firms are increasingly involving themselves in technology decision-making, and they are focusing on cloud and analytics, according to a new report by Deloitte Growth Enterprise Services. "In the middle market, technology really has become a C-suite issue," says Stephen Keathley, national technology leader of Deloitte Growth Enterprise Services and principal of Deloitte Consulting. "The numbers are way up for executives that are actively involved in their company's technology decisions."

The Predictive Analytics Innovation Summit returns to Chicago on November 11 & 12 at the Hyatt Regency McCormick Place. With our Early Bird rate expiring next Friday, now is the time to secure a pass at the best rate possible. Exclusively for group members, the code 'AB100' entitles you to an extra $100 off two-day passes. This summit will gather the industry's leading analytics executives to offer insights through keynote presentations, dynamic workshops and interactive panel sessions.

The excitement over "big data" has grown dramatically. But what is the value, the function, the purpose? The most actionable win to be gained from data is prediction. This is achieved by analytically learning from data how to render predictions for each individual. Such predictions drive more effectively the millions of operational decisions that organizations…

EDBI Leads $18 Million Round with Existing Paxata Investors and Industry Leaders Redwood City, CA — September 9, 2015 — Paxata, provider of the only Adaptive Data Preparation™ platform for the enterprise, today announced significant momentum and Series C funding, which comes on the heels of announcing 400 percent year-over-year revenue growth, key leadership team hires and their enterprise-grade Summer '15 release. The latest funding round was led by EDBI, the corporate investment arm of the Singapore Economic Development Board, along with existing investors, Accel Partners India, Walden-Riverwood Ventures and Toba Capital, and industry leaders including Sanju K. Bansal, Co-founder and former COO of MicroStrategy.

ANN ARBOR - A new way of computing could lead to immediate advances in aerodynamics, climate science, cosmology, materials science and cardiovascular research. The National Science Foundation will provide $2.42 million to develop a unique facility for refining complex, physics-based computer models with big data techniques at the University of Michigan. The university will provide an additional $1.04 million. The focal point of the project will be a new computing resource, called ConFlux, which is designed to enable supercomputer simulations to interface with large datasets while running.

Dear DSC Member, We'd like to invite you to participate in Round 1 of the Big Data Analytics World Championships 2015 (Business and Enterprise) on Saturday September 25, 2015. The current World Champion is Stephane Sbizzera (KPMG France). Thousands of the best Data Scientists, Engineers, Statisticians, Computer Science and Data Analysts compete in two Online Qualification Rounds (4 hours each). The top performers are flown to Austin, Texas USA to compete in the Live World Finals.

AT&T and ZTE are offering a new device called Mobley, which can turn almost any car into a WiFi hotspot.

Originally posted on Data Science Central The emerging "Data Stack" or "Data Layer" is in full transition and can be viewed and defined many different ways. The ability to capture, analyze and learn from data generated at unprecedented scale, combined with means to access that information, on demand, when relevant, creates business opportunities we are only just beginning to appreciate. One way simply defines data in a three layer stack: Internal Data: The data gathered into a data warehouse from the transactional systems of a company.

Published Date: 2015-09-09 16:34:50 UTC Tags: Analytics, Big Data, Data Science, Data Warehousing, Open Data Title: Data Mining In The Deep Web Subtitle: Holding a huge amount of data, but being hidden, can you mine it?

We're pleased to announce that Softdate version 4.5 is available now. The Softdate Date and Time Simulation (DTS) Data Protection Suite allows you to simply and easily test date and time logic…

After an informative presentation by Armon Dadgar at QCon New York that explored security requirements within modern production systems, InfoQ sat down with Dadgar and asked questions about HashiCorp's Vault, an open source tool for managing secrets at scale.

As the Internet of Things (IoT) takes off, you'll hear more about "liquification." This trend involves physical assets becoming participants in real-time global digital markets, enabling industries as disparate as real estate, risk management and agriculture to benefit from IoT efficiencies.

Originally posted on Data Science Central. It was not easy to select a few out of many Open Source projects. My objective was to choose the ones that fit Big Data's needs most. What has changed in the world of Open Source is that the big players have become stakeholders: IBM's alliance with Cloud Foundry, Microsoft providing a development platform for Hadoop, Dell's Open Stack-Powered Cloud Solution, VMware and EMC partnering on Cloud, Oracle releasing its NoSql database as Open Source. "If you can't beat them, join them".

The Microsoft Azure Event Hubs messaging service processed approximately 150 terabytes and 30 billion messages per day, or 375 000 messages per second, in June 2015, according to the Microsoft Azure Service Bus product team.

Published Date: 2015-09-09 14:01:13 UTC Tags: Analytics, Chief Data Officer, Data Science, Data Warehousing, Open Data Title: Beginners Guide To: Data Democratization Subtitle: What is data democratization and should you be doing it?

Originally posted on Big Data News. With the growing amount of information being created daily, big data is providing marketers a means to find consumers who are in-market for the products or services their company sells. However, since this data is growing so rapidly and is stored in so many places (e.g. social networks, blogs, forums, public records, databases), it is nearly impossible for marketers to quickly find all the data they need on in-market prospects without expertise and tools. I created an infographic that briefly explains how leading marketers are using Data-as-a-Service (DaaS) to find and target prospects who are currently in-market for the products and services their company sells.

I remember reading an MIT paper on manufacturing technology trends a couple of years ago. It had a fascinating mention of "Additive Manufacturing" (AM) and how it could be a game changer. One of the biggest challenges facing automotive, aerospace and defense manufacturers is the limited range of shapes into which a part can be cut, molded or welded. On the other hand, a digitally 3D-printed part can be formed into an infinite number of shapes. According to industry standard ASTM F2792-10, AM is defined as, "The process of joining materials to make objects from 3D model data, usually layer upon layer, as opposed to subtractive manufacturing technologies."

While some observers may argue that Apache Spark is causing the relevance of the Apache Hadoop community to wane, the fact of the matter is innovative Spark development depends on Hadoop platforms. Discover why Hadoop is stronger than ever as an open source information refinery that is expected to buttress development of 21st century analytics models.

Cloudera launches the One Platform Initiative to advance Spark as the data processing successor to MapReduce inside Hadoop.
