Big Data News – 28 Sep 2015

Top Stories
Earlier this month, Cloudera announced a plan to make Spark the center of its Hadoop distribution. Astute observers will note that Cloudera has been moving in this direction for some time. In a webinar yesterday, Doug Cutting, the creator of Hadoop and chief architect at Cloudera, offered more details about the company’s decision to replace MapReduce with Spark. His presentation had broad implications for the Hadoop ecosystem. Here are the six major takeaways from that talk.

 

There’s a lot of gray area when it comes to the ethical collection, use, and analysis of data. Consider these 8 issues organizations should ponder when assessing their data use practices.

 

An open-source storage engine called Kudu could soon be on the way from Cloudera, offering a new alternative for companies with big data stores to manage. Kudu will be offered as an alternative to the popular Hadoop Distributed File System and the Hadoop-oriented HBase NoSQL database, according to a VentureBeat report, which cited a slide deck on Kudu’s design goals. A small Cloudera team has reportedly been working on Kudu for the past two years. The company has already been pitching it to customers and plans to release it as Apache-licensed open-source software at the end of this month, VentureBeat said, citing a source familiar with the matter.

 

Microsoft announced new tie-ups in China on the same day that the country’s President Xi Jinping and a delegation visited its campus at Redmond, Washington. The seven deals with Chinese companies and government institutions will likely give Microsoft greater access to the country’s large market. Other companies like Cisco Systems and Hewlett-Packard have also announced ties with Chinese companies, a market that has been proving complex for U.S. companies because of the strong backing of the government for local players. Microsoft, for example, announced an agreement with its cloud partner in Beijing, 21Vianet, and IT company Unisplendour to provide custom hybrid cloud solutions and services to Chinese customers, particularly state-owned enterprises.

 

Optimizing queries in Splunk’s Search Processing Language is similar to optimizing queries in SQL. The two core tenets are the same: Change the physics and reduce the amount of work done. Added to that are two precepts that apply to any distributed query.

 

Today Google is releasing Google Cloud Dataproc as a beta service. Cloud Dataproc gives you anytime access to super-fast, simple yet powerful, managed Spark and Hadoop clusters.

 

IBM is building a new West Coast office in San Francisco for its IBM Watson system, and Big Blue has added new capabilities, including ones for social media and productivity.

 

Now that companies are recognizing the benefits of analytics and big data, the next step is putting those benefits within closer reach. Toward that end, MemSQL on Thursday unveiled a new tool designed to help companies tap Apache Spark without writing any code. Spark Streamliner is a tool that integrates MemSQL’s in-memory database and Apache Spark’s in-memory data-processing framework for streaming data from real-time sources such as sensors, Internet-of-Things (IoT) devices, transactions, applications and logs. Offering “one click” deployment of integrated Spark along with a Web-based interface, it allows users to create multiple data pipelines in minutes, perform custom transformations in real time and develop new analytics applications, MemSQL said.
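
Streamliner itself is deployed with one click, but the underlying pattern (a Spark Streaming job that applies a custom transformation before handing rows to a database sink) can be sketched in a few lines of PySpark; the socket source, parsing logic and writer below are illustrative assumptions, not MemSQL’s actual pipeline code:

```python
# Minimal Spark Streaming pipeline sketch (PySpark). The socket source,
# CSV parsing, and save_partition() writer are illustrative assumptions,
# not MemSQL Streamliner's actual implementation.
from pyspark import SparkContext
from pyspark.streaming import StreamingContext

sc = SparkContext(appName="streaming-pipeline-sketch")
ssc = StreamingContext(sc, 1)  # 1-second micro-batches

# Ingest raw events; a socket source stands in for sensors, logs, or IoT feeds.
lines = ssc.socketTextStream("localhost", 9999)

# Custom real-time transformation: parse "sensor_id,value" CSV records.
readings = (lines.map(lambda line: line.split(","))
                 .map(lambda fields: (fields[0], float(fields[1]))))

def save_partition(partition):
    # Hypothetical sink: in a real pipeline each partition would be
    # written to the target database over its client protocol.
    for sensor_id, value in partition:
        print(sensor_id, value)

# Push each micro-batch to the sink.
readings.foreachRDD(lambda rdd: rdd.foreachPartition(save_partition))

ssc.start()
ssc.awaitTermination()
```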

 

All the analytics tools in the world won’t do a company much good if it doesn’t know what data it has to analyze. Tamr offers a free, downloadable tool designed to help tackle that “dark data” problem. Dark data generally refers to all the information an organization collects, processes and stores but doesn’t use for analytics or other purposes. It’s often unstructured or qualitative data that’s harder to keep track of than numerical data is, and by research firm IDC’s reckoning, it can account for as much as 90 percent of an organization’s information assets.

 

In modern IT architectures, the front end and back end are designed to function independently from one another, communicating through a set of agreed-upon interactions — most often via RESTful APIs. I have written previously on the different approaches for building the back end and designing the APIs. The potential impact of your APIs, however, is higher: they can be used in several ways that can impact your business.

Back end for your own front end

The primary use of the data back end is to serve, via APIs, the front-end systems that will be at the core of the customer experience. Usually (and depending on your business) these can include mobile apps, websites/applications, connected objects on the Internet of Things, and so on.
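
As a minimal sketch of that primary use (the framework choice, endpoint and payload here are invented for illustration), a single RESTful resource can serve a mobile app, a website and a connected device alike:

```python
# Minimal sketch of a back end serving multiple front ends through a
# RESTful API. The /api/orders resource and its payload are hypothetical.
from flask import Flask, jsonify

app = Flask(__name__)

# In-memory store standing in for the real persistence layer.
ORDERS = {1: {"id": 1, "status": "shipped"}}

@app.route("/api/orders/<int:order_id>", methods=["GET"])
def get_order(order_id):
    order = ORDERS.get(order_id)
    if order is None:
        return jsonify({"error": "not found"}), 404
    # Any front end (mobile app, website, IoT device) consumes the
    # same agreed-upon JSON representation.
    return jsonify(order)

if __name__ == "__main__":
    app.run(port=8000)
```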

 

A study about a software product commissioned by one of that product’s commercial sponsors — take it with a shaker full of salt. Still, Databricks’ latest survey about the use of the continuously evolving Spark big data processing engine turned up enlightening insights about where and how Spark is being put to work.

 

Two of the developers behind the KVM and OSv projects have now released and open-sourced a direct replacement for the Apache Cassandra NoSQL database that they say is an order of magnitude faster. ScyllaDB is meant as a substitute for Cassandra in the same way that MariaDB can be swapped in for MySQL without blinking. ScyllaDB is written in C++ as opposed to Cassandra’s Java, and its creators, Avi Kivity and Dor Laor, claim its sharded architecture provides the kinds of parallelism and speed-up on a single computer that was previously only available in a cluster.
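
Because ScyllaDB implements Cassandra’s wire protocol and CQL, existing client code should run unchanged against it; a minimal sketch with the Python cassandra-driver (keyspace and table names are hypothetical) differs from a Cassandra script only in the node address it points at:

```python
# Drop-in sketch: the same CQL client code targets Cassandra or ScyllaDB,
# since ScyllaDB implements Cassandra's protocol and query language.
# The keyspace/table names are hypothetical.
from cassandra.cluster import Cluster

# Point at a ScyllaDB node exactly as you would a Cassandra node.
cluster = Cluster(["127.0.0.1"])
session = cluster.connect()

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS demo
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
""")
session.execute("CREATE TABLE IF NOT EXISTS demo.events (id int PRIMARY KEY, payload text)")
session.execute("INSERT INTO demo.events (id, payload) VALUES (%s, %s)", (1, "hello"))

for row in session.execute("SELECT id, payload FROM demo.events"):
    print(row.id, row.payload)

cluster.shutdown()
```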

 

Key messages on day two of the ‘Agile on the Beach’ conference included: self-organising teams that embrace ‘autonomy, purpose and mastery’ create a successful and resilient culture; there is a need for technical leadership that encompasses programming, people and processes; and being truly agile involves working with ‘an ever evolving set of ever evolving practices’.

 

Eventual consistency is a design approach for improving scalability and performance. Domain events, a tactical element in Domain-Driven Design (DDD), can help in facilitating eventual consistency, Florin Preda and Mike Mogosanu write in separate blog posts, each describing the advantages achievable.
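
A minimal sketch of the idea (all names invented for illustration): the write side records a fact and publishes a domain event, and a separate handler applies it to a read model later, so the two models converge eventually rather than inside one transaction:

```python
# Sketch of eventual consistency via domain events (illustrative names).
from collections import namedtuple
from queue import Queue

OrderPlaced = namedtuple("OrderPlaced", ["order_id", "amount"])  # a domain event

event_bus = Queue()      # stands in for a real message broker
order_totals = {}        # the eventually-consistent read model

def place_order(order_id, amount):
    # Write side: persist the order, then publish the domain event.
    event_bus.put(OrderPlaced(order_id, amount))

def handle_events():
    # Read side: drain events and update the read model out of band.
    while not event_bus.empty():
        event = event_bus.get()
        order_totals[event.order_id] = event.amount

place_order(42, 99.50)
handle_events()          # in production this runs in a separate process
print(order_totals)      # {42: 99.5}
```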

 

Force12.io have released a prototype ‘microscaling’ container demonstration running on the Apache Mesos cluster manager, which they claim starts and stops ‘priority 1’ and ‘priority 2’ containers more rapidly than traditional autoscaling approaches when given a simulated demand for the differing workloads. InfoQ discussed the goals and methodology of this approach with Force12.io’s Ross Fairbanks.

 

Stephen Colebourne and OpenGamma released v1.1 of ElSql, a library and DSL for managing SQL in external files. Colebourne is well known for his work as the spec lead of Java Time, a cornerstone of last year’s Java 8 release, and for creating the Joda-Time and Joda-Money APIs.

 

BlueData, provider of a leading infrastructure software platform for Big Data, announced version 2.0 of its BlueData EPIC software platform.

 

Sanjiv Augustine talks about his new book “Scaling Agile: A Lean JumpStart”, reinventing organizations and the implementation of no-management at LitheSpeed and the Agile 2015 Executive Forum.

 

An introduction to Amazon’s ECS, their container management and clustering service, featuring a walk-through example for deploying your first container cluster and the underlying tasks and services.

 

A surprisingly common theme at the Splunk Conference is the architectural question, “Should I push, pull, or search in place?”

 

Objectivity, Inc., a pioneer in high-performance distributed object-oriented database technology, introduced ThingSpan, a purpose-built Information Fusion platform that simplifies and accelerates any organization’s ability to deploy Industrial Internet of Things (IoT) applications to enhance value derived from Big Data and Fast Data.

 

Consumers (and B2B customers) are more and more empowered with mobile devices and cloud-based, all but unlimited access to information about products, services, and prices. Customer stickiness is…

 

Strata + Hadoop 2015 brings together technology masters next week in New York to consider the future of data and machines. It is a tricky subject. Ubiquitous data collection and ever-smarter machines can make us all feel grim and dumb. But there is room to be light-hearted and intelligent. Here are three big show themes that underscore how data and software will improve how we live, work and play.

 

AtScale, the company that provides business users with fast and secure self-service access to Hadoop, announced that it had partnered with Cloudera, MapR and Tableau to release what it bills as the world’s largest, and the industry’s first, worldwide Hadoop maturity survey.

 

In this special guest feature, Sethuraman Janardhanan of Happiest Minds discusses how capturing and analyzing data is going to be a huge challenge and an immensely profitable and critical activity in the very near future.

 

In this article, author Carlos Bueno discusses the strategies for estimating the server capacity for big data projects and initiatives, with the help of two case studies.

 

This series aims at introducing all that is essential for developers to know about building apps for the latest release of Apple’s mobile OS. It comprises five articles that cover what’s new in iOS 9 SDK, new features in Swift, Objective-C, and developer tools, and Apple’s new bitcode.

 

Kent McDonald talks about the need for product ownership, business analysis and user experience in agile projects, how the three areas are connected, and his new book, Beyond Requirements: Analysis with an Agile Mindset.

 

Based on its experience arbitrarily shutting down servers and simulating the shutdown of an entire data center in production, Netflix has proposed a number of principles of chaos engineering.

 

Big Data’s fall season is upon us, and Strata+Hadoop World NYC is its coming-out party. There will be a multitude of announcements, but more than likely a manageably-sized set of key themes. Here are a few to consider.

 

It’s no secret that analytics are everywhere. We can now measure everything, from exabytes of organizational “big data” to smaller, personal information like your heart rate during a run. And when this data is collected, deciphered, and used to create actionable items, the possibilities, both for businesses and individuals, are virtually endless. One area tailor-made for analytics is the sports industry.

 

The prevalence of big data and the value it generates have greatly, and perhaps indelibly, altered contemporary business practices in two pivotal ways. Firstly, the value of just-in-time analytics is contributing to a reality in which it is no longer feasible to wait for scarce data scientists to compile and prepare data for end-user usage. Widespread end-user adoption hinges on a simplification and democratization of big data that transcends, expedites, and even automates aspects of data science. Secondly, big data has resulted in a situation in which enterprises must account for copious amounts of external data that are largely unstructured. The true value in accessing and analyzing these data lies in their integration with traditionally structured internal data for comprehensive views. Historically, integration between external and internal data has been hampered by inordinate time consumption on security and data governance concerns.

 

Without a doubt, technology is dramatically reshaping today’s workforce. The IBM and Box partnership is designed to transform how people work in their industries. Indeed, government, healthcare, insurance and retail are among the industries that can substantially benefit from offerings that combine leading-edge Box and IBM technologies. Attend BoxWorks 2015, 28–30 September 2015, and learn how organizations can tap into the tremendous opportunities for transformation without putting their critical information at risk.

 

Arcadia Data, provider of a unified visual analytics and business intelligence (BI) platform for big data, announced the release of Arcadia Enterprise, a full-scale visual analytics and BI solution that runs natively in Hadoop.

 

Sure, the Solar System is big, but it’s probably a lot bigger than you think, thanks to textbook representations that squeeze all the planets and their orbits into one page. Even at the speed of…

 

Project Roslyn is Microsoft’s next-generation .NET compiler. Its API allows you to dig into the details of any C# or VB code. It can be used to improve your code through deep analysis and custom rule enforcement. In this presentation, we will look at how you can get started with the Roslyn C# API.

 

As enterprise databases continue to grow, a semantic data model and an efficient data warehouse become necessary. Finding a platform that can integrate cohesively with your existing data storage methods can be tricky. This informative document shows how graph-aware tools can eliminate the need to compromise between a data warehouse and a data swamp, allowing the business to arrive at insights faster than ever before. Read on to see how this database technology can allow your company to work with more flexibility and greater accuracy, turning your data assets into a competitive advantage.

 

More and more organizations are implementing mobile business intelligence (BI) solutions, but these can be intimidating to some businesses, particularly those that have challenges with their existing mobile BI solutions. The good news is that making mobile BI solutions a success doesn’t have to be agonizing. There are several investments that can tremendously benefit any mobile…

 



A Spark “Streamliner” introduced this week by in-memory database vendor MemSQL aims to provide Spark users quick access to real-time analytics and transactions. San Francisco-based MemSQL said Thursday (Sept. 24) its new platform addresses the growing enterprise need to “synthesize varied data types,” including historical data. Its real-time data pipeline between Apache Spark and MemSQL is intended to make it easier to deploy multiple pipelines and keep up with dynamic data flows. The Streamliner tool is designed as a single-click deployment of integrated Apache Spark “to eliminate the pain of batch ETL,” the company said. A web-based user interface is designed to allow for multiple real-time data pipelines. The tool also capitalizes on Apache Spark’s inroads in the enterprise. In June, for example, IBM said it would integrate the open source in-memory processing framework into the “core” of its analytics and commerce platforms. It also said it would work closely with Databricks, the company formed by the creators of the analytics engine.

 

Matt Buckland discusses some of the cultures he has encountered in his work experience, the success stories and the failures, outlining what makes a great organizational culture.

 

Internet pioneer Larry Smarr once had a vision of bringing connected computers out of academia and into the consumer world. Today, he envisions a second virtual highway, one capable of delivering on the promise of big data by leveraging fiber optic networks to transmit data at speeds of 10 gigabits to 100 gigabits per second. The idea is similar to NSFnet, which became the backbone of the Internet in the 1980s. Like NSFnet, Smarr hopes this new network — the Pacific Research Platform — will be a template others will adopt. Smarr, who in 2000 became founding director of the California Institute for Telecommunications and Information Technology (Calit2), a University of California San Diego/UC Irvine partnership, sat down recently with EnterpriseTech to discuss the Pacific Research Platform.

 

Hortonworks has quietly made available the DataFlow platform, which is based on Apache NiFi and attempts to solve the processing needs of the Internet of Anything (IoAT).

 

 

Big data is helping to define and improve the in-person experience of sports fans. Big data and wearable technology are combining to offer fans a better, more informed in-person game experience. First, wearable technology has grown tremendously over recent years. Athletes, coaches, and trainers have embraced wearable technology as a way to monitor and improve athlete performance, from gauging metrics like heart rate or timing reactions to monitoring the body for signs of concussion or fatigue.

 

Ratpack, a high performance Java web framework, has reached 1.0 status. The 1.0 release is API-stable and can be considered production ready. The main thing that makes Ratpack interesting is the execution model, which aims to make asynchronous programming on the JVM easier.

 

Pivotal announced a complete re-design of Spring XD, its big data offering, during last week’s SpringOne2GX conference, with a corresponding re-brand from Spring XD to Spring Cloud Data Flow. The new product is focussed on orchestration.

 

Although predictive maintenance offers significant benefits to personnel who operate and maintain essential assets, its dependence on additional on-premises IT resources can deter organizations from relying on it. Learn how an IBM cloud-based solution can help give organizations insight into asset performance.

 

Attend IBM Insight 2015 to watch the semifinalists in the Hack the Weather hackathon present their projects to a panel of judges, and learn how to use weather-driven analysis to address your own analytics-related business challenges.

 

Weather can be just as important a factor in retail success as location is. Both by boosting planning efficiency and by mitigating supply chain risk, weather data analytics can help retailers predict and meet customer demand–rain, snow or shine.

 

The RHadoop packages make it easy to connect R to Hadoop data (rhdfs), and write map-reduce operations in the R language (rmr2) to process that data using the power of the nodes in a Hadoop cluster.
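
rmr2’s map-reduce is the same pattern Hadoop exposes to any language; for readers outside R, here is the canonical word-count shape as a Python Hadoop Streaming job (the script itself is a sketch, supplied to the hadoop-streaming jar as both mapper and reducer):

```python
# Word-count mapper and reducer in the Hadoop Streaming style: Hadoop
# pipes input through stdin and collects key/value pairs from stdout,
# much as rmr2 lets R functions play the mapper and reducer roles.
import sys

def mapper():
    # Emit one "<word>\t1" pair per word; Hadoop shuffles and sorts by key.
    for line in sys.stdin:
        for word in line.split():
            print("%s\t1" % word)

def reducer():
    # Input arrives sorted by key, so counts can be streamed group by group.
    current, count = None, 0
    for line in sys.stdin:
        word, n = line.rstrip("\n").rsplit("\t", 1)
        if word != current:
            if current is not None:
                print("%s\t%d" % (current, count))
            current, count = word, 0
        count += int(n)
    if current is not None:
        print("%s\t%d" % (current, count))

if __name__ == "__main__":
    # Run as: script.py map  (mapper)  or  script.py reduce  (reducer)
    mapper() if sys.argv[1] == "map" else reducer()
```

Hadoop sorts the mapper output by key before the reduce phase, which is what lets the reducer stream through one group at a time.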

 

Marketers have flocked in droves to the concept of real-time marketing (RTM). At its core, RTM is about reacting in the moment to events or conversations with relevant and timely content. Do you remember the Super Bowl a few years back when the lights went out due to a power cut? Oreo was on it, instantly tweeting an image of a cookie in the dark with the tagline, “You can still dunk in the dark.”

 

Identifying and resolving issues pertaining to technical debt often take a back seat, since development teams prefer to develop new features rather than perform refactoring to repay technical debt. The article emphasizes the need for a balance between feature development and technical debt repayment and outlines pragmatic strategies that software projects could adopt to manage technical debt.

 

This article is the fourth in an editorial series aimed at helping line-of-business leaders, working in conjunction with enterprise technologists, understand the opportunities big data presents for retailers and how Dell can help them get started. The guide also will serve as a resource for retailers that are farther along the big data path and have more advanced technology requirements.

 

Platfora introduced Big Data Discovery 5.0, which it bills as the single most powerful, flexible and complete Hadoop- and Spark-native analytics platform for data-driven organizations. Version 5.0 includes a more tightly integrated workflow with new features from Platfora and critical big data discovery technologies, including Apache Spark, SQL and Excel.

 

At the combined Agile Alliance and Agile Open Northwest Open Space event in Portland, Declan Whelan and Diana Larsen led two sessions in which they explored the application of the Agile Fluency model for understanding and addressing technical debt, and showed a game based on the model that teams can use to help them identify the practices and principles they want to adopt based on their fluency goals.

 

Tom Limoncelli explains the reasons for DevOps, how to choose which steps to automate and which not, enabling continuous deployment, and much more.

 

An agile workplace is one that is constantly changing, adjusting and responding to organizational needs. This piece looks at how to create an agile workplace and what the benefits are.

 

Earlier this year, a Gartner survey was a bit conservative on Hadoop adoption. But from our discussions in Seattle, what we heard is that enterprises foresee Hadoop as an integral component of their data architectures.

 

The McKinsey Global Institute has predicted that by 2018, the US alone could face a shortage of between 140,000 and 190,000 people with deep analytical skills, and a shortage of 1.5 million managers and analysts who can leverage data analysis to make effective decisions for their organisations. Big Data is not just another fad. In an increasingly digital world, Big Data plays a very important role in driving decisions, innovation and productivity in large multinationals, non-profits and governments. It can be used to analyse social media trends to formulate election strategy, evaluate meteorological data to predict the weather or even to analyse retail data to drive more sales.

 

Business professionals at all levels have asked me over the years what they should know that their data science department may not be telling them. To be candid, many data scientists operate in fear, wondering what they should be doing as it relates to the business. In my judgment, the questions below address both parties with the common goal of a win-win for the organization — helping data scientists support their organization as they should, and business professionals become more informed with each analysis.

 

“Finding a needle in a haystack” is perhaps the most overused quote of the data science trade, with most material promising to sift/burn/search the haystack faster than before to find the vast stack of needles it hides underneath. In reality, however, there is not always a needle and even when there is, we may not know what it looks like. Within the context of data science, the haystack is the mass of available data while the needle is the valuable insight or answer. Metaphor-extending shenanigans aside, this idiom is misleading: it implies we know what the needle looks like, and also that the insight is distinctly different from the data; in fact insights come from the data. Too often, a failure to find the needle is attributed to the staff performing the analysis or the technology used. In a previous post about the scientific method, I argued that there was no such thing as failure: a hypothesis can be rejected or not, but both are valid answers in the eyes of science (if not in the eyes of the business).

 

Mario Areias presents the challenges in delivering care in Haiti, dealing with legacy code and infrastructure limitations.

 

Hadoop co-creator Doug Cutting said today that Apache Spark is “very clever” and is “pretty much an all-around win” for Hadoop, adding that it will enable developers to build better and faster data-oriented applications than MapReduce ever could. Cutting talked at length about Spark during today’s Cloudera webinar, titled “Uniting Spark and Hadoop: The One Platform Initiative.” The Hadoop distributor, which employs Cutting as its chief architect, has ramped up the pro-Spark messaging since the launch of the One Platform Initiative earlier this month.

 

Hadi Michael explores the elements commonly found on developer portals, and identifies those that consistently contribute to superior developer experiences.

 

The global population is forecast to top 7.7 billion human beings by 2020. As weather patterns change and, with them, global agricultural production, a new dataset containing the genome sequences of more than 3,000 rice varieties is being made available to researchers working to figure out how to feed the world. The International Rice Research Institute (IRRI) along with the Chinese Academy of Agricultural Sciences and BGI Shenzhen compiled the 120-terabyte dataset, which is available now on the Amazon Web Services (AWS) cloud platform. The project sequenced the genomes of 3,024 rice varieties from 89 countries. AWS said this week the consortium partnered with DNAnexus Inc., which operates a cloud platform for sharing genomic data and tools, to process source genomic data across 37,000 AWS compute cores. The process took two days.
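
AWS public datasets are typically read with anonymous S3 access; a minimal boto3 sketch (the bucket name below is a placeholder, not the dataset’s actual location) would look like this:

```python
# Sketch of listing an AWS public dataset with anonymous (unsigned)
# S3 access. The bucket name is a placeholder, not the real bucket.
import boto3
from botocore import UNSIGNED
from botocore.client import Config

s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))

# List the first few objects in the (hypothetical) genome bucket.
resp = s3.list_objects(Bucket="example-rice-genomes", MaxKeys=10)
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["Size"])
```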

 

ARIN, the resource registry that hands out allocations for IPv4 addresses, has announced that it has no more IPv4 addresses to give out. Although this doesn’t mean no more IPv4 addresses will ever be allocated, it settles the question of when such addresses would run out. Meanwhile, IPv6 usage continues to climb with the release of iOS 9.

 

In theory, the operations team determines what the thresholds for warnings and alerts should be. But in practice, the operations team often has no idea what these values should be. Using machine learning techniques such as adaptive thresholds, Splunk ITSI solves this problem.
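
The core idea behind adaptive thresholds is easy to sketch: derive the alert boundary from the metric’s own recent behavior instead of a fixed constant. A minimal numpy version (the general technique, not Splunk ITSI’s actual algorithm) flags points more than k standard deviations from a rolling baseline:

```python
# Adaptive-threshold sketch: alert when a metric strays more than
# k standard deviations from its own rolling baseline. This shows the
# general technique, not Splunk ITSI's proprietary implementation.
import numpy as np

def adaptive_alerts(values, window=60, k=3.0):
    values = np.asarray(values, dtype=float)
    alerts = []
    for i in range(window, len(values)):
        baseline = values[i - window:i]        # trailing window only
        mu, sigma = baseline.mean(), baseline.std()
        # The threshold adapts as the baseline statistics drift.
        if sigma > 0 and abs(values[i] - mu) > k * sigma:
            alerts.append(i)
    return alerts

# Example: a steady, slightly noisy signal with one spike at index 80.
metric = np.ones(100) + np.random.normal(0, 0.05, size=100)
metric[80] += 5
print(adaptive_alerts(metric, window=30))      # typically prints [80]
```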

 

PHEMI, the company delivering privacy, security, and governance for big data, introduced Zero Trust Data, an innovative, comprehensive approach to remove critical roadblocks standing in the way of enterprises aiming to become data-driven.

 

The DBA’s primary job is to ensure that the business’s information is always available, with performance coming in a close second. We’ve already talked about optimizing distributed queries in Splunk and map-reduce queries in Hunk. In this report we expand upon that with more information that a DBA needs to know about Splunk databases.

 

Big data is becoming an increasingly important part of the business plan for companies in many different industries. Analyzing large customer datasets and other kinds of data with tools like Hadoop reporting lets companies save money as well as boost revenue by targeting their marketing better, designing products that better appeal to their customers, making better predictions, and so on. On the other hand, this rise in the use of big data has coincided with the rise of advanced persistent threats to data security.

 

Pharmaceutical companies spend billions testing prospective drugs by conducting “wet lab” experiments that can take years to complete. But what if the same results could be obtained in a matter of minutes by running computer model simulations instead? A Silicon Valley startup says it has created a novel machine learning algorithm that does just that. twoXAR (pronounced “two-czar”) was founded last year by two men both named Andrew Radin (more on that later). The Radins were interested in using advances in data science and large-scale computing to speed up the pace of drug discovery, which would give pharmaceutical companies better candidates for clinical studies.

 

Since the partnership between Hortonworks and SAS, we have created some awesome assets (e.g., a SAS Data Loader sandbox tutorial, educational webinars and an array of blogs) that have given Hadoop and Big Data enthusiasts hands-on training with Apache Hadoop and SAS’ powerful analytics solutions.

 

ScyllaDB is releasing a new NoSQL data store that delivers 10x the performance of Apache® Cassandra, with full compatibility with the CQL query language.

 

Data scientists may be of a different breed from other analytics team members, but they are essential for bringing to the table curiosity about data and an unquenchable thirst for finding patterns and relationships in that data. Discover how combining the roles of data scientist, business analyst, IT professional and database administrator into a solid analytics team can benefit organizations with insightful outcomes for making well-informed decisions.

 

The insurance industry is poised to leverage business intelligence from Internet of Things data to offer policyholders value-added services and process optimization. But these opportunities require getting beyond mere point solutions and deploying a comprehensive, scalable and flexible Internet of Things platform for insurance implementations.

 

Financial analytics software transforms the chief financial officer role from number cruncher to strategic business partner. This is accomplished with faster, more accurate forecasting, simplified and less conflict-laden planning and quicker decision making with real-time, trusted financial metrics.

 

Predictive Analytics World is returning to London on 28-29 October! Predictive Analytics World is the leading cross-vendor event for predictive analytics professionals, managers and commercial practitioners. PAW focuses on concrete examples of deployed predictive analytics and delivers case studies, expertise and resources to achieve:

Bigger wins: Strengthen the impact of predictive analytics deployment
Broader capabilities: Establish new opportunities in data science
Big data: Leverage bigger data for prediction and drive bigger value

Join PAW London to learn exactly how top practitioners deploy predictive analytics, and the business impact it delivers.

 

By Joseph Rickert. This week, the Infrastructure Steering Committee (ISC) of the R Consortium unanimously elected Hadley Wickham as its chair, thereby also giving Hadley a seat on the R Consortium…

 

Could Governments Use Pricing Analytics in Pharma? After the controversies surrounding Martin Shkreli, could data stop price gouging? (Published 24 September 2015; tags: Analytics, Big Data, Data Science, Predictive Analytics, Strategy.)

 

Learn how the UNC Health Care System developed an advanced care insights solution to convert unstructured data into useful alerts and reports designed to help physicians and patient care managers enhance patient care while cutting readmission rates.

 

In our world of ever more connected devices, remote control is poised to bring significant benefits but also poses great risk. Whether installing firmware updates or guarding against unauthorized access, manufacturers must design their devices to ensure both safety and security.

 

Selling your future IT Strategy to the board is a great opportunity to make your mark, but your vision is unlikely to gain traction if your sales pitch relies on pure tech speak.

 

The adage “the customer is king” has never been more relevant than it is now. It’s imperative for you to understand exactly how customers think and behave, especially if you seek a strong ROI from your customer data analytics.

 

While the last two years or so have welcomed the advent of NoSQL databases with unbridled enthusiasm, there are still many obstacles which must be overcome before they can become fully accepted among the more established enterprises. Below are a few of these obstacles:

1. Less mature. RDBMSs have been around a lot longer than NoSQL databases. The first RDBMS was released into the market about 25 years ago. While proponents of NoSQL may present this as a disadvantage, citing that age is an indicator of obsolescence, with the advancement of years RDBMSs have matured to become richly functional and stable systems. In contrast…

 

An article in The New York Times touched off a spirited debate about the work culture at Amazon, now the most valuable retailer in the country. One of the central issues at hand is that Amazon uses data not only to provide an exceptional customer experience, but also to manage its staff and improve productivity. To many of us, this news comes as no surprise. For years now, companies of all sizes have been increasingly turning to data analytics to improve employee engagement and performance. I also see it as indicative of the broad societal trends happening in today’s “data culture.”

 

 

IBM has written a new eight-page white paper, “IBM Storage with OpenStack Brings Simplicity and Robustness to Cloud,” reviewing the increasingly popular OpenStack cloud platform and the capabilities that IBM storage solutions provide to enable and enhance OpenStack deployments.

 

At this year’s nginx.conf, Nginx has announced a preview of nginScript, a JavaScript-based server configuration language. Meant to accompany existing scripting offerings like Lua, nginScript will give technologists with experience in JavaScript a lower barrier to entry to create more advanced configuration and delivery options.

 

In this book excerpt published on SearchBusinessAnalytics.com, I write about why taking a siloed approach to creating a BI architecture framework leads to problems. The excerpt is from chapter 4 of my book Business Intelligence Guidebook: From Data Integration to Analytics. In the chapter, I provide insight as to how the BI environment in many organizations has been waylaid by the siloed approach to IT and application development. I also explain the benefits of a comprehensive and well-planned BI architecture strategy, and list the four architectural layers of a BI framework. View the excerpt on SearchBusinessAnalytics.com

 

Swiss Postal Services has used scaled Scrum with seven teams to replace a legacy system. InfoQ interviewed Ralph Jocham about how they scaled Scrum and dealt with legacy issues, using a definition of done, how they managed to deliver their system three months earlier than planned, and the main learnings from the project.

 

Today a study will come out saying that Spark is eating Hadoop — really! That’s like saying SQL is eating RDBMSes or HEMIs are eating trucks. Spark is one more execution engine on an overall platform built of various tools and parts. So, dear pedants, if it makes you feel better, when I say “Hadoop,” read “Hadoop and Spark” (and Storm and Tez and Flink and Drill and Avro and Apex and …).

 

Ryan Polk talks about the future direction of the Rally product and the merger with Computer Associates.

 

As infrastructure becomes code, reviewing (and testing) provides the confidence necessary for refactoring and fixing systems. Reviews also help spread consistent best practices throughout an organization and are applicable where testing might require too much scaffolding.

 

Puppet Labs’ latest version of Puppet Enterprise – version 2015.2 – includes new features like node graph visualization, inventory filtering, and a VMware vSphere module. It provides users with major enhancements of the Puppet Language and an updated web UI. InfoQ spoke with Michael Olson, Senior Product Marketing Manager at Puppet Labs.

 

Cloud computing can be a highly competitive market, both for the companies that provide a cloud service and the employees that help those companies excel. To better compete, employees whose jobs are associated with the cloud often need to adopt and cultivate a long list of valuable skills that providers and other businesses will want to see from their workers. Given the evolving nature of the cloud industry, this list is constantly changing, with some skills receiving more attention and importance than others.

 

If you could handle all of the data you need to work with on one machine, there would be no reason to use big data techniques. So clustering is pretty much assumed for any installation larger than a basic proof of concept. In Splunk Enterprise, the most common type of cluster you’ll be dealing with is the Indexer Cluster.

 

Volkswagen’s Martin Winterkorn says the company needs new leadership to win back trust.

 

During the second technical keynote at SpringOne2GX last week Guillaume Laforge talked about plans for Groovy 2.4.x and 2.5. Perhaps the most significant is improved compiler performance with a new Abstract Syntax Tree (AST) class reader in place of using class loading tricks.

 

Apple’s acquisition of the mapping visualization startup Mapsense adds to the growing momentum behind analyzing and rendering location data. According to multiple reports, Apple (AAPL) acquired the San Francisco-based startup recently for an estimated $30 million. Mapsense was founded in 2013 by Erez Cohen, who previously worked for data analytics specialist Palantir Technologies.

 

Unprecedented opportunities await enterprises that are involved with the Internet of Things–but only if they apply analytics to their production or operational processes with Predictive Asset Optimization. This addition can help prevent costly delays, maximize assets and improve the consumer experience in the long run.

 
