Big Data News – 09 May 2016

Today's Infographic Link: The Magnificent Multitude of Beer

Top Stories
The prpl (pronounced "purple") Foundation, a non-profit, community-driven, collaborative, open source foundation, has published a peer-reviewed guidance document on IoT security. "I would even suggest starting a prpl Foundation working group to engage other IoT vendors," said Mike Janke, chairman & co-founder of Silent Circle, an encrypted communications firm based in Switzerland.

Performing programmatic actions on data across services is quite possible in today's technology ecosystem. It is now also possible to transfer data across services such as the dashDB data warehouse and deploy it in new environments. However, the questions customers most often ask center on "how?" Learn a strategy for addressing these questions.

The first-ever global topographic map of Mercury was released by the U.S. Geological Survey, Arizona State University, Carnegie Institution of Washington, Johns Hopkins University Applied Physics Laboratory and NASA. It was built from the largest dataset ever processed by the USGS Astrogeology Science Center.

The data science automation specialist DataRobot Inc. is gaining traction in the big data market for its machine-learning application as new investors like Intel Capital fund its expanding operations. Boston-based DataRobot has so far raised more than $57 million in four equity investment rounds, including a $33 million funding round completed in February.

Mobile usage has reached an all-time high, and the number of mobile-only Internet users continues to rise. In 2016, 80 percent of North America will access the Web through their mobile devices, meaning that brands need to focus on the mobile experience to remain competitive.

The famous tech startup bootcamp Y Combinator is best known for connecting promising upstart tech businesses with venture capital. Its second, lesser-known but equally important function is to coach the owners of those startups toward the best possible business model for their organization. What that model turns out to be is determined by many factors, such as the character, drive, and interests of the people involved, their range of expertise, and their projected potential to fill some void in the market.

In order to improve customer loyalty, we need to listen to our customers more. Increasing our share of wallet and maximising customer lifetime value (CLV) will only happen when customers are prepared to choose our brands and products over our competitors. Fortunately, businesses are uncovering clues to improving customer loyalty in new places thanks to data and evolving analytics platforms. This led one bank in Canada to turn to the thousands of conversations between its representatives and its customers for new insights into elevating the customer experience.

In anticipation of his upcoming Predictive Analytics World for Manufacturing conference keynote presentation, Changing the Way we Make Things: The Brilliant Factory, we interviewed Dr. Matteo Bellucci, Manager, Process System Lab at General Electric. View the Q-and-A below for a glimpse of what's in store at the PAW Manufacturing conference. Q: What are the challenges in translating the lessons… The post Wise Practitioner – Manufacturing Predictive Analytics Interview Series: Dr. Matteo Bellucci at General Electric appeared first on Predictive Analytics Times.

In this special guest feature, Tara Kelly of SPLICE Software discusses the link between Data Hygiene and Customer Retention and also how to gauge where your data stands so you can clean it up and reap the rewards.

Above the Trend Line, machine learning industry rumor central, is a recurring feature of insideBIGDATA. In this column, we present a variety of short, time-critical news items such as people movements, funding news, financial results, industry alignments, rumors and general scuttlebutt floating around the big data, data science and machine learning industries, including behind-the-scenes anecdotes and curious buzz.

Stockpile makes it possible to invest in the likes of Apple, Netflix, Facebook, Google and Coca-Cola by simply purchasing a gift card for the stock.

I am a dinosaur. The reports of the impending death of the PC are not greatly exaggerated. The PC is yesterday's technology, its demise hastened by smartphones, tablets and the cloud-computing services that make large-capacity computing devices unnecessary. But I am part of a shrinking minority, still doing most of my work on PCs. In my case, they are Linux-powered PCs, but that is not the point. I would still be a dinosaur if I used Mac OS X or Windows. You've probably seen the numbers that show what an oddity I am. Gartner reported that worldwide PC shipments dropped by 9.6% in 2016's first quarter. IDC said: No, it was worse. By its count, the PC market dropped by 11.5%. The two firms agreed that fewer than 65 million units shipped. That's the worst the PC market has been since 2007. Even Apple saw Mac sales drop by 12% year over year in its latest quarter.

A summary of five CS papers chosen from the 66 that Adrian Colyer has reviewed for his Morning Paper blog during the first quarter of 2016. Topics include distributed transactions, transaction recovery, and HyperLogLog.

The first Agile Alliance Technical Conference was recently held. The conference focused on the strong technical skills needed to make agile software development effective and covered a wide range of technical topics. There were two keynote talks: Sandi Metz on the challenges to professionalism, and Uncle Bob Martin on why it is so important. The conference videos are now available.

Data generation, collection, and analysis are making their way into more types of products and services. The trend is creating new opportunities for innovation, some of which are so impactful, they're causing some companies to revisit their business models. The path to success isn't always obvious, however, so here are a few best practices to keep in mind.

The huge volume of network data has made it all but impossible for the good guys to detect new security threats, which has created space for the bad guys to operate. But thanks to a new Apache big data project called Open Network Insight (ONI), the good guys now have a powerful way to cut through the noise and identify bad guys and their malicious schemes.

Contact centers around the globe are running two sets of expensive software: CRM and Real-Time Communications. Tsahi Levent-Levi shows how, by integrating WebRTC, companies can become more flexible and save money. Using only a browser, with no additional software or plug-ins to install, call centers can distribute their work force around the globe. By Tsahi Levent-Levi

News: Dataminr sells data analytics information to US intelligence agencies and helps them in detecting terrorist activities.

Apache Hadoop is a well-known, de facto framework for processing large data sets through distributed and parallel computing. YARN (Yet Another Resource Negotiator) allowed Hadoop to evolve from a simple MapReduce engine to a big data ecosystem that can run heterogeneous (MapReduce and non-MapReduce) apps concurrently. This results in larger clusters with more workloads and users than ever before.

The Salt River Project turns to analytics and operational intelligence software to chart the future of asset performance and its impact on grid reliability.

Perhaps more than any other, the healthcare industry is undergoing dramatic upheaval. Changes in reimbursement models, continued merger and acquisition activity, regulatory requirements, and the increased focus on quality of patient care are forcing many hospitals and healthcare providers to reinvent themselves. Patients and customers are driving change, and healthcare providers struggle to follow their lead. In short, many in the healthcare industry are working to build a new ecosystem with the patient and consumer at their core. One of the top challenges for information technology leaders in healthcare is aligning technology capabilities with the needs of the organization, regulatory requirements and wants of the patient. That is driving tremendous interest in data analytics.

(cross-posted to Bob German's Vantage Point) This week Microsoft mapped out a bold new plan for SharePoint. Microsoft is investing heavily to modernize the product to make it work as well in the Device and Cloud era as it once did when the Web was still shiny and new. This article will explain how these changes could affect your organization's SharePoint plans and how you can start preparing. If you missed the announcement, this is a good place to start. Here are some of the highlights: Modern user experiences: New SharePoint mobile apps for iOS, Windows, and Android; extensive website updates including responsive, mobile-friendly team sites and list and library UIs; the SharePoint Framework, a new development platform for developing the new responsive web pages and client-side web parts. Modern pages and web parts can be added to existing sites to allow a transition.

A spate of high-profile public and private cloud security breaches is helping to push advancements in security such as encryption. Here's a look at 7 ways the cloud may be the largest driver of IT security today.

Dell announced a major new release of its award-winning Statistica advanced analytics platform, Dell Statistica version 13.1. This latest version delivers a host of capabilities designed to empower citizen data scientists, help organizations better address growing IoT analytics requirements, and better leverage increasingly heterogeneous data environments.

InformationWeek unveiled its Elite 100 winners this week at its annual conference in Las Vegas. The top 5 projects this year all put the focus on data and analytics. Here's our Big Data Roundup for the week ending May 8, 2016.

As the body of data that governmental and public bodies hold continues to increase exponentially, a strong drive is under way to put more of this information in public hands. However, there is great debate over exactly where value can be found from opening up your data to the public. To take a deeper dive into this issue, I caught up with some speakers from the upcoming CDO Forum, Government (6-7 June, Washington D.C.).

In this contributed article, Russ Elsner, Architect, Office of the CTO at ScienceLogic, discusses the challenges of running a modern IT operations infrastructure where innovation is key.

Dave Farley discusses using acceptance testing to work quickly and effectively, building functional coverage for complex enterprise-scale systems, and managing and maintaining those tests. By Dave Farley

Mercurial has reached version 3.8. This release brings cHg, a new Mercurial command server client aimed at improving access to the Mercurial API and circumventing potential licensing issues. Additionally, Mercurial 3.8 brings improvements to many commands and extensions, as well as various performance improvements.

6sense, a predictive intelligence platform for B2B marketing and sales, announced the launch of 2sense, a first of its kind product to solve the biggest problem plaguing marketing and sales teams — understanding buyer timing.

Chris Young and Kate Gray talk about applying methods used in political campaigns to the workplace to achieve goals and to influence and change a situation for the better. By Chris Young, Kate Gray

EnterpriseDB® (EDB™), a leading enterprise Postgres database company, announced that EDB Postgres™ Advanced Server, an enhanced version of PostgreSQL for mission-critical enterprise workloads, has been certified for the Hortonworks Data Platform (HDP).

In recent blogs, I have been distinguishing between quantitative data and narrative data. I believe that I separated the two forms relatively well. Although I originally focused on the differences in data in order to give narrative "its own space," actually there can be a symbiotic relationship between the two types of data. In my last blog, I said that quantitative data can be incorporated into narrative data. In my submission today, I will be discussing how the narrative can be used to develop metrics. There are various reasons one might wish to do so. If the narrative is repetitive – and there is some desire for its control or regulation – the metrics can be used to monitor future cases and determine the extent to which intervention has been successful. If there is a need to determine how to direct assets, the allocation can be premised on such metrics.

A federal district court denies Facebook's request to dismiss a case that alleges that the social media giant violated Illinois biometric privacy laws with its photo-tagging feature.

Basho Technologies, the creator and developer of Riak® KV and Riak® TS, the resilient NoSQL databases, announced the latest release of Basho Riak TS, version 1.3. Riak TS is an enterprise-grade NoSQL database optimized for Internet of Things (IoT).

Agile Marketing, as a concept, has been talked about quite a lot in the recent past and it is much more than just a hot new buzzword. Bhoomi Mehta takes a brief look at what Agile Marketing is and how it works. She also delves into other aspects that affect marketing processes and projects to understand how they can help your organization get an edge over its competition. By Bhoomi Mehta

Cornelia Davis talks to Rags Srinivas about the importance of software transformation and of feedback, Continuous Integration and Delivery, and how culture and technology play a role in the transformation process. By Cornelia Davis

Digital marketing is the promotion of products or brands via one or more forms of digital technologies to attract, engage and convert customers online. For a business to succeed in today's Digital…

Sam Adams talks about the mature pipeline and practices of LMAX Exchange. He demonstrates the variety of testing they have built in, how good test isolation has enabled them to extend their functional tests into live monitoring of production, and how a commitment to incremental delivery, quality and automation have created a sustainable environment for producing great software fast. By Sam Adams

Sid Anand discusses how Agari is applying Big Data best practices to the problem of securing its customers from email-borne threats. Anand presents a system that leverages the best of both the cloud (AWS's SNS, SQS, Kinesis, Auto-scaling, S3, Lambda, API Gateway, etc…) and Big Data (Spark, Airbnb's Airflow, etc…) in a maintainable way. By Sid Anand

Richard Astbury demonstrates three new programming languages and discusses how they will affect the future direction of computer programming. By Richard Astbury

Big data has made a huge mark on virtually every industry on earth. From manufacturing goods to transporting them, from assessing financial risk to determining security risk, big data has you covered. But no industry has embraced and utilized big data more than marketing. Big data tells marketers whom to market to, what to market, where, and how. In addition to revolutionizing marketing in a general way (by telling you which customers might be most interested in your add-ons or upgrades, or which customers want the option to pay by check instead of credit card), it is also upending the content marketing biz. Here's how. Big Data Tells You What Questions Your Leads & Customers Need Your Content to Answer: The right content improves conversion rates, drives customer service ratings upwards, and can even improve the lifetime value of your existing customer relationships.

The United States contains nine cities named Rome. Like almost everywhere in the USA, those cities are connected by roads, and from every point in the country, one of those cities can be reached by…

News this week included the Community Connect program, IoT costs, broader Windows 10 app distribution, Linux badges, and quantum computing for everyone.

Six common challenges health care executives experience when it comes to data governance and how they can overcome them.

Spare5 announced general availability of its Intelligent Crowdsourcing Platform to provide product leaders the human insights they need to act on their unstructured data.

By Dr. Thomas Wiecki, Lead Data Scientist at Quantopian Earlier this year, we used DataRobot to test a large number of preprocessing, imputation and classifier combinations to predict out-of-sample performance. In this blog post, I'll take some time to first explain the results from a unique data set assembled from strategies run on Quantopian. From these results, it became clear that while the Sharpe ratio of a backtest was a very weak predictor of the future performance of a trading strategy, we could instead use DataRobot to train a classifier on a variety of features to predict out-of-sample performance with much higher accuracy.
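For readers unfamiliar with the metric, the Sharpe ratio mentioned above is commonly computed as the mean excess return divided by the standard deviation of returns, scaled to a yearly figure. The sketch below is illustrative only: the function name, the 252-trading-day annualization factor, and the sample returns are assumptions, not details taken from the Quantopian post.

```python
import math
from statistics import mean, stdev


def annualized_sharpe(daily_returns, risk_free_daily=0.0, periods_per_year=252):
    """Annualized Sharpe ratio of a return series (hypothetical helper).

    Assumes simple per-period returns and a constant per-period
    risk-free rate; both are illustrative simplifications.
    """
    excess = [r - risk_free_daily for r in daily_returns]
    if len(excess) < 2:
        raise ValueError("need at least two returns")
    spread = stdev(excess)  # sample standard deviation of excess returns
    if spread == 0:
        raise ValueError("returns have zero volatility")
    return mean(excess) / spread * math.sqrt(periods_per_year)


# Made-up daily returns, for illustration only.
sample_returns = [0.01, -0.005, 0.007, 0.002, -0.003]
sharpe = annualized_sharpe(sample_returns)
```

A backtest Sharpe ratio computed this way is the kind of single-number feature the post describes as a very weak predictor of out-of-sample performance.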

You might think that in today's big data world it's all about advanced analytics, but TechTarget's IT Priorities survey shows basic BI software tools are still a hot commodity.

Microsoft MVP Doug Finke asks "Who moved the cheese" about Microsoft's new modern editor, Visual Studio Code. It's code editing redefined: free, open source, and able to run everywhere. Features include intelligent editing, powerful debugging, built-in Git support, and hundreds of extensions. At Doug's Blog: Development In A Blink

With the advent of cloud computing and fully virtualized infrastructure, the foundation for utility computing is finally in place.

Genymotion Cloud provides access to Android emulation software; makes it simpler to build apps that run on top of multiple distributions of Android.

The world is all abuzz talking about big data and using analytics to help improve business results. Ironically, software teams rarely use data to analyze and improve their own processes. Many internal software teams today don't track fairly simple metrics, including how often code is submitted, by whom, and the quality impact. By monitoring these basic activities and others, software teams can identify patterns of behavior and best practices. That's not to say that engineering managers don't see the need for improvement. Software teams are under ever-tighter deadlines, often operating with less than the ideal number of team members or lacking enough of the needed skill sets. Testers, in particular, are under-represented in many organizations. Most software leaders know that efficiency is the only way to get ahead, or even to stay afloat.
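As one illustration of the "fairly simple metrics" mentioned above, the sketch below tallies how often code is submitted and by whom from a list of commit records. The record format (author name plus ISO date) and the function names are hypothetical; a real team would pull this data from its own version control system.

```python
from collections import Counter
from datetime import date


def commits_per_author(commits):
    """Count commits per author from (author, iso_date) tuples."""
    return Counter(author for author, _ in commits)


def commits_per_week(commits):
    """Count commits per ISO (year, week) from (author, iso_date) tuples."""
    counts = Counter()
    for _, iso_day in commits:
        year, week, _ = date.fromisoformat(iso_day).isocalendar()
        counts[(year, week)] += 1
    return counts


# Made-up commit log, for illustration only.
commit_log = [
    ("alice", "2016-05-02"),
    ("bob", "2016-05-03"),
    ("alice", "2016-05-09"),
]
by_author = commits_per_author(commit_log)
by_week = commits_per_week(commit_log)
```

Counts like these cover only submission frequency and authorship; measuring the quality impact would need additional data, such as defect or test results, which this sketch does not model.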

R Tools for Visual Studio, the open-source extension to Visual Studio that provides an IDE for the R language, has been upgraded to include several new features. The latest update, RTVS 0.3, now includes: an R package manager, allowing you to review, install, and uninstall packages using a convenient user interface; a Variable Explorer that now allows you to open data frames for viewing in an Excel workbook; new toolbar buttons to run selected code, source the current script, import data from a URL or file, and start/stop a Shiny app; and new shortcut keys to open various IDE windows.

Guest blog post by Bob Hayes, PhD, originally posted here. Bob is the Chief Research Officer at AnalyticsWeek and President of Business Over Broadway. He graduated from Bowling Green State University, and he is currently living in Seattle. Data scientists rely on tools/products/solutions to help them get insights from data. Gregory Piatetsky of KDnuggets conducts an annual survey of data professionals to better understand the different types of tools they use. Here are the results of the 2015 survey.

"Beyond improving profits and cutting down on wasted overhead, Big Data in healthcare is being used to predict epidemics, cure disease, improve quality of life and avoid preventable deaths." Big data is everywhere and if you are not already embracing it then you are likely to be left behind. Companies are using big data to be able to target potential clients/customers as well as make decisions that will affect their bottom line. Most companies focus on data that will assist them personally and pay less attention to data that will help their customers.  

The job of data scientist — the quintessential big data job, and the job that was just voted the best job in America for 2016 — is at risk. Data scientists have been called "unicorns" because finding the right person with the right set of skills — including coding, statistics, machine learning, database management, visualization techniques, and industry-specific knowledge — could be practically impossible.  But machine learning and big data itself may be making those unicorns as obsolete as they are mythical. New machine learning algorithms can autonomously analyze data and identify patterns, even interpret the data and produce reports and data visualizations.

What builds brand loyalty? Speaking to consumers with the right message, in the right channel at the right time. Transform your retail and consumer products business through the use of data analytics. Register for IBM Amplify 2016, to be held at the Tampa Convention Center from May 16 to 19.

Learn how to use Cloudera Director, Microsoft Active Directory, and Centrify Express to deploy a secure EDH cluster for workloads in the public cloud.  There are several best practices for deploying a secure Apache Hadoop-powered enterprise data hub (EDH) cluster on Amazon Web Services (AWS), including use of Centrify Express for Linux-to-Active Directory host integration and Microsoft Active Directory as the core integration point for identity, authentication, authorization, and public key infrastructure (PKI). The post How-to: Deploy a Secure Enterprise Data Hub on AWS appeared first on Cloudera Engineering Blog.
