Big Data News – 25 Jul 2016

Today's Infographic Link: Are You Addicted to Your Work?

Featured Article
The online world isn't as simple as we've thought it to be. Behind the seemingly quiet and vast space of nothingness, huge amounts of data are uploaded and downloaded in fractions of a second. Data science not only keeps track of these numbers but also attempts to analyze and organize them. Algorithms are created to keep tabs on search-engine queries, analyze user data preferences, and so on. The demand for qualified data scientists has become very high.

Top Stories
Watson's AI will help you find shoes at Macy's, Workday buys big data analytics company, Google releases machine learning APIs, and more in our Big Data Roundup for the week ending July 24.

Cloudera announced the general availability of Cloudera Navigator Optimizer, alongside the production release of Cloudera Enterprise 5.8.

insideBIGDATA is nearing the completion of its 2016 audience survey and we'd like to remind any of our remaining loyal readers to give us your thoughts so we can learn what's important to you.

Competitors can't compete, and leaders can't lead, if they don't know the rules of the game. Understanding how points are scored and what is required to win is key to any competition. In the age of digital transformation there are key rules to learn: data is the modern commercial battlefield, and information dominance is the strategic goal.

If you are within a stone's throw of the DevOps marketplace you have undoubtedly noticed the growing trend in Microservices. Whether you have been staying up to date with the latest articles and blogs or you just read the definition for the first time, these 5 Microservices Resources You Need In Your Life will guide you through the ins and outs of Microservices in today's world.

Once upon a time in IT, using open source simply meant Linux instead of Windows, or maybe MySQL instead of Oracle. Now, there is such a huge diversity of open source tools, and almost every leading…

This digest provides an overview of good resources that are well worth reading. We'll be updating this page as new content becomes available, so I suggest you bookmark it. Also, expect more digests to come on different topics that make all of our IT-hearts go boom!

"We've discovered that after shows 80% if leads that people get, 80% of the conversations end up on the show floor, meaning people forget about it, people forget who they talk to, people forget that there are actual business opportunities to be had here so we try to help out and keep the conversations going," explained Jeff Mesnik, Founder and President of ContentMX, in this SYS-CON.tv interview at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York City, NY.

I wanted to gather all of my Internet of Things (IOT) blogs into a single blog (that I could later use with my University of San Francisco (USF) Big Data "MBA" course). However, as I started to pull these blogs together, I realized that my IOT discussion lacked a vision; it lacked an end point towards which an organization could drive their IOT envisioning, proof of value, app dev, data engineering and data science efforts. And I think that the IOT end point is really quite simple…

Node.js and io.js are increasingly being used to run JavaScript on the server side for many types of applications, such as websites, real-time messaging and controllers for small devices with limited resources. For DevOps it is crucial to monitor the whole application stack, and Node.js is rapidly becoming an important part of the stack in many organizations. Sematext has historically had strong support for monitoring big data applications such as Elastic (aka Elasticsearch), Cassandra, Solr, Spark, Hadoop, and HBase, as well as more traditional databases, web servers like Nginx, Nginx Plus and Apache, Java applications, cache servers like Redis and Memcached, messaging middleware like everyone's darling Kafka, etc. With such rapid adoption of Node.js and now io.js, we'd be remiss not to add performance monitoring, alerting, and anomaly detection for them in SPM!

Actian Corporation has announced the latest version of the Actian Vector in Hadoop (VectorH) database, generally available at the end of July. VectorH is based on the same query engine that powers Actian Vector, which recently doubled the TPC-H benchmark record for non-clustered systems at the 3000GB scale factor (see tpc.org/3323).

WebRTC is bringing significant change to the communications landscape that will bridge the worlds of web and telephony, making the Internet the new standard for communications. Cloud9 took the road less traveled and used WebRTC to create a downloadable enterprise-grade communications platform that is changing the communication dynamic in the financial sector. In his session at Things Expo, Leo Papadopoulos, CTO of Cloud9, discussed the importance of WebRTC and how it enables companies to focus on building intellectual property into their platforms that support customer needs, while also providing the performance, service, and support levels expected by Fortune 100 companies.

Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week. Announcement: Explore how government agencies are using predictive analytics to solve real-world problems at the sixth annual Predictive Analytics World for Government conference, October 17 through 20 in Washington, D.C. At the largest and only vendor-neutral conference for government data science, see how better data analysis can create stronger, more resilient government agencies. Register here.

This is another topic that has taken me a long time to write, but several conversations with Peter Burris (@plburris) from Wikibon finally helped me to pull this together. Thanks Peter! I've struggled to understand and define the Intellectual Capital (IC) components, or dimensions, of the new Big Data organization; that is, what are the new Big Data assets that an organization needs to collect, enrich and apply to drive business differentiation and competitive advantage?

Adding public cloud resources to an existing application can be a daunting process. The tools that you currently use to manage the software and hardware outside the cloud aren't always the best tools to efficiently grow into the cloud. All of the major configuration management tools have cloud orchestration plugins that can be leveraged, but there are also cloud-native tools that can dramatically improve the efficiency of managing your application lifecycle.

In this special guest feature, Rodger Howell, an Advisory principal for PwC's Strategy, provides a number of examples of the ways in which organizations have implemented data analytics practices to increase revenue, boost operating excellence, improve compliance and streamline financial reporting.

The infographic below summarizes the results of a new research report surveying 235 business executives – "Outlook on Artificial Intelligence in the Enterprise 2016" sponsored by Narrative Science.

Identity is in everything and customers are looking to their providers to ensure the security of their identities, transactions and data. With the increased reliance on cloud-based services, service providers must build security and trust into their offerings, adding value to customers and improving the user experience. Making identity, security and privacy easy for customers provides a unique advantage over the competition.

When the Internet of Things (IoT) started to emerge as a popular topic, I had to stop and ask myself if I was once again going to provide commentary on this emerging field. I enjoy exploring new technology shifts and illustrating how they can benefit various industries and businesses. It's what I've done for the past 20 years through Java, XML, Web Services, SOA, Cloud and DevOps. However, every time I started writing on IoT I seemed to run into the same conundrum: am I commenting on this to jump on the hype bandwagon, or because I see a need to represent the pragmatics of implementing and adopting this technology?

For basic one-to-one voice or video calling solutions, WebRTC has proven to be a very powerful technology. Although WebRTC's core functionality is to provide secure, real-time p2p media streaming, leveraging native platform features and server-side components brings up new communication capabilities for web and native mobile applications, allowing for advanced multi-user use cases such as video broadcasting, conferencing, and media recording.

This is a no-hype, pragmatic post about why I think you should consider architecting your next project the way SOA and/or microservices suggest, whether it's a greenfield approach or you're in dire need of refactoring. Please note: considering still keeps open the option of not taking that approach.

With over 720 million Internet users and 40–50% CAGR, the Chinese Cloud Computing market has been booming. When talking about cloud computing, what are the Chinese users of cloud thinking about? What is the most powerful force that can push them to make the buying decision? How to tap into them? In his session at 18th Cloud Expo, Yu Hao, CEO and co-founder of SpeedyCloud, answered these questions and discussed the results of SpeedyCloud's survey.

Sharding has become a popular means of achieving scalability in application architectures in which read/write data separation is not only possible, but desirable to achieve new heights of concurrency. The premise is that by splitting up read and write duties, it is possible to get better overall performance at the cost of a slight delay in consistency.
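
The read/write split described above can be sketched as a small router. This is only a minimal sketch, assuming an in-memory primary and a single replica that applies writes after a lag; the names `ReadWriteRouter` and `replicate` are hypothetical and are not taken from the article, and plain dictionaries stand in for real databases.

```python
class ReadWriteRouter:
    """Toy read/write split: writes go to a primary, reads to a replica.

    All names here are hypothetical; dicts stand in for real databases.
    """

    def __init__(self):
        self.primary = {}      # receives every write
        self.replica = {}      # serves every read
        self._pending = []     # writes not yet copied to the replica

    def write(self, key, value):
        # Writes are applied to the primary immediately and queued for the replica.
        self.primary[key] = value
        self._pending.append((key, value))

    def read(self, key):
        # Reads never touch the primary, so they may return stale data.
        return self.replica.get(key)

    def replicate(self):
        # Simulates asynchronous replication catching up.
        for key, value in self._pending:
            self.replica[key] = value
        self._pending.clear()


router = ReadWriteRouter()
router.write("user:1", "alice")
stale = router.read("user:1")      # replica has not caught up yet
router.replicate()
fresh = router.read("user:1")      # replica now reflects the write
```

The gap between `stale` and `fresh` is the "slight delay in consistency" the teaser mentions: reads are cheap and isolated from write load, but they can lag behind the primary until replication runs.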

Redis is not only the fastest database, but it is the most popular among the new wave of databases running in containers. Redis speeds up just about every data interaction between your users or operational systems. In his session at 19th Cloud Expo, Dave Nielsen, Developer Advocate, Redis Labs, will share the functions and data structures used to solve everyday use cases that are driving Redis' popularity.

Before becoming a developer, I was in the high school band. I played several brass instruments – including French horn and cornet – as well as keyboards in the jazz stage band. A musician and a nerd, what can I say? I even dabbled in writing music for the band. Okay, mostly I wrote arrangements of pop music, so the band could keep the crowd entertained during Friday night football games.

SYS-CON Events announced today that Isomorphic Software will exhibit at DevOps Summit at 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Isomorphic Software provides the SmartClient HTML5/AJAX platform, the most advanced technology for building rich, cutting-edge enterprise web applications for desktop and mobile. SmartClient combines the productivity and performance of traditional desktop software with the simplicity and reach of the open web.

With an estimated 50 billion devices connected to the Internet by 2020, several industries will begin to expand their capabilities for retaining end point data at the edge to better utilize the range of data types and sheer volume of M2M data generated by the Internet of Things.

Machine Learning represents the new frontier in analytics, and is the answer to how many companies can capitalize on the data opportunity. Machine…

MapR Technologies, Inc., provider of the Converged Data Platform, announced the immediate availability of the MapR Risk Management Quick Start Solution for Financial Services powered by the MapR Converged Data Platform.

SYS-CON Events announced today the Enterprise IoT Bootcamp, being held November 1-2, 2016, in conjunction with 19th Cloud Expo at Things Expo at the Santa Clara Convention Center in Santa Clara, CA. Combined with real-world scenarios and use cases, the Enterprise IoT Bootcamp is based not just on presentations but also on hands-on demos and detailed walkthroughs. We will introduce you to a variety of real-world use cases prototyped using Arduino, Raspberry Pi, BeagleBone, Spark, and Intel Edison. You will get a thorough overview of enterprise cloud technologies such as AWS IoT, Azure IoT Suite, and IBM Watson IoT, which play an important role in IoT architecture. The immersive 2-day workshop will provide you with everything you wanted to know about the Internet of Things.

Much of IT terminology is often misused and misapplied. Modernization and transformation are two such terms. They are often used interchangeably even though they mean different things and have very different connotations. Indeed, it is somewhat safe to assume that in IT any transformative effort is likely to also have a modernizing effect, and thus we can see these as levels of improvement efforts. However, many businesses are being led to believe that if they don't transform now they risk becoming irrelevant, when they would be equally well-off simply modernizing existing IT services, saving millions over transformation.

News this week included autonomous vehicle vulnerabilities, a Google Fiber delay, updating pen testing, and combining augmented and virtual realities.

Fraud detection and prevention are critical, but the banking industry needs to see beyond fraud. By effectively using cybersecurity assessment tools, banking regulators and institutions along with the financial services industry need to intelligently adapt and be ever-vigilant against the rise in cyber threats against them and their customers.

How to balance data safety with innovative big data expansion was at issue at an MIT symposium where the chief data officer role was considered.

There's been a lot of discussion about how likely artificial intelligence applications are to destroy jobs, but one expert says the impact will be small and beneficial.

I'm Rocky DeStefano, Cloudera's Subject Matter Expert for cybersecurity, and for the majority of the last two decades I've explored in depth both the security vendor space and the operational side of information security for organizations in the Federal and enterprise space. My primary goal has been to detect malicious activity faster and with more accuracy, and ultimately to understand and communicate its impact on how the business operates. The experiences I've had led me to try to create solutions through technology and through leading security operations teams.

DockerCon sailed through Seattle recently, leaving behind in its wake a new swath of rapid adopters plus a trail of related company and product announcements. Docker itself produced perhaps the most exciting announcements of all with the launch of its DockerStore, a searchable marketplace for validated software and tools used in the Docker format, plus the launch of version 1.12 of its software, currently in public beta.

When building large, cloud-based applications that operate at a high scale, it's important to maintain high availability and resilience to failures. To do that, each part of your application must tolerate failures, including failures in other areas of the application. "Fly two mistakes high" is an old adage in the radio control airplane hobby.
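
One common way to tolerate a failing dependency is to retry a bounded number of times and then fall back to a degraded answer. The sketch below assumes that pattern; `call_with_fallback` and the flaky fake service are hypothetical names used for illustration, not something the article prescribes.

```python
def call_with_fallback(operation, fallback, attempts=3):
    """Try `operation` up to `attempts` times, then return `fallback()`.

    Hypothetical helper: a real system would also add timeouts and backoff.
    """
    for _ in range(attempts):
        try:
            return operation()
        except Exception:
            # The dependency failed; try again until attempts run out.
            continue
    return fallback()


def make_flaky(failures_before_success):
    """Build a fake dependency that fails a fixed number of times first."""
    state = {"calls": 0}

    def service():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise ConnectionError("dependency unavailable")
        return "live result"

    return service


# Recovers on the third attempt, so the caller sees the live answer.
recovered = call_with_fallback(make_flaky(2), lambda: "cached result")
# Never recovers within three attempts, so the caller gets the fallback.
degraded = call_with_fallback(make_flaky(5), lambda: "cached result")
```

In the spirit of the adage, the caller keeps enough margin to absorb more than one failure before it has to give a degraded answer.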

TPC Database Benchmarks. As most readers of data warehouse related blogs will no doubt know, the Transaction Processing Performance Council (TPC) defines database benchmarks to allow comparisons to be made between different technologies. Among these benchmarks is the TPC-H ad hoc decision support benchmark, which has been around since 1999, with the latest version being v2.17.1. This is the most relevant TPC benchmark for those of us focussed on data warehouse platforms.

A look at the most significant developments in cooling for data facilities.

Opinion: Ken Parnham, General Manager Europe at Near, discusses the business opportunities of location intelligence.

By Joseph Rickert. Last week in a webinar, Burtch Works, an Illinois-based executive recruiting firm that specializes in finding analytic talent, released the results of their third annual survey of…

This week is an insightful discussion with Claudia Perlich about some situations in machine learning where models can be built, perhaps by well-intentioned practitioners, to appear to be highly predictive despite being trained on random data. Our discussion covers some novel observations about ROC and AUC, as well as an informative discussion of leakage.
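
One way a model can look predictive on random data is selection leakage: picking the best of many noise features using the same rows that are then used to score it. The sketch below is an assumption about how to illustrate that effect, not necessarily the scenario covered in the episode; the sample sizes, the seed, and the helper name `auc` are all chosen for illustration.

```python
import random


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


rng = random.Random(0)

# 40 labelled rows and 200 candidate features that are pure noise.
train_labels = [i % 2 for i in range(40)]
features = [[rng.random() for _ in train_labels] for _ in range(200)]

# Leakage: the feature is chosen using the same rows that are then used to score it.
best = max(features, key=lambda f: auc(f, train_labels))
in_sample_auc = auc(best, train_labels)

# Honest check: the chosen feature is still noise, so fresh rows give fresh noise.
test_labels = [i % 2 for i in range(2000)]
fresh_scores = [rng.random() for _ in test_labels]
out_of_sample_auc = auc(fresh_scores, test_labels)

print(f"in-sample AUC:     {in_sample_auc:.2f}")
print(f"out-of-sample AUC: {out_of_sample_auc:.2f}")
```

The in-sample figure comes out well above 0.5 even though nothing was learned, while the score on fresh rows stays near chance, which is the gap a properly held-out evaluation is meant to expose.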

Salesforce announces that the database platform it gained when it acquired PredictionIO earlier this year is now an open source Apache project.

By Sean Robinson, Program Chair, Predictive Analytics World for Government. In anticipation of their upcoming conference co-presentation, Characteristics for Those Claiming Social Security Benefits Early, at Predictive Analytics World for Government, October 17-20, 2016, we asked Dae Park, Assistant Director at Government Accountability Office (GAO), and Vijay D’Souza, Director at Government Accountability Office (GAO), a few questions about their work in predictive analytics.

The Dean of the University of San Francisco School of Management, Elizabeth Davis, recently asked me to sit on a Big Data panel at the Direct Sales Association conference. I was given a 5-minute slot to "demystify" Big Data to a non-technical group of about 1,000 people; to help them understand where and how this thing called "Big Data" could help them.

Over the past 30+ years, businesses have spent billions on talent assessments. Many of these are now being used to understand job candidates. Increasingly, businesses are asking how (or if) a predictive talent acquisition strategy can include the use of pre-hire assessments? As costs of failed new hires continue to rise, recruiters and hiring… The post Are Pre-hire Talent Assessments Part of a Predictive Talent Acquisition Strategy? appeared first on Predictive Analytics Times.

In the latest reflection of cloud computing's impact on the IT services market, outsourcing consultancy Information Services Group (ISG) for the first time expanded its quarterly market index to look specifically at the as-a-service segment of IT and business process services industry.

In a survey, cloud security broker vendor CipherCloud found that 86 percent of cloud applications used at workplaces are unsanctioned. That's a pretty big percentage. Obviously, the security vendors have an incentive to raise such fears about shadow IT, so take this claim with much salt. However, the issue merits attention.

Tips by 4 DataHack Winners. Nalin Pasricha, DataHack Rank 1: Nalin is an investment banker turned data scientist who currently works as an independent consultant. He has participated in 17 hackathons at DataHack. He won Data Hackathon 3.x and emerged as the 1st Runner Up in Black Friday DataHack. Here's what Nalin has to say: Our mind works subconsciously at night on our problems in a very powerful manner.

Unlike traditional application programming, where API functions are changing every day, database programming basically remains the same. The first version of Microsoft Visual Studio .NET was released in February 2002, with a new version released about every two years, not including Service Pack releases. This rapid pace of change forces IT personnel to evaluate their corporation's applications every couple of years, leaving the functionality of their applications intact but with completely different source code in order to stay current with the latest techniques and technology.

In my recent blog, Marrying Kalman Filtering & Machine Learning, we saw the merger of Bayesian exact recursive estimation (the algorithm for which is the Kalman Filter/Smoother in the linear, Gaussian case) and Machine Learning. We developed a solution called Kernel Projection Kalman Filter for business applications that require static or dynamical, dynamical or time-varying dynamical, linear or non-linear Machine Learning, i.e., pretty much all applications – therefore, Kernel Projection Kalman Filter is a "universal" solution . . .
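
For readers unfamiliar with the linear, Gaussian case mentioned above, the recursion can be sketched in one dimension. This is the textbook scalar Kalman filter for a random-walk state, not the author's Kernel Projection Kalman Filter; the function name `kalman_1d`, the noise variances, and the simulated data are all assumptions made for illustration.

```python
import random


def kalman_1d(measurements, q=1e-4, r=1.0, x0=0.0, p0=1000.0):
    """Scalar Kalman filter for a random-walk state observed with noise.

    q: process-noise variance, r: measurement-noise variance,
    x0, p0: prior mean and variance. Returns the filtered estimates.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state model is x_k = x_{k-1} + noise, so only the variance grows.
        p = p + q
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates


# Simulated example: a constant true value observed through unit-variance noise.
rng = random.Random(1)
truth = 5.0
noisy = [truth + rng.gauss(0.0, 1.0) for _ in range(200)]
estimates = kalman_1d(noisy)
```

Each step is an exact Bayesian update of a Gaussian belief: the gain `k` weights the new measurement by how uncertain the current estimate is relative to the measurement noise.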
