Big Data News – 05 Sep 2016

Today's Infographic Link: The reality of violence against women

Featured Article
It’s a shocking fact that in the 21st century more than three quarters of a billion people do not have access to enough food to keep themselves healthy, while 30% of the food produced around the world goes to waste.

Top Stories
Here's a look at enterprise plans for implementation of NoSQL databases, and some of the top vendors providing this technology.

We now know that 68 million Dropbox passwords and user names were stolen in 2012. Yes, that's right. Four years ago.

IBM Edge tends to embody much of what makes IBM a company to admire.

NooBaa, a software-defined-storage (SDS) company, announced the availability of its frictionless storage solution. NooBaa software allows enterprises with storage capacity challenges to create a secure, high-performance storage solution across any compute resource available.

How will the world function once it has access to a global, interconnected computing environment that touches every device on the planet?

Shifting to cloud application providers requires companies to step carefully. Here are 9 potential problems that can prevent a smooth transition.

The Data Lab MSc is a fully funded challenge-based learning programme, now entering its second year. It aims to tackle the skills gap in data science and provide students with an education geared towards meeting the needs of industry so they learn the skills that organisations actually require. 

The trailer for a big budget movie about the fate of an artificially created human was made by IBM's AI platform Watson — or so IBM's Mr John Smith tells us…

Back in September 2013, Brian Krebs wrote about Social Security fraud at the My Social Security web portal: The SSA and financial institutions say they are tracking a rise in cases wherein identity thieves register an account at the SSA's portal using a retiree's personal information and have that retiree's benefits diverted to prepaid debit cards that the crooks control … The trouble really began earlier this year, when the Treasury started requiring that almost all beneficiaries receive payments through direct deposit …

Containers are a technology that allows you to stuff more compute workloads onto a single server, giving you the ability to upscale capacity for new compute jobs in a tiny fraction of a second, and Docker is one of the premier open source solutions that have emerged to accommodate containers. Theoretically, Docker containers mean less hardware to purchase, as well as less staff to manage the data center. At first glance, the technology behind containerization sounds a lot like virtual machines, but the two are quite different.

Hewlett Packard Enterprise (NYSE: HPE) announced the availability of HPE Haven OnDemand (HoD) Combinations, a new cloud-based offering built on the HPE Haven OnDemand platform that enables developers to apply the power of machine learning to build next generation applications.

Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week. Featured Resources and Technical Contributions: Analyzing International Currency Network with DeDaL +; Comparing classification algorithms: pluses and minuses; 80+ Free Data Science Books; 25 Java Machine Learning Tools & Libraries; Essentials of Machine Learning Algorithms (with Python and R Codes); 8 Great Blogs Posted in the last 12 Months; What Can Baby Names Tell Us About Our Narcissism Epidemic?

DrivenBI SRK, the cloud-native true self-service business intelligence (BI) tool to remove IT complexity for business professionals, introduces its Collaboration Hub.

Guest blog post by Larry Alton. Big data is seeping into every facet of our lives. Smart home gadgets are becoming part of the nervous systems of new and remodeled homes, and many renters are demanding these interconnected gadgets from landlords. But nowhere has Big Data created a bigger buzz than in business. Companies of all sizes are collecting data at a seemingly insurmountable rate. Big data is larger than ever before. We've collected more data in the past two years than in the entire history of the human race.

Guest blog post by Vimal Natarajan. Introduction: The City and County of San Francisco launched an official open data portal called SF OpenData in 2009 as a product of its official open data program, DataSF. The portal contains hundreds of city datasets for use by developers, analysts, residents and more. Under the category of Public Safety, the portal contains the list of SFPD Incidents since Jan 1, 2003. In my previous post I performed an exploratory time-series analysis on the crime incidents data to identify any patterns.

An archive of all O'Reilly data ebooks is available below for free download. Dive deep into the latest in data science and big data, compiled by O'Reilly editors, authors, and Strata speakers: There are several selections starting from 2012 Ebooks to 2016 Ebooks. 

What are the advantages of different classification algorithms? For instance, if we have a large training data set with more than roughly 10,000 instances and more than 100,000 features, which classifier is the best choice? Xavier Amatriain, PhD in CS, former professor and coder, has answered the question: There are a number of dimensions you can look at to give you a sense of what will be a reasonable algorithm to start with, namely: the number of training examples; the dimensionality of the feature space; do I expect the problem to be linearly separable?

Kimera Systems Inc. announced that its Nigel™ artificial general intelligence (AGI) technology has become a commercially deployable artificial intelligence technology that observes user behavior, comprehends context, and derives a common-sense set of actions to apply under specific circumstances.

You have probably heard people talking about the Cloud, but unless you use it yourself you may be unsure why it matters. Cloud services are the catch-all term for outsourced online hosting and storage services, operated through third party software that you manage yourself or employ an agent to look after. Instead of maintaining expensive servers and drives yourself (along with the specialist staff to manage them) you can instead subscribe to a Cloud service and let their secure servers handle your data and web services.

In this special guest feature, Guy Yehiav, CEO at Profitect, discusses how prescriptive analytics holds the keys to efficiency with the least amount of risk and the fastest time to value.

Our friends over at Boston University's Master of Science in Computer Information Systems Online Program put together the compelling infographic below that highlights why Business Intelligence (BI) is the key to competitive advantage for enterprises today.

For those of you who haven't heard of Facebook's live video, check out the live map. Essentially, Facebook enables a live stream for all users, both private individuals and businesses. The live video opportunity: since this is a (relatively) new feature on Facebook, and one that is doing well, Facebook is pushing hard for it.

In this digital age, superheroes are becoming more popular… Iron Man, The Hulk, Thor, Captain America, Avengers, Superman, Batman, Spider-Man, and many more… There are a lot of superheroes and it is…

How well do your employees perform? If you're in the widget-making business, that's a relatively easy question to answer. But in today's highly diversified and increasingly digital economy, assessing the performance of your current or prospective labor pool increasingly requires the assistance of big data collection and analytics. Since the dawn of free market capitalism, employers have struggled to find and retain the best employees.

The IBM Virtual Finance Forum 2016 is designed to help finance organizations apply analytics in management decision making to anticipate and shape business outcomes more effectively. With online sessions and keynotes, finance professionals will learn how IBM helps their organizations apply cognitive analytics to improve financial and operational performance management.

DevOps is a term that comes full of controversy. A lot of people are on the bandwagon, while others are waiting for the term to jump the shark, and eventually go back to business as usual. Regardless of where you are along the spectrum of loving or hating the term DevOps, one thing is certain. More and more people are using it to describe a system administrator who uses scripts, or tools like Chef, Puppet or Ansible, in order to provision infrastructure. There is also usually an expectation of being able to deliver this in 100% cloud, or hybrid cloud environments.

In this article, we have started to provide answers to one of the largest collections of data science job interview questions ever published, and we will continue to add answers to most of these questions. Some answers link to solutions offered in my Wiley data science book: you can find this book here. The 91 job interview questions were originally published here with no answers, and we recently added 50 questions to identify a true data scientist, in this article.

Guest blog post by Klodian. The original post is published at DataScience+. Recently, I became interested in extracting data from web pages, such as Wikipedia, and visualizing it with R. As in my previous post, I use the rvest package to get the data from the web page and the ggplot package to visualize it. In this post, I will map the life expectancy of White and African-American people in the US. Load the required packages.

The latest in a string of hacks against retail point-of-sale systems has hit the operator of a cloud-based service with about 38,000 business clients. Montreal-based Lightspeed reported the breach on Thursday and said it affected a system that retailers can use from tablets, smartphones and other devices. The incident occurs as a growing number of retailers and hotels have been targeted by hackers, who typically install malware into the point-of-sale systems to steal credit card numbers.

The Bureau of Labor Statistics' middling monthly employment report in advance of the U.S. Labor Day observance again reflects the impact of technology and globalization on the American workforce and the resulting demand dichotomy that places the greatest emphasis on low-paying service jobs at one end of the employment spectrum and relatively high-paying technical jobs like database administrators on the other.

Hazelcast, a leading open source in-memory data grid with 500,000 installed nodes and over 16 million node starts per month, announced the general availability of Hazelcast 3.7.

While the broad outlines of the cloud are coming into view, the exact architecture and the host's location are still very much "up in the air."

Services like Google Analytics are great for amassing key data to help you make the most of your web efforts, but zeroing in on the parts that matter most can be a time-consuming challenge. On Friday, Google added a new feature to its analytics service that taps AI to surface insights automatically. Now available in the Assistant screen in the Google Analytics mobile app, the new automated insights feature "lets you see in 5 minutes what might have taken hours to discover previously," wrote Ajay Nainani, product manager for Google Analytics, in a blog post.

Apache Hadoop is a key technology for gaining business insights from your Big Data, but the penetration into enterprises is shockingly low. In fact, Apache Hadoop and Big Data proponents recognize that this technology has not yet achieved its game-changing business potential. In his session at 19th Cloud Expo, John Mertic, director of program management for ODPi at The Linux Foundation, will explain why this is, how we can work together as an open data community to increase adoption, and the importance of open source-based Big Data standardization to help unlock more business value for Apache Hadoop initiatives.

For about three years now, telemetry has been gathered for professional basketball games in the US by SportVU for the NBA. Six cameras track the on-court position of the players and the ball, with a resolution of 25 samples per second. Combine this movement data with NBA play-by-play data (players, plays, fouls, and points scored — data sadly no longer made available by the NBA), and you have a rich data set for analysis.
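To give a sense of scale, the 25 samples per second quoted above can be turned into a rough per-game count. This is back-of-the-envelope arithmetic only: the 48-minute regulation game clock (no overtime, no stoppages) and the figure of ten players plus the ball being tracked at once are assumptions made for illustration, not numbers taken from the article.

```python
SAMPLES_PER_SECOND = 25   # sampling rate quoted in the text
REGULATION_MINUTES = 48   # assumption: four 12-minute quarters, no overtime
TRACKED_OBJECTS = 10 + 1  # assumption: ten players on court plus the ball

# Number of sampled frames over the regulation game clock.
frames_per_game = SAMPLES_PER_SECOND * REGULATION_MINUTES * 60

# One position per tracked object per frame.
positions_per_game = frames_per_game * TRACKED_OBJECTS

print(frames_per_game)     # 72000
print(positions_per_game)  # 792000
```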

Michael Wallman is a senior consultant with AWS ProServ. It's easy to understand network patterns in small AWS deployments where software stacks are well defined and managed. But as teams and usage grow, it gets harder to understand which systems communicate with each other, and on what ports. This often results in overly permissive security groups. In this post, I show you how to gain valuable insight into your network by using Amazon EMR and Amazon VPC Flow Logs. The walkthrough implements a pattern often found in network equipment called 'Top Talkers', an ordered list of the heaviest network users, but the model can also be used for many other types of network analysis.
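The 'Top Talkers' idea can be sketched in a few lines of plain Python. This is not the EMR-based walkthrough from the post, just a minimal single-machine illustration: it assumes records in the default space-separated VPC Flow Logs format, ranks source addresses by total bytes sent, and uses made-up sample records. The function name `top_talkers` and the `SAMPLE` data are hypothetical.

```python
from collections import Counter

# Field order of the default VPC Flow Logs record format (assumed here).
FIELDS = ("version account_id interface_id srcaddr dstaddr srcport dstport "
          "protocol packets bytes start end action log_status").split()

def top_talkers(lines, n=3):
    """Return the n source addresses that sent the most bytes, heaviest first."""
    totals = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != len(FIELDS):
            continue  # skip lines that do not match the expected format
        record = dict(zip(FIELDS, parts))
        if not record["bytes"].isdigit():
            continue  # NODATA/SKIPDATA records carry "-" instead of a count
        totals[record["srcaddr"]] += int(record["bytes"])
    return totals.most_common(n)

# Made-up records for illustration only.
SAMPLE = [
    "2 123456789012 eni-0a1b2c3d 10.0.0.5 10.0.1.9 49152 443 6 10 8400 1472000000 1472000060 ACCEPT OK",
    "2 123456789012 eni-0a1b2c3d 10.0.0.7 10.0.1.9 49153 22 6 4 1200 1472000000 1472000060 REJECT OK",
    "2 123456789012 eni-0a1b2c3d 10.0.0.5 10.0.1.8 49154 80 6 2 600 1472000000 1472000060 ACCEPT OK",
    "2 123456789012 eni-0a1b2c3d 10.0.0.9 10.0.1.9 49155 443 6 7 5000 1472000000 1472000060 ACCEPT OK",
    "2 123456789012 eni-0a1b2c3d - - - - - - - 1472000000 1472000060 - NODATA",
]

for address, total_bytes in top_talkers(SAMPLE):
    print(address, total_bytes)
```

Grouping by `dstaddr` or by `(srcaddr, dstport)` instead of `srcaddr` would give other views of the same data, such as which ports are actually in use between systems.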

By the year 2030, artificial intelligence (A.I.) will have changed the way we travel to work and to parties, how we take care of our health and how our kids are educated. That's the consensus from a panel of academic and technology experts taking part in Stanford University's One Hundred Year Study on Artificial Intelligence. Focused on trying to foresee the advances coming to A.I., as well as the ethical challenges they'll bring, the panel yesterday released its first study. The 28,000-word report, "Artificial Intelligence and Life in 2030," looks at eight categories — from employment to healthcare, security, entertainment, education, service robots, transportation and poor communities — and tries to predict how smart technologies will affect urban life.

HudsonAlpha Institute for Biotechnology leverages modern IT infrastructure and big-data analytics to power a pioneering research project incubator and genomic medicine innovator.

5G NR and court cases are in the news this week.

Bad presentations: we've all seen them. Overrun with bullet points full of industry-speak. A hodge-podge of extraneous content. No structure, and certainly no focus. They are the very thing everyone loathes. No matter how hard you try, your audience won't follow through with your proposal. And worse, reputations are tarnished, sometimes irreparably damaged…

Our guest today is Michael Cuthbert, an associate professor of music at MIT and principal investigator of the Music21 project, which we focus our discussion on today. Music21 is a Python library that makes analysis of music accessible and fun. It supports integration with popular formats such as MIDI, MusicXML, LilyPond, and others. It's also well integrated with The Elvis Project, enabling users to import large volumes of music for easy analysis.

This vendor-written tech primer has been edited by Network World to eliminate product promotion, but readers should note it will likely favor the submitter's approach. Most CISOs receive a rude awakening when they encounter their first major security issue in the cloud.

Charles and Miranda first met in art school in 1979. Over time, they realized a shared passion for handwork and the elegance of handmade objects for the home. Today, Charles Shackleton Furniture and…

This week, however, Asteria started taking pre-orders for a device that flips the AI paradigm on its head.

At the VMworld 2016 conference this week, it became more apparent just how comprehensively VMware has changed its approach to desktop computing.
