Big Data News – 23 Feb 2017

Today’s Infographic Link: World Octopus Day

Featured Article
NASA’s Lessons Learned database is a vast, constantly updated collection of knowledge and experience from past missions, which the agency relies on when planning future projects and expeditions into space.

Top Stories
If self-service business intelligence initiatives are on your agenda, follow these 10 best practices for ensuring proper governance.

InsideSales is one of an increasing number of players that aim to offer sales teams tools to make them more efficient and effective. In its case, InsideSales came about through the post-graduate research of co-founder Dave Elkington. As Elkington studied artificial intelligence, he soon came to realize that A.I. has existed for decades: the math behind A.I. was used back in the mid-20th century by researchers at companies such as IBM. What is different today, and what gives disruptive companies the power to undermine their more conservative competitors, is access to data. As Elkington sees it, Netflix's ability to put Blockbuster out of business was a direct result of Netflix's intentional strategy to amass information about its customers and, in doing so, to give its own predictive algorithms the best possible source data to tune its suggestions.

The big data pipeline is getting more crowded. Learn how to improve your company's big data throughput when going over the public internet.

With its Ryzen launch, AMD is avoiding making a strategic mistake it has made several times in the past, says analyst Rob Enderle.

If you're an Excel user (or a user of any other spreadsheet, really), adapting to learn R can be hard. As this blog post by Gordon Shotwell explains, one of the reasons is that simple things can be harder to do in R than in Excel. But it's worth persevering, because complex things can be easier. While Excel (ahem) excels at things like arithmetic and tabulations (and some complex things too, like presentation), R's programmatic focus introduces concepts like data structures, iteration, and functions. Once you've made the investment in learning R, these abstractions make it possible to reduce complex tasks to discrete steps, and make automating repeated similar tasks much easier.

Apple this week took administrative control of the domain, the last notable web address it did not govern that users could have linked with its online sync and storage service. According to WHOIS searches today, Apple acquired control of the domain on Tuesday. Apple already ruled the primary top-level domains for iCloud, the cross-device, cross-OS service that stores files generated by iOS and macOS and, more importantly, synchronizes everything from Safari browser bookmarks to photographs between iPhones, iPads and Macs.

IoT networks are unique: They will be worldwide, required to function where no established network exists, and have stringent power requirements.

Amazon Web Services is the consensus leader of the IaaS public cloud computing market according to industry watchers, but they credit Microsoft for closing the gap with Azure and say Google with its Cloud Platform has made considerable strides as well.

Prescriptive analytics (optimization) is a sophisticated analytics technology. It can deliver great business value by helping decision makers handle the tough trade-offs that arise when limited resources force choices among options. Optimization was traditionally applied by operations research professionals to solve operational problems, such as route optimization and logistics planning. With the advent of new technologies that make it possible to model larger, enterprise-wide problems and provide broad support for what-if analyses, prescriptive analytics now enables a new class of business analytics applications.

We will see shifts in ransomware, including how attackers target victims, who they'll target, and the role IoT will play in ransomware.

Using IBM Counter Fraud Management (CFM), an insurer can improve the operational effectiveness of its fraud prevention program and drive impressive fraud savings. IBM Watson helps insurers detect, respond to and stop fraud with the ability to tap unstructured data.

In today's digital transformation, achieving the desired user outcome is now the driver of technology decisions, rather than the other way around.

Did political analytics firm Cambridge Analytica exaggerate the role analytics played in helping Donald Trump win? Crowdskout CEO Zack Christenson separates election tech facts from fiction.

Governing bodies are pushing for an open standard to make it simpler for e-signatures to move through a process spanning apps from multiple vendors.

What are the limits of AI? And how do you go from managing data points to injecting AI in the enterprise?

Alex Sakaguchi, director, solutions marketing, Veritas: Company is committed to extend reach of software across as many relevant clouds as possible.

Before CDH 5.10, every CDH cluster had to have its own Apache Hive Metastore (HMS) backend database. This model is ideal for clusters where each cluster contains the data locally along with the metadata. In the cloud, however, many CDH clusters run directly on a shared object store (like Amazon S3), making it possible for the data to live across multiple clusters and beyond any cluster's lifespan. In this scenario clusters need to regenerate and coordinate metadata for the underlying shared data individually.

Mandy Chessell, Distinguished Engineer & Master Inventor, discusses four key perspectives on Data Lakes and introduces a new video series.

The Informatica Axon offering is the first time Informatica will provide apps specifically for IT pros asked to be the stewards of data governance.

Google recently added support for the NVIDIA Tesla K80 GPU in the Google Compute Engine and Cloud Machine Learning to improve processing power for deep learning tasks.

Many NoSQL databases use procedural, implementation-specific structures expressed in a JSON format to represent their data models. The Ecma International standards body standardized JavaScript, a language created to handle tasks in the browser. JavaScript Object Notation (JSON), a lightweight format for interchanging data over the Internet, is derived from JavaScript's object syntax. The downside of JSON is that it lacks the capabilities to provide referential integrity. These data models are neither interoperable nor standardized, which means no data portability. JSON also provides no way to resolve ambiguity in the namespace in which your data is defined, or to declare its structure and data types.
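To make the referential-integrity point concrete, here is a minimal sketch in Python. The documents, field names, and the `dangling_references` helper are all hypothetical, invented for illustration. It shows only that a JSON parser accepts a reference to a record that does not exist, so any such check has to live in application code.

```python
import json

# Hypothetical documents: the order points at customer id 2, which does not exist.
customers = json.loads('[{"id": 1, "name": "Ada"}]')
orders = json.loads('[{"id": 100, "customer_id": 2}]')


def dangling_references(orders, customers):
    """Return the orders whose customer_id matches no customer document."""
    known_ids = {customer["id"] for customer in customers}
    return [order for order in orders if order["customer_id"] not in known_ids]


# Both documents parsed without complaint; the broken link only shows up
# because this application-level check goes looking for it.
print(dangling_references(orders, customers))
```

A relational database with a foreign-key constraint would reject the order at insert time; with plain JSON documents, that guarantee exists only if the application or the database layer on top adds it.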

Chief Data Officers, in particular, will want to take note of Generation Z as its members begin to grow up, because many of their attitudes and behaviors toward data are shifting from those of previous generations. Those who prepare for Gen Z early and build a relationship with them based on good data practices may find themselves in an optimal position as this new group's influence and purchasing power increase.

It's hard to find a company that does not have some form of a hybrid (cloud and on-premise) ERP system. For most, that happened by accident. Someone in the organization bypassed IT and bought a cloud service to fill a need more quickly than they could with an on-premise solution., for example, has often been the start of a company's march to a hybrid environment. Cloud applications can be relatively easy, low-cost solutions, but they do introduce new complexities when they need to be integrated with on-premise ERP systems and databases, or with each other. Ensuring that cloud and on-premise systems play nice together is just one part of the hybrid challenge. Making the right decisions about what will be in the cloud and what stays in-house is the other.

In anticipation of her upcoming conference presentation, Redefining Analytics for Marketing, at Predictive Analytics World San Francisco, May 14-18, 2017, we asked Jennifer Bertero, VP, Business Analytics at CA Technologies, a few questions about her work in predictive analytics. Q: In your work with predictive analytics, what behavior or outcome do your models predict?

It seems those interested in a career in cybersecurity should have options. But problems in education, job listings and realistic expectations abound.

Bloomberg has created an open source way to use machine-learning models to weight searches, then add their values to the Solr search engine.

Machine learning can automate the handling of huge troves of data to help companies make and save money. However, it is not without pitfalls, as the real estate tech company Redfin learned. As Redfin began building its own machine-learning capabilities, it ran into a problem: Employees weren't using them. Bridget Frey, the firm's CTO, said in an interview that there was a key reason for that: At first, Redfin didn't leave room in these systems for the real estate agents who were supposed to use them to make modifications. For example, a Listings Matchmaker feature generated a list of personalized recommendations for home buyers, based on their interests. In its initial iteration, agents weren't able to add recommendations they thought would be useful.

I'm a big fan of action movies. In particular I like the Mafia-style genre which often has a big, aggressive antagonist playing tough over a weaker opponent. You know the storyline: Unsuspecting individual gets trapped into an ever-escalating situation where the odds just keep getting worse. Antagonist takes advantage of said individual to keep tightening the screws and making more and more difficult demands. I was thinking about this sort of storyline the other day when I heard about some new licensing policies that Oracle — the technology industry's best analog for Al Capone — had announced. Specifically, the licensing applies to those Oracle customers running databases on Amazon Web Services (AWS) or Microsoft Azure.

Google's Go language was recently chosen as Tiobe's programming language of 2016, based on its rapid growth in popularity over the year, more than twice that of runners-up Dart and Perl. Tiobe's language index is based on the "number of skilled engineers worldwide, courses, and third-party vendors," using the results of multiple search engines.

Microsoft has launched Project Sangam, a cloud service integrated with LinkedIn that will help train and generate employment for middle- and low-skilled workers. The professional network, which Microsoft acquired in December, has generally been associated with educated urban professionals, but the company is now planning to extend its reach to semi-skilled people in India. Having connected white-collar professionals around the world with the right job opportunities and training through LinkedIn Learning, the platform is now developing a new set of products that extends this service to low- and semi-skilled workers, said Microsoft CEO Satya Nadella at an event on digital transformation in Mumbai on Wednesday.

Cybersecurity Ventures recently announced their Q3 2016 Cybersecurity 500, a directory of the hottest cybersecurity companies to watch this year.

Radiohead is known for having some fairly maudlin songs, but of all of their tracks, which is the most depressing? Data scientist and R enthusiast Charlie Thompson ranked all of their tracks according to a "gloom index", and created a chart of gloominess for each of the band's nine studio albums. An interactive version, created with the highcharter package for R, allows you to explore individual tracks.

Drones and balloons can provide backup service in case of disaster, supplement service at high-use times, and offer service economically to rural areas.

Worldwide spending on public cloud services and infrastructure will reach $122.5 billion in 2017, representing an increase of 24.4% over 2016. As enterprises across the world increase investment, overall public cloud spending is expected to surge 21.5% by 2020, nearly seven times the rate of overall IT spending growth. By 2020, IDC research forecasts public cloud spending will reach $203.4 billion worldwide. Software as a service (SaaS) will remain the dominant cloud computing type, capturing nearly two-thirds of all public cloud spending in 2017 and roughly 60% in 2020. According to IDC, SaaS spending, which comprises applications and system infrastructure software (SIS), will, in turn, be dominated by applications purchases, which will make up more than half of all public cloud spending throughout the forecast period.
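For readers who want to check the arithmetic, the short Python sketch below derives two numbers that the summary implies but does not state: the 2016 spending level behind the 24.4% increase, and the average annual growth needed to get from the 2017 figure to the 2020 forecast. It uses only the figures quoted above; the variable names are illustrative, and the derived rate covers 2017 to 2020 only, so it need not match the 21.5% figure, which may refer to a different period.

```python
# Figures quoted in the summary above, in billions of US dollars.
spend_2017 = 122.5
spend_2020 = 203.4
growth_2016_to_2017 = 0.244

# Back out the 2016 level from the stated year-over-year increase.
implied_2016 = spend_2017 / (1 + growth_2016_to_2017)

# Compound annual growth rate needed to go from the 2017 to the 2020 figure.
years = 2020 - 2017
implied_cagr = (spend_2020 / spend_2017) ** (1 / years) - 1

print(f"Implied 2016 spending: ${implied_2016:.1f} billion")
print(f"Implied 2017-2020 annual growth: {implied_cagr:.1%}")
```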

Development of data infrastructure on cloudy, hyperconverged lines produces an environment that is scalable with reduced operational responsibilities.

The 5G standard, which promises to close the gap between wireless and wired connections, will be a huge step forward. Carriers weigh in on their plans.

US companies that don't have a presence in Europe still have to be sure that they comply with the EU's privacy laws regarding personally identifiable data.

Intel aims to accelerate IoT adoption by making it affordable to deploy a turnkey endpoint with built-in sensor and networking technologies.

Additions to the Cisco Digital Network Architecture (DNA) are the latest in a Cisco effort to unify management of networks.

As consumers and employees, we tend to be haughty about our security IQ. Nearly two-thirds say they don't believe they were a victim of a cyberattack.

CEO explains the advantages and potential pitfalls of DevOps and how the company used big data and AI to create the Apocalypse Index, a real-time chart forecasting the end of the world.

A number of vendors are trying to position themselves as the low-code app development platform of choice. Appian is one leader.

The big data pioneer on how to use machine learning to mine gold from your company's data stores.

Talent Analytics, Corp. has a unique approach to workforce predictive analytics. At our firm, we measure success by how our projects quantifiably benefit the Line of Business. We watch it, track it, and report success. Our algorithms get better and smarter using the best Data Science methods available. I've been involved in the predictive workforce…

Manuel Martin Marquez, lead data scientist at CERN, on how the research lab is using machine learning to mine value from the huge amounts of data it generates.
