Big Data News – 04 Aug 2016

Featured Article
How thirsty is your Apple Music collection? How much water gets used up when you send an email through iCloud? Not a great deal, but the truth is that the data centers that drive services like these are incredibly thirsty creatures. Drop by drop: every online photo, Apple Maps request, Siri interrogation, FaceTime chat, app download and iMessage exchange uses drops of water. In most cases the servers enabling these Apple services are kept cool by pumping water through the systems.

Top Stories
Organizations are finding that hiring qualified Data Scientists is a real challenge. Experienced Data Scientists are expensive and are usually employed elsewhere. This high demand, low supply economics is leading to a situation of the 'haves' versus the 'have-nots', where the larger, financially rich organizations in the 'sexy' industries are most capable of attracting and…

Machine learning has steadily improved the way we interact with the internet. One example is the spam filter, which separates unwanted emails from the rest of the messages received. However, how these filters actually work is an intriguing question, since segregating mail by sender ID isn't a plausible option. Moreover, most spam is sent from legitimate email IDs that have been hacked by a third-party phishing operation.
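The teaser does not say which technique the article covers. As a rough illustration of filtering on message content rather than sender ID, here is a minimal sketch of one common approach, a naive Bayes classifier with Laplace smoothing. The function names and the four training messages are hypothetical and exist only for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Very crude tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def train(messages):
    """Count words per class from (text, label) pairs; label is 'spam' or 'ham'."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, doc_counts


def classify(text, word_counts, doc_counts):
    """Return the label with the higher log-probability score."""
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        # Class prior (assumes each class has at least one training message).
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            # Laplace (add-one) smoothing keeps unseen words from zeroing the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy training data, for illustration only.
TRAINING = [
    ("win a free prize now", "spam"),
    ("claim your free money", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]
model = train(TRAINING)
```

With this toy data, `classify("free prize money", *model)` returns `"spam"` and `classify("monday team meeting", *model)` returns `"ham"`. A production filter would train on far more mail and typically combine content signals with other features.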

Zoho has revamped its namesake CRM application to embed support for email, social media, live chat, and phone communications.

IT and business executives want to simplify their company's data pipeline, but many execs grow frustrated with the data gathering and analysis process.

How does the Elite 100 come together, and how do the editors choose the winners? Editorial Director Susan Nunziata spoke to Senior Editor Sara Peters about the list and what goes into selecting and ranking so many deserving recipients.

Research from the World Travel and Tourism Council shows big data is transforming the global tourism industry. According to the WTTC whitepaper, some of the changes are beneficial to the industry, while others are discouraging some people from traveling.

In this special guest feature, Mike Hoskins, CTO at Actian, offers some hard-won tips on how to avoid the immaturity "cost iceberg" that hampers Hadoop adoption today.

Cloudera, the global provider of the data management and analytics platform built on Apache Hadoop and the latest open source technologies, and Centrify, a leader in securing enterprise identities, announced that Centrify has joined the Open Network Insight (ONI) project.

Brenda Stallings turns the typical tech entrepreneur story on its head.

The Startup Summit brings together startups, world-leading entrepreneurs and some of the biggest companies on the planet, all under one roof. Past Startup Summits have seen companies like eBay, Google, Skyscanner and FanDuel in attendance, and the 5th annual event is shaping up to be the biggest and best yet. Described as one of the UK's most exciting startup events for entrepreneurs, the Startup Summit brings together world-leading entrepreneurs from across the globe to share ideas and inspiration and to advise you on how to start and grow your business.

Data scientists are the new corporate rock stars. Companies in all industries know they need a strong bench of data science talent to have any hope of becoming the next Uber or Netflix. With more demand than supply, here's what you need to know about recruiting the best and brightest.

We are in the midst of a technological shift. The number of devices, the volume of data from those devices, and the security threats both to those devices and to the networks that connect them have all increased so sharply that agency CIOs and CISOs are suffering from overload. Despite the fact that they have tools in place to help them address this, they are still struggling. Here's why: Rule-based systems, while necessary, are not sufficient to handle the volume of data or deal with the unknown unknowns. Statistical computing at scale must be leveraged to decrease overall risk. The "at scale" part is very important.

Given the pace at which big data software is released, coupled with the sheer volume of data under management, the big data market is ripe for massive security breaches. It's only a matter of time. In fact, as a Gartner survey last year uncovered, very few companies have taken security seriously for essential infrastructure like Hadoop. At that time, a mere 2 percent of respondents cited Hadoop security as a significant concern, causing Gartner analyst Merv Adrian to exclaim, "The nearly non-existent response to the security issue is shocking."

News: IoT connections will be more secure but Cortana becomes more hungry for data.

Since the oil revolution, the survival of any business has called for informed and well-conceived judgments on how to take advantage of every asset inside and outside the corporation. Gradually, the old-fashioned approach to executive decision-making has been replaced by a well-rounded method of analyzing existing or newly produced data to generate useful, actionable insights that improve business opportunities.

Ten months after Dropbox first unveiled Paper, the collaborative writing tool entered open beta on Wednesday and is getting mobile versions for iOS and Android. Paper allows teams to work on documents together in the cloud. It makes it easy to add text, images, and embedded videos from YouTube, Google, or Dropbox itself. Users can also add programming code, which gets formatted automatically. And they can create to-do lists and assign tasks on those lists using the @ symbol. Since its debut in private beta, Paper has been used to create more than a million documents for tasks like brainstorming ideas and capturing meeting notes, Dropbox said. Based on lessons learned along the way, Dropbox has improved the software with better tables and image galleries, more powerful search, and notifications via desktop and mobile.

Gartner's first-ever Magic Quadrant for Strategic Corporate Performance Management Solutions positions IBM as a leader in SCPM, placing it farther along the completeness of vision axis than any other organization in the Leaders quadrant. Find out why by exploring the capabilities of IBM's on-premises Cognos TM1 solution and IBM Planning Analytics, its cloud-based counterpart.

Innovations in design thinking motivated by cognitive era advances and enhanced solutions served as the stimulus for enhancing IBM Incentive Compensation Management Version 10 usability and its user experience. Take a look at seven noteworthy highlights inspired by user input.

Clouds are typically classified into four different topologies: private, public, community and hybrid clouds. Let's focus on hybrid clouds: what are the use cases that require this topology, what are the challenges, and who stands to benefit from the hybrid cloud trend? First off, a definition: A hybrid cloud is a topology in which more than one cloud infrastructure is utilized to solve a particular problem.

Despite growing worries among some U.S. workers about the deployment of automation and machine learning, a new survey finds that U.S. technology companies also plan to boost workforce hiring over the next three years. The survey of U.S. tech CEOs released by the consulting firm KPMG found that workforce hiring would likely grow in parallel with "digital labor" through the end of the decade. KPMG reported that hiring at technology companies is expected to jump 6 percent over the next three years.

Drones will play a big role in corporate and civic life. Now, government and industry are working through the technical and regulatory issues.

Failure to invest in these key app development items upfront could result in a substantial amount of lost time and money down the road.

A predictive big data analytics algorithm using a variety of demographic and clinical data points may be helpful for identifying patients at high risk of hospitalization or ED use. Reducing unnecessary emergency department utilization and avoidable hospital admissions is a top priority for many healthcare systems, especially those seeking to cut costs and eliminate waste… The post Predictive Big Data Analytics Identify High-Risk ED Patients appeared first on Predictive Analytics Times.

Advanced analytics is an integral part of Tableau's mission to help people see and understand their data. Read this paper to learn how an intuitive interface, powerful back end, and statistics integration can provide a strong base for any advanced analytics infrastructure. See the specific steps of how to use Tableau in analytics projects with these advanced capabilities: segmentation and cohort analysis, what-if scenario analysis, advanced calculations, time series analysis, predictive analysis, and R integration.

There are plenty of ways to boost performance other than simply throwing more transistors onto the die.

Two new machine learning tools from IBM and Google promise rapid development of speech-based apps and signal continued competition in a fierce market.

We're close to the next release of Apache Flink, the stream processing engine developed by data Artisans. Flink version 1.1.1 will bring a new SQL interface for working with streaming data, while bigger changes, like dynamic scaling, are set for the subsequent release. Meanwhile, Internet giants like Alibaba and Netflix are set to share their Flink stories at a conference next month. Flink is one of the upstarts in the battle of the big data frameworks that's currently raging across the Web and in the halls of big data development shops around the world.

Written in R, Python, Perl, C, and JavaScript, performing tasks such as web crawling, encryption, simulation, regression, NLP, or visualizations.

14 Useful Code Snippets

- Source code for video of 2-D random walk simulations
- Web crawler for clustering 2,500 data science websites
- Python code for our TwitterApp (API) extracting fastest growing Twitter profiles
- Palindrome String Detection in R
- Credit card number and password encoder / decoder
- Source code to compute all permutations of n elements
- R code for model-free, data-driven confidence intervals
- R tutorial to produce nice graphs and maps with 256 colors
- Ridge regression with bootstrap (C and Perl version)
- Simple solutions to make videos with R
- Block tolerant web scraper on AWS
- Simple source code to simulate nice cluster structures
- Simple JavaScript code to display and rotate ads on a website
- Perl code for credit card number validation

Other recommended reading

- Book: Superforecasting: The Art and Science of Prediction
- Big Data and Law Enforcement — a Marriage Made in H_______!

Data is an unusual currency. Most currencies exhibit a one-to-one transactional relationship. For example, the quantifiable value of a dollar is considered to be finite — it can only be used to buy one item or service at a time, or a person can only do one paid job at a time. But measuring the value of data is not constrained by those transactional limitations. In fact, data currency exhibits a network effect, where data can be used at the same time across multiple use cases thereby increasing its value to the organization. This makes data a powerful currency in which to invest.

Staying current in the programming field can sometimes make you feel like the Red Queen in "Through the Looking-Glass." She said, "It takes all the running you can do to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that!" You're a master at Ruby on Rails? Great, but how are you with statistical analysis in R? Want to work at Google? Forget Python and start building channels in Go.

Many organizations today are vulnerable to a variety of threats that, if successful, can put their customers, employees and brand at serious risk. Staying in front of these varying and evolving threats requires continuously monitoring and analyzing disparate data sets and then fusing the resulting insights into one intelligence picture. That picture gives organizations a comprehensive understanding of not just their threats, but also the vulnerabilities across their enterprise.

So, it's been a month since Hadoop Summit San Jose, where over 5,000 of the leading tech innovators in big data came together to share their inventions, wisdom and know-how. One of the sessions, a PowerPoint-free zone, was Data Hacks & Demos, a keynote session hosted by Joe Witt and starring an international…

Cryptographic identities coupled with a global IP Namespace capability also mean that two endpoints can open a secure communications channel.

FR8 Revolution is a company that was founded a year ago with the vision of bringing data-driven, cloud-based tools to the trucking market. It can be easy to forget about this industry when you shop at your favorite store, fast food outlet or e-commerce site, but in order to ensure your purchase happens seamlessly, a huge amount of logistics has to occur. It requires a lot of planning to ensure, for example, that a McDonald's outlet in New York City in the middle of winter has fresh lettuce and tomatoes and enough french fries for demand. Where that planning rubber literally and figuratively meets the road is in the trucking industry. The U.S. trucking industry alone is a $700 billion market and employs a staggering 3.5 million drivers. And while we've all been told a number of times that the future of trucking is autonomous vehicles, the truth behind that somewhat cliched statement is that it's going to take a huge amount of software to actually make that prophecy come true.

My friend Dan sent me this press release (since he knows that I like all things "Data Analytics" related). In the press release, "Boeing Announces Data Analytics Agreements with Six Airlines," Boeing announces that they are providing advanced analytic solutions to several airline customers including: All Nippon Airways (ANA) signed a renewal contract for Airplane Health Management (AHM) on its entire future fleet of Boeing 787 aircraft. ANA uses AHM tools to monitor their aircraft in real time and proactively manage maintenance operations more efficiently.

July was a busy month of big data solutions on the Big Data Blog. The month started with our most popular story yet, Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNE. It was a great post to start a spectacular month. Take a look at our summaries below. Learn, comment, and share. Thank you for reading the AWS Big Data Blog! Installing and Running JobServer for Apache Spark on Amazon EMR: In this blog post, learn how to install JobServer on EMR using a bootstrap action (BA) derived from the JobServer GitHub repository. Then, run JobServer using a sample dataset.

We recently had to decommission inside-r.org, the R resources site created by Revolution Analytics in 2010. Links to inside-r.org now redirect to MRAN, which now contains much of the material which…

"From coast to coast, the FBI and Securities and Exchange Commission have ensnared people not only at hedge funds, but at technology and pharmaceutical companies, consulting and law firms, government agencies, and even a major stock exchange." — Preet Bharara, U.S. Attorney for the Southern District of New York, 2013; while announcing charges in a massive… The post Ensuring Market Integrity & Investor Protection via Trade Surveillance – Part 1 appeared first on Hortonworks.

Manufacturers are embracing the Industrial Internet the same way consumers are leveraging Fitbits: to improve overall health and wellness. Both can provide consistent measurement and visibility, and suggest performance improvements customized to help reach goals. Fitbit users can view real-time data and make adjustments to increase their activity. In his session at Things Expo, Mark Bernardo, Professional Services Leader, Americas, at GE Digital, discussed how, by leveraging the Industrial Internet and mobile devices, manufacturers can use real-time production data to make adjustments that increase uptime and efficiency in operations. Whether it's personal wellness or improving the "wellness" of your manufacturing assets, to get "fit" and improve performance, you need to get connected, get insights, and get optimized.

Cloud analytics is dramatically altering business intelligence. Some businesses will capitalize on these promising new technologies and gain key insights that'll help them gain competitive advantage. And others won't. Whether you're a business leader, an IT manager, or an analyst, we want to help you and the people you need to influence with a free copy of "Cloud Analytics for Dummies," the essential guide to this explosive new space for business intelligence.

By supplying their own encryption keys, organizations can reduce the chance that an outside party will be able to gain access to their data, at least while it's at rest. Now Google's cloud platform offers the option.

Velostrata 2.0 makes it simpler to test the performance of an application workload on a particular cloud before committing to that platform.

Dell Services on Tuesday unveiled its new innovation lab for SAP HANA, intended to help its customers evaluate the SAP HANA in-memory relational database management system (RDBMS) and SAP S/4HANA business suite. The lab will also include a co-innovation lab to help customers develop and test solutions, and temporarily use the environment for upgrades and rollout projects. Dell Services is also offering a set of consulting services around the innovation lab to help customers identify potential value, create a business use case and develop zero-impact migration plans. "A lot of our existing SAP customers we service have questions," says Simon Spence, global director of the Dell SAP practice, Dell Services.

I've heard several clients complain about the curse of "orphaned analytics," which are one-off analytics developed to address a specific business need but never "operationalized" or packaged for re-use across the organization. Unfortunately, many analytic organizations lack a framework for ensuring that the analytics are not being developed in a void. Organizations lack an overarching model to ensure that the resulting analytics and associated organizational intellectual capital can be captured and re-used across multiple use cases. The post How to Avoid "Orphaned Analytics" appeared first on InFocus.




Data theft is rampant, and companies don't seem to know what to do to stop it, according to a recent study. This infographic breaks it down for IT professionals.

In anticipation of his upcoming conference co-presentation, Predictive Analytics for Stress Testing — Industry Challenges at Predictive Analytics World Financial in New York City, October 23-27, 2016, we asked Sanjay Gupta, Executive Vice President and Head of Model Development at PNC Bank, a few questions about his work in predictive analytics. Q: In your work with… The post Wise Practitioner – Predictive Analytics Interview Series: Sanjay Gupta at PNC Bank appeared first on Predictive Analytics Times.

Zaloni, the data lake company, announced a new solution, Zaloni's Data Lake 360°, to meet the needs of a growing number of enterprises that understand that the data lake is key to the future of the enterprise data ecosystem.

Also from the AWS Big Data Blog's July roundup: Process Large DynamoDB Streams Using Multiple Amazon Kinesis Client Library (KCL) Workers. A previous post described how you can use the Amazon Kinesis Client Library (KCL) and DynamoDB Streams Kinesis Adapter to efficiently process DynamoDB streams.