Big Data News – 28 Feb 2017

Today's Infographic Link: Why The Media Isn’t The “Enemy”

Featured Article
The information gathered about us by the internet giants makes our political system vulnerable to new forms of manipulation.

Top Stories
Many new machine learning technologies, architectures, and algorithms are being proposed, but here are three macro trends that will become game changers in ML.

You may have been successful with a center of excellence for your business intelligence or analytics practice. But big data success requires a different approach, a bus.

"We've killed more people because we didn't share data than because we did." This and other such memorable quotes were among my many takeaways from the HIMSS annual conference last week in Orlando. Along with the increase in healthcare consumerism, there is a growing awareness among health systems of the need to involve patients in their medical care. An important aspect of enabling that is to harness the continuous stream of data from wearables, sensors and other internet of things (IoT) devices that is now referred to as patient-generated health data, or PGHD. PGHD can provide valuable insights to clinicians in population health management (PHM) and personalized medicine.

Big data has long promised more than it delivers, at least for most enterprises. While a shift to cloud pledges to help, big data deployments are still more discussed than realized, with Gartner insisting that only 14 percent of enterprises have gotten Hadoop off the ground. Will the other darling of the chattering class, IoT (internet of things), meet the same fate? In fact, IoT might deliver, according to new data from Talend compiled in conjunction with O'Reilly. Dubbing 2016 "the year IoT 'grew up,'" the report declares 2017 the year that "IoT starts to become essential to modern business."

Augmented reality is seen as a stop-gap measure to VR, but this view overlooks that in the digital ecosystem it's the app that counts, not the tech.

The big news, not surprisingly, centers on 5G. The question is whether the announcements are more hype, more substance, or an equal measure of each.

It's a great time to be a data scientist. These professionals are responsible for helping gather and manage an organization's data in a way that's meaningful to business decision makers. The right candidates have experience working with major database platforms, are experienced coders and must have strong analytical, quantitative and problem-solving abilities. They're also in high demand, and generously compensated, with data scientist claiming the top spot on Glassdoor's best job of 2017 and Robert Half Technology's tech jobs to watch list.

SAP has unveiled a software development kit (SDK) that can be used to integrate native iOS applications with both SAP and third-party applications.

We don't know how long this has been going on, how companies are reacting, or what may be out there. We'll be hearing about Cloudbleed a lot in 2017.

A graph, a collection of nodes connected by edges, is just data. Whether it's a social network (where nodes are people, and edges are friend relationships), or a decision tree (where nodes are branch criteria or values, and edges decisions), the nature of the graph is easily represented in a data object. It might be represented as a matrix (where rows and columns are nodes, and elements mark whether an edge between them is present) or as a data frame (where each row is an edge, with columns representing the pair of connected nodes). The trick comes in how you represent a graph visually…
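The two representations described above can be sketched in a few lines of plain Python. This is a minimal illustration, not code from the linked article: the toy friend network and the helper names `edge_list_to_matrix` and `matrix_to_edge_list` are assumptions made for the example, and a list of tuples stands in for a data frame with one row per edge.

```python
# Hypothetical toy social network: each tuple is one undirected "friend" edge,
# i.e. one row of the edge-list (data frame) representation.
edges = [("ann", "bob"), ("bob", "cat"), ("ann", "cat"), ("cat", "dan")]


def edge_list_to_matrix(edges):
    """Build an adjacency matrix (list of lists) from an undirected edge list."""
    nodes = sorted({name for edge in edges for name in edge})
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for a, b in edges:
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1  # undirected: mirror the entry
    return nodes, matrix


def matrix_to_edge_list(nodes, matrix):
    """Recover one row per edge by reading the upper triangle of the matrix."""
    return [
        (nodes[i], nodes[j])
        for i in range(len(nodes))
        for j in range(i + 1, len(nodes))
        if matrix[i][j]
    ]


nodes, matrix = edge_list_to_matrix(edges)
for name, row in zip(nodes, matrix):
    print(name, row)
```

Both forms carry the same information; the matrix makes "is there an edge between these two nodes?" a direct lookup, while the edge list stays compact when most pairs of nodes are not connected.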

IT thinks of hybrid clouds as multiple server platforms linked to support a single app. The Nimble Storage definition is a storage-centric approach.

If you're a CIO or IT leader, you probably think you're doing a better job of giving company employees what they need than they think you're doing.

BARCELONA — Almost a year after SAP teamed with Apple to develop business applications for smartphones and tablets, the German enterprise software developer is ready to unveil the first fruits of their partnership. On March 30, it plans to release the first version of SAP Cloud Platform SDK for iOS, a tool to help businesses integrate Apple's handheld devices with their back-end information systems. And at Mobile World Congress in Barcelona this week it opened enrollment for SAP Academy for iOS, a mix of paid and free training services to help develop apps with that tool.

Dell is making the case for a standardized series of gateways based on Intel Atom processors that in turn connect to multiple sensors.

Executives at several top tech firms outline the skills they need now and in the near future, including IaaS and IoT security expertise.

For chip designers, justification for high-end models will have to be strong to maintain share of a market less interested in hardware details.

Flexibility to alter and extend your most vital software assets is more important today than ever before, giving open source software an enormous advantage. Traditional software vendors aren't able to keep up with what's happening. Just look at the reality as the world has rapidly gone digital: Business models are undergoing insane change. Companies that… The post Everyone Wants Flexibility And Freedom. Or Should. appeared first on Hortonworks.

Almost a year after SAP teamed with Apple to develop business applications for smartphones and tablets, the German enterprise software developer is ready to unveil the first fruits of their partnership. On March 30, it plans to release the first version of SAP Cloud Platform SDK for iOS, a tool to enable businesses to integrate Apple's handheld devices with their back-end information systems. And at Mobile World Congress in Barcelona this week it opened enrollment for SAP Academy for iOS, a mix of paid and free training services to help develop apps with that tool. It may have looked as though Apple were retreating from the enterprise when it axed its Xserve rack-mounted server line in 2011, but since then it has multiplied its partnerships with enterprise hardware, software and service vendors, most notably IBM in 2014, Cisco Systems in 2015 and, last year, SAP.

Two-thirds of companies that leverage advanced analytics strategies say they achieved revenue growth of 15% or more.

Data integration has always faced complex technical issues. Obviously the technically savvy will always be needed for many aspects of data integration, and require solutions that can handle complex problems. But this approach isn't enough for what businesses need and want today from Modern Data Integration. Making room for business ubiquity — business user participation and input — continues to be a challenge.

If data is important to your company, then creating a data governance program is as essential to the business as an accounting program.

New products of the week: our roundup of intriguing new products, including the ONLYOFFICE app for ownCloud. Read how to submit an entry to Network World's products of the week slideshow.

BARCELONA — Enterprises and cloud companies will start trying their hands at cellular this year, Nokia president and CEO Rajeev Suri predicts. "Enhanced reality" and events such as concerts may be where cloud giants first get into mobile services, Suri said at a Nokia event here on the eve of Mobile World Congress. "The first webscale players will enter the wireless access domain with mainstream technologies," Suri said. Webscale usually refers to operators of big clouds, like Google, Facebook, and Alibaba. Suri didn't name any names. For enterprises, an emerging technique called network slicing will allow them to virtually run their own private services on mobile operator networks. Meanwhile, systems that bring LTE to unlicensed or shared frequencies, like LAA (Licensed Assisted Access), will also help open doors to private cellular networks. Nokia is already working with some energy utilities on these kinds of deployments, and at MWC it will join Qualcomm in demonstrating a private LTE network, Suri said.

The fastest analytics now at the lowest price. You know that HPE Vertica delivers much faster analytical speed at scale than your data warehouse, database, or NoSQL solution. But can you quickly justify your initial purchase? You can now, but you'd better hurry. Take advantage of the time-limited Vertica March Madness Promotion and get HPE Vertica for just $1,000 per TB*! Designed exclusively for new HPE Vertica customers, the Vertica March Madness Promotion includes a one-year term license of HPE Vertica Enterprise Express Edition, pricing of $1,000 per TB for an unlimited number of TBs, and commercial technical support for the one-year term. There's never been a better time to join the thousands of data-driven organizations that make all of their business-critical decisions with HPE Vertica, so don't settle for good enough! Take the next step and act fast: this special offer expires on March 15, 2017!

Qlik® Is Positioned as a BI Leader Once Again: Gartner Magic Quadrant 2017. If you're shopping for a BI solution in 2017, Gartner's Magic Quadrant report is the perfect way to learn about important changes in the market. You'll also hear about one thing that's stayed the same: for the seventh year in a row, Qlik is positioned in Gartner's Leaders quadrant based on completeness of vision and ability to execute. Get the must-read BI report of the year, Gartner's Magic Quadrant for BI and Analytics Platforms.

Phone makers will seek to seduce new buyers with artificial intelligence functions and other innovations at the world's biggest mobile fair starting Monday in Spain.

Find out how Hershey's leveraged the Internet of Things, cloud computing, machine learning, and big data to regulate production at its factories, without hiring a data scientist.

Last night, Cloudflare and Google's Project Zero announced a security incident which may have affected Quantopian. We think it is highly unlikely that Quantopian has been adversely impacted, and we are taking several precautionary steps to reduce any potential impact. Quick Recommended Steps We've boiled it all down to these two steps. Change your Quantopian password at Then, if you are using two-factor authentication (2FA), disable and re-enable 2FA and download a new recovery code.

Facebook is a famously data-driven organization, and an important goal in any data science activity is forecasting. Now, Facebook has released Prophet, an open-source package for R and Python that implements the time-series methodology that Facebook uses in production for forecasting at scale. Prophet has a very simple interface: you pass it a column of dates and a column of numbers, and it produces a forecast for the time series. In the example chart from the original post, the black dots are the number of views of Peyton Manning's Wikipedia page through the end of 2016; the blue region is a forecast (with uncertainty interval) into 2017. The Prophet forecast automatically detects the seasonal cycles (presumably related to NFL seasons). The prophet function also provides options to explicitly model weekly and/or yearly seasonality, to account for holidays, and to specify changepoints where discontinuities in the time series are expected.

News this week included 5G tests, a new Wi-Fi spec, mobile video popularity and smart mobility efforts.

If a CEO wants to know the state of their business, they ask their highest ranking executives. These executives, in turn, should know the state of the business through reports from their subordinates. This structure is roughly analogous to a process observed in deep learning, where each layer of the business reports up different types of observations, KPIs, and reports to be interpreted by the next layer of the business. In deep learning, this process can be thought of as automated feature engineering. DNNs built to recognize objects in images may learn structures that behave like edge detectors in the first hidden layer. Succeeding layers learn to compose more abstract features from lower level outputs. This episode explores that analogy in the context of automated feature engineering. Linh Da and Kyle discuss a particular image in this episode. The image included below in the show notes is drawn from the work of Lee, Grosse, Ranganath, and Ng in their paper Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations.
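To make the layering idea concrete, here is a toy sketch in plain Python. It is not taken from the episode or the cited paper: the filter weights are picked by hand rather than learned, the 4x6 "image" is made up, and the function names are assumptions for illustration. It only shows how a first layer can respond to edges and a second layer can combine those responses into a more abstract feature.

```python
def convolve_rows(image, kernel):
    """Slide a 1-D kernel along each row, keeping only fully covered positions."""
    k = len(kernel)
    return [
        [sum(row[c + i] * kernel[i] for i in range(k)) for c in range(len(row) - k + 1)]
        for row in image
    ]


def relu(feature_map):
    """Keep positive responses and zero out the rest."""
    return [[max(0, value) for value in row] for row in feature_map]


# Toy 4x6 image: dark (0) on the left half, bright (1) on the right half.
image = [[0, 0, 0, 1, 1, 1] for _ in range(4)]

# "Layer 1": a hand-picked filter that fires on dark-to-bright steps.
# In a real network these weights would be learned from data.
edge_kernel = [-1, 1]
layer1 = relu(convolve_rows(image, edge_kernel))

# "Layer 2": combines layer-1 outputs into a more abstract feature,
# "a vertical edge that spans every row at this column".
layer2 = [min(row[c] for row in layer1) for c in range(len(layer1[0]))]

print(layer1[0])  # per-row edge response
print(layer2)     # column-wise "full vertical edge" response
```

In the business analogy, layer 1 plays the role of the subordinates' raw reports and layer 2 the executive summary built from them.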

Survey by Atlassian finds wide worker expectations of working alongside AI software and discomfort with the prospect.

Both Gartner and International Data Corporation (IDC) project huge increases in the public cloud market, according to new studies released by both firms this week.

The Carrier Ethernet (CE) market is either doing well or undergoing challenges. It all depends on who you talk to. The major players weigh in.

Network access control software that Aruba sells under the ClearPass Security brand will be informed by machine learning, Big Data analytics by Niara.

Yes, Big Iron can do Big Data and Machine Learning, even while it keeps chugging away at its appointed transactional tasks. In fact, putting the two together makes all kinds of sense.

Recruiting firm Randstad's annual salary survey shows data pros are in the upper stratosphere, and DevOps experience is starting to justify a premium.

From the local dive bar to the world's largest beer brands, big data empowers bar owners and brewers with information to better serve their customers, to reduce waste, and to build their bottom line.

It's essential to adopt a broader scope about data and analytics, create a very flexible and agile IT framework, and build a strong foundation for data science.

President Donald Trump said this week that the federal budget is a "mess" and is promising to make it leaner. This means that federal IT spending — now at $81.6 billion — is likely to see cuts, analysts said. The Trump administration is still filling top technology policy positions, including replacing former federal CIO Tony Scott, who left last month. Scott, a former CIO of Microsoft and The Walt Disney Co., was appointed by President Barack Obama in February 2015. For now, all eyes are on former U.S. Rep. Mick Mulvaney (R-S.C.), Trump's just-confirmed budget director. Elected in 2010, Mulvaney was part of the Tea Party wave and a member of the conservative House voting block, the Freedom Caucus.

Predictive analytics are used in the connected building industry to find trends and alert Trane customers and service technicians on what needs action. This impacts the bottom line of a company.

Make no mistake, this is not a random academic pursuit. It is of utmost importance in an era where automation is the next big thing.

The advisor, called AISHA, will provide a chat function through which customers can get advice on their fashion and apparel requirements.

Data is the single biggest raw asset that BASF has, according to Frithjof Netzer, chief digital officer for BASF. The company has a process to farm, refine and market its data. BASF also uses an in-house process called Innorate to create disruptive business ideas.

The rationale for shared infrastructure is simple: Service providers and carriers increasingly seek to offer both wireless and wired services.

The internet is a tough place to have a conversation. Abuse has driven celebrities and ordinary folks from social media platforms that are ill-equipped to deal with it, and some publishers have switched off comment sections. That's why Google and Jigsaw (an early stage incubator at Google parent company Alphabet) are working on a project called Perspective. It uses artificial intelligence to try to identify toxic comments, with an aim of reducing them. The Perspective API released Thursday will provide developers with a score of how likely users are to perceive a comment as toxic.  In turn, that score could be used to develop features like automatic post filtering or to provide users with feedback about what they're writing before they submit it for publication. Starting on Thursday, developers can request access to Perspective's API for use in projects they're working on, and Jigsaw will approve them on a rolling basis.

How would you feel about an artificial intelligence system handling your taxes this year? H&R Block, the tax services company, is betting that customers will be willing to have A.I. assist their human tax preparers in getting them the biggest refunds possible or at least reduce how much they owe. The company, which has about 12,000 offices in the U.S. and prepares 24.2 million tax returns worldwide, is using IBM's A.I.-based Watson to do it. "We are introducing something this tax season that is totally new, and is in fact, a first in the tax preparation category," said Bill Cobb, H&R Block's president and CEO, in a statement. "By combining the human expertise, knowledge and judgment of our tax professionals with the cutting-edge cognitive computing power of Watson, we are creating a future where our clients will benefit from an enhanced experience and our tax pros will have the latest technology to help them ensure every deduction and credit is found."

One person can't call an industry dead and make it happen unless the vendors agree. HP Inc. is a great example of a firm rejecting that pronouncement.

After more than a year in preview, R Tools for Visual Studio, the open-source extension to the Visual Studio IDE for R programming, is nearing its official release. RTVS Release Candidate 1 is now available for download, giving you the opportunity to try out the new features ahead of the official announcement.

Listen to Preetam Kumar as he speaks about how you can solve the real-world optimization problems with IBM Decision Optimization on Cloud.

I decided to take a break from my Cybersecurity Architecture series and CISO's View series to give my thoughts on this year's RSA conference while things are still fresh.  First off, I enjoyed meeting with old colleagues and many security people that I respect which justified the trip as far as I'm concerned.  I'm really amazed… The post RSA 2017: A Sea of Solutions needing Big Data appeared first on Hortonworks.

The risk is rising, but the medical industry is better recognizing the risk to patient data and is stepping up its cybersecurity efforts.

Whatever the details, Robotic Process Automation, RPA, as a category is expanding rapidly, and possibly freeing humans for more sophisticated jobs.

By providing features for large team collaboration such as native multi-user security, Spark pipelines, enriched dashboards, 3rd party integrations with Slack and Hipchat for activity notifications, interactive hierarchical clustering, and a plethora of new features, Dataiku DSS 4.0 improves the ability for organizations to develop and manage enterprise data science projects.   New York, NY – February 23 – Dataiku, the maker of the enterprise-grade platform for data teams, Dataiku Data Science Studio (DSS), has today announced the release of Dataiku DSS 4.0, which introduces new functionalities that improve the production, development, and management of enterprise data science projects.  

Tata's MOVE platform lets IT use an existing programmable network that comes complete with a pre-defined suite of application programming interfaces.

With the introduction of the Hortonworks Data Cloud (HDCloud), deploying clusters and starting to process data has become an order of magnitude faster. When Apache Hadoop evolved from being an on premise solution to a cloud based solution, the time it took to make a cluster went from weeks to days. The same magnitude of… The post Accelerating Time to Market with Hortonworks Data Cloud appeared first on Hortonworks.

Go into your search well-informed with respect to what IT pros in your area are making, with detailed info from the Randstad 2017 Salary Guide.

Okay, I am weird (tell me something that I don't know, say most of my friends). For Christmas I wanted a Nike Apple Watch to go with my existing FitBit and Garmin fitness trackers (I look sort of like a cyborg in the photo below…which is always cool).

Like agile software development, data science works best when models can be tested and iterated rapidly. The latest release from data science collaboration tool Dataiku adds integrations with GitHub and HipChat that will not only bring developers and data scientists together, but hopefully, some of their DevOps discipline as well.

The online user content capture system, Evernote, has moved 2.9 PBs of data into the Google Cloud and shut down its primary data center.

Using Text-based Predictive Models to Find New Opportunities for B-to-B Business, at Predictive Analytics World San Francisco, May 14-18, 2017, we asked Michael Dessauer, Data Scientist at The Dow Chemical Company, a few questions about his work in predictive analytics. Q: In your work with predictive analytics, what behavior or outcome do your models predict?

Between the weekly reports on plant performance, supplier KPIs and inventory levels, more data may be the last thing supply chain managers want to crunch.

Embracing the Multi-Cloud Approach

If self-service business intelligence initiatives are on your agenda, follow these 10 best practices for ensuring proper governance.

InsideSales is one of an increasing number of players that aims to offer sales teams tools to make them more efficient and effective. In its case, InsideSales came about through the post-graduate research of co-founder Dave Elkington. As Elkington studied artificial intelligence, he soon came to realize that A.I. has existed for decades: the math behind A.I. was used back in the mid-20th century by researchers at companies such as IBM. What is different today, and what gives disruptive companies the power to undermine their more conservative competitors, is the access to data. As Elkington sees it, Netflix's ability to put Blockbuster out of business was a direct result of Netflix's intentional strategy to amass information about its customers and, in doing so, to give its own predictive algorithms the best possible source data to tune its suggestions.

The big data pipeline is getting more crowded. Learn how to improve your company's big data throughput when going over the public internet.

With its Ryzen launch, AMD is avoiding making a strategic mistake it has made several times in the past, says analyst Rob Enderle.

If you're an Excel user (or a user of any other spreadsheet, really), learning R can be hard. As this blog post by Gordon Shotwell explains, one of the reasons is that simple things can be harder to do in R than in Excel. But it's worth persevering, because complex things can be easier. While Excel (ahem) excels at things like arithmetic and tabulations (and some complex things too, like presentation), R's programmatic focus introduces concepts like data structures, iteration, and functions. Once you've made the investment in learning R, these abstractions make it possible to reduce complex tasks into discrete steps, and make automating repeated similar tasks much easier. For the full, compelling argument, follow the link to the Yhat blog, below. Yhat blog: R for Excel Users

Apple this week took administrative control of the domain, the last notable web address it did not govern that users could have linked with its online sync and storage service. According to WHOIS searches today, Apple acquired control of on Tuesday. Apple already ruled the primary top-level domains for iCloud, the cross-device, cross-OS service that stores files generated by iOS and macOS, and more importantly, synchronizes everything from Safari browser bookmarks to photographs between iPhones, iPads and Macs. Apple is on record as the owner of the domains,, and, for example.

IoT networks are unique: They will be worldwide, required to function where no established network exists, and have stringent power requirements.

Amazon Web Services is the consensus leader of the IaaS public cloud computing market according to industry watchers, but they credit Microsoft for closing the gap with Azure and say Google with its Cloud Platform has made considerable strides as well.

Prescriptive analytics (optimization) is a sophisticated analytics technology. It can deliver great business value by helping decision makers handle the tough trade-offs that arise when limited resources force choices among options. Optimization was traditionally applied by Operations Research professionals to solve operational problems, such as route optimization and logistics planning. With the advent of new technologies that make it possible to model larger, enterprise-wide problems, and provide broad support for what-if analyses, Prescriptive analytics now enables a new class of business analytics applications.
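The kind of trade-off described above can be illustrated with a very small example. This is a hedged sketch only: the option names, costs, values and budget are invented, and the brute-force search stands in for the mathematical programming solvers that real optimization products use, which scale to far larger problems.

```python
from itertools import combinations

# Hypothetical options: name -> (cost, expected value), in arbitrary units.
options = {
    "route_a": (4, 10),
    "route_b": (3, 7),
    "route_c": (2, 4),
    "route_d": (5, 12),
}
budget = 9  # the limited resource that forces a choice


def best_selection(options, budget):
    """Try every subset of options and keep the most valuable one within budget."""
    best_value, best_combo = 0, ()
    names = sorted(options)
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            cost = sum(options[name][0] for name in combo)
            value = sum(options[name][1] for name in combo)
            if cost <= budget and value > best_value:
                best_value, best_combo = value, combo
    return best_combo, best_value


chosen, total_value = best_selection(options, budget)
print(chosen, total_value)
```

Changing `budget` and rerunning is a crude form of the what-if analysis the paragraph mentions: the recommended selection shifts as the resource constraint tightens or loosens.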

We will see shifts in ransomware, including how attackers target victims, who they'll target, and the role IoT will play in ransomware.

Using IBM Counter Fraud Management (CFM), an insurer can improve the operational effectiveness of its fraud prevention program and drive impressive fraud savings. IBM Watson helps insurers detect, respond to and stop fraud with the ability to tap unstructured data.

In today's digital transformation, achieving the desired user outcome is now the driver of technology decisions, rather than the other way around.

Did political analytics firm Cambridge Analytica exaggerate the role analytics played in helping Donald Trump win? Crowdskout CEO Zack Christenson separates election tech facts from fiction.

In my spare time, I teach a fitness class that is offered in gyms around the world, and there are thousands of instructors worldwide. We all receive the same training, the same music, the same choreography. We are expected to deliver the same consistent experience to participants the world over. The expectation is that it… The post Stacking the Open Source Odds For Success appeared first on Hortonworks.

Governing bodies are pushing for an open standard to make it simpler for e-signatures to move through a process spanning apps from multiple vendors.

What are the limits of AI? And how do you go from managing data points to injecting AI in the enterprise?

Alex Sakaguchi, director, solutions marketing, Veritas: Company is committed to extend reach of software across as many relevant clouds as possible.

Before CDH 5.10, every CDH cluster had to have its own Apache Hive Metastore (HMS) backend database. This model is ideal for clusters where each cluster contains the data locally along with the metadata. In the cloud, however, many CDH clusters run directly on a shared object store (like Amazon S3), making it possible for the data to live across multiple clusters and beyond any cluster's lifespan. In this scenario clusters need to regenerate and coordinate metadata for the underlying shared data individually. The post How To Set Up a Shared Amazon RDS as Your Hive Metastore appeared first on Cloudera Engineering Blog.

About two years ago, Hortonworks donated the entire code base of about 440,000 lines from its XA Secure acquisition to the Apache Software Foundation (ASF) in order to help jump start Apache Ranger as an Apache Incubator project. Hortonworks made this decision because our enterprise customers need an extensible and robust open source security framework… The post It's Morphing Time: Apache Ranger Graduates to a Top Level Project – Part 1 appeared first on Hortonworks.

Mandy Chessell, Distinguished Engineer & Master Inventor, discusses four key perspectives on Data Lakes and introduces a new video series.

Microsoft has launched Project Sangam, a cloud service integrated with LinkedIn that will help train and generate employment for middle and low-skilled workers. The professional network that was acquired by Microsoft in December has been generally associated with educated urban professionals but the company is now planning to extend its reach to semi-skilled people in India. Having connected white-collared professionals around the world with the right job opportunities and training through LinkedIn Learning, the platform is now developing a new set of products that extends this service to low- and semi-skilled workers, said Microsoft CEO Satya Nadella at an event on digital transformation in Mumbai on Wednesday.
