Big Data News – 11 Jan 2016

There's good news if you're looking for a job in data science in 2016: the number of job openings in the field appears to be rising as companies look to leverage big data for competitive advantage. But actually landing a coveted data science job means having the right mix of skills, and you may be surprised to learn what skills are most in demand by employers. The folks at CrowdFlower recently did an analysis of the 3,490 postings for data science jobs on LinkedIn, and sorted out the top 21 individual skills that appear most often. Some of the results were not earth-shattering (SQL topped the list, to nobody's great surprise), while other results could be leading indicators of how the data science field is evolving. As mentioned, SQL was the most commonly cited skill, and is a requirement in 57 percent of all LinkedIn job postings for data science. Hadoop came in at number two, with a solid 49 percent rating. This did not surprise Lukas Biewald, CEO and founder of CrowdFlower, the San Francisco company whose workers create high-quality reference data that data scientists use to train analytic models. "It's not surprising the two skills that are at the top are SQL and Hadoop, which are the technologies that actually store the data," Biewald tells Datanami. "Every data scientist has to know how to get the data out.

Top Stories
VB INSIGHT: Oracle's acquisition of AddThis on Tuesday — for a rumored $175 million — brings the data giant one step closer to becoming the Mordor of marketing clouds. Oracle has a persistent pattern of expanding its capabilities by buying companies, giving them a "one ring to rule them all" platform for marketing technology. AddThis is best known for its iconic social sharing buttons on websites.

Historically, data and analytics have been key to the success of manufacturing. The biggest contributor to the success of the 100+ year old assembly line technology was the development of interchangeable parts. Clearly, data was central to this concept. It was really the emphasis on making interchangeable parts that led to the invention of statistical… The post Predictive Analytics Can Help with the Challenges Facing Manufacturing in the 21st Century appeared first on Predictive Analytics Times.

In the talk below, Recursive Deep Learning for Modeling Compositional and Grounded Meaning, Richard Socher, founder of MetaMind, describes deep learning algorithms that learn representations for language that are useful for solving a variety of complex language tasks.

In this special guest feature, Agata Kwapien of datapine explains the virtues of visual reporting tools and methods as an alternative to legacy processes.

In the world of Digital Transformation, UX and CX are extremely important for keeping all the stakeholders engaged. But what is the difference between them? User Experience (UX) is when a customer starts…

Regular readers of this blog know of my fondness for Daft Punk's already-classic Get Lucky, so this video of the song as played by 10 famous guitarists was an instant win for me. Hope you enjoy it…

Each site that makes use of OAuth to authenticate an end user only receives a token via the OAuth application programming interface (API).

What can predictive analytics do for a sports club? Behavior-based predictive analytics technologies provided by IBM are poised to help the Ottawa Senators professional hockey club understand better than ever before the individual consumers who make up its fan base.

EXCLUSIVE: Data analytics startup Mixpanel yesterday laid off nearly 20 employees, primarily in sales, VentureBeat has learned. The layoffs, which multiple sources familiar with the matter confirmed to VentureBeat, don't spell doom for the company.

Carl Weinschenk does a roundup of the news this week on the state of BlackBerry, wearables and SD-WANs.

This was the subject of a popular discussion recently posted on Quora: 20 questions to detect a fake data scientist. We asked our own data scientist, and he came up with a very different set of questions: compare his answer (#1 below – 20 questions) with Quora replies (#2 and #3 below – 30 questions). Note that #2 focuses on statistics, and #3 on architecture. The link to the original Quora discussion is also provided in this article. Which questions would you add or remove?

Samsung is betting that demand for storage that goes beyond what comes in the typical PC device will increase considerably.

An increasingly analytics-based infrastructure management stack is all but inevitable in the enterprise.

A new report issued this week by the Federal Trade Commission applauds big data practitioners for helping to increase access to goods, services, healthcare, education, and employment. But the FTC also warned businesses and organizations to be aware of "hidden biases" that may creep into their calculations. In "Big Data: A Tool for Inclusion or Exclusion," a 50-page report you can download here, the FTC looks at both sides of the big data aisle. On the plus side, big data can benefit people of all stripes, according to the FTC.

EnterpriseDB® (EDB™), a leading enterprise Postgres database company, announced the general availability of PostgreSQL 9.5, released by the Postgres community. The new 9.5 version boosts performance and scalability, and enhances productivity with data analytics.

The scientific process has been going through a welcome period of introspection recently, with a focus on understanding just how reliable the results of scientific studies are. We're not talking here…

List: With the EU's General Data Protection Regulation looming, it's important to get the ball rolling on how you will comply with the rules.

Analysis: Prohibitive pricing is resulting in a failure to democratise data past the data analyst.

The following post was originally published in the Ibis project blog. (Ibis is a data analysis framework incubating in Cloudera Labs that brings Apache Hadoop scale to Python development.) The new Apache Kudu (incubating) columnar storage engine, together with the Apache Impala (incubating) interactive SQL engine, enables a new fully open source big data architecture for data that is arriving and changing very quickly.

VB WEBINAR: Email marketing is the undisputed marketing ROI leader, the preferred channel for your customers–so why are you losing money? Join our live webinar to find out from the experts how personalization can dramatically increase open rates, click-through rates (CTR), and revenue. Register here for free!  Consumers are getting fed up with the barrage of irrelevant marketing messages cluttering up their inboxes, and marketers are losing their opens, their click-throughs, and their subscribers in droves. VB Insight analyst Andrew Jones dug deep into the problem.

The rise of converged infrastructure, flash and hybrid cloud is disrupting what has mostly been a static industry.

Personalization efforts have served as a vanguard for big data at many organizations. But some of these programs still fall short. Here's a closer look at what's going wrong. Does your enterprise fit these profiles?

The year 2015 was really just an extension of the year before in terms of cyber security. It featured lots of high-profile data breaches, some garden-variety malware attacks, and a lot of attention on cryptocurrencies like Bitcoin. But just as disruptive technologies like cloud, mobile, and data analytics have finally begun to settle into the humdrum of mainstream acceptance, the latest innovations in cyber-crime are taking new forms, as well.

Just as today's businesses need to speed operations and cut costs, the same goals hold true for major sporting events such as the Australian Open Tennis Championships. To help make the Australian Open 2016 event the ultimate experience for online fans and the fans in attendance, see how the IBM Bluemix platform and streaming analytics bring all the action to the cloud.

Effective sales compensation is vital to business growth–but outdated processes and systems offer only inefficiency. Updating to current sales performance management techniques can help organizations optimize sales and contribute to success.

If you are actively destroying someone's ability to earn a living, asking them for any help is not likely to be very successful.

Cognitive analytics is innovating and evolving rapidly. Expert predictions in this area are essential for organizations that plan to leverage cognitive analytics in their big data analytics strategies in 2016 and beyond.

Following other AI and facial recognition acquisitions during the last few months, Apple has now snapped up Emotient, a San Diego-based startup that specializes in face-based emotional analysis.

In anticipation of his upcoming conference presentation, Predictive Sales Targeting in the Energy Industry, at Predictive Analytics World San Francisco, April 3-7, 2016, we asked Nate Watson, President at Contemporary Analysis, a few questions about his work in predictive analytics. Q: In your work with predictive analytics, what behavior or outcome do your models predict? A: We… The post Wise Practitioner – Predictive Analytics Interview Series: Nate Watson at Contemporary Analysis appeared first on Predictive Analytics Times.

In the slide deck presentation below, Jake VanderPlas discusses how you can use your coding skills to "hack statistics": to replace some of the theory and jargon with intuitive computational approaches such as sampling, shuffling, cross-validation, and Bayesian methods.
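To make the "shuffling" idea concrete, here is a minimal sketch of a permutation test written for this digest, not taken from the slide deck. The function name `permutation_test`, its parameters, and the sample data are all hypothetical. It assumes the question is whether two groups differ in their mean, and it estimates a p-value by repeatedly shuffling the pooled values and re-splitting them.

```python
import random


def permutation_test(a, b, n_shuffles=10_000, seed=0):
    """Estimate a two-sided p-value for the difference in group means.

    Hypothetical helper: the labels are shuffled n_shuffles times and we
    count how often the shuffled difference is at least as large as the
    observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # break any link between value and group
        new_a, new_b = pooled[:len(a)], pooled[len(a):]
        diff = abs(sum(new_a) / len(new_a) - sum(new_b) / len(new_b))
        if diff >= observed:
            hits += 1
    return hits / n_shuffles


# Made-up integer scores for two groups, purely for illustration.
group_a = [1, 2, 3, 4, 5]
group_b = [101, 102, 103, 104, 105]
p_value = permutation_test(group_a, group_b)
```

A small `p_value` suggests the observed gap would be unusual if group membership did not matter. The appeal of this approach is that it needs no distributional formula, only a loop and a shuffle.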

Hadoop and big data technologies, in general, have changed the rules for not only data management but also data integration. Business critical insights derived from advanced analytics by combining data sourced across Apache Hadoop, cloud, and back-office systems and delivered in real-time to business users enable organizations to deliver superior products and services, and provide them with the agility to function more efficiently.

Those who follow big data technology news probably know about Apache Spark, and how it's popularly known as the Hadoop Swiss Army Knife. For those not so familiar, Spark is a cluster computing framework for data analytics designed to speed up and simplify common data-crunching and analytics tasks. Spark is certainly creating buzz in the big data world, but why? What's so special about this framework? Spark is Speedy: Spark is incredibly fast.

The biggest thing you need to know about Hadoop is that it isn't Hadoop anymore. Between Cloudera sometimes swapping out HDFS for Kudu while declaring Spark the center of its universe (thus replacing MapReduce everywhere it is found) and Hortonworks joining the Spark party, the only item you can be sure of in a "Hadoop" cluster is YARN. Oh, but Databricks, aka the Spark people, prefer Mesos over YARN — and by the way, Spark doesn't require HDFS.

News: Latest version from EnterpriseDB includes a focus on big data and the enterprise with row-level security and BRIN indexing.

According to the CMO Council, on average 60% of marketers' time was devoted to digital marketing activities. Digital marketing saw many highlights this year, such as Tesco's version of its own shopping app on Google Glass, Facebook limiting promotional content in its News Feed, and YouTube showing further platform strength by generating over 1 billion views. By September, Instagram introduced new worldwide advertising capabilities and Google welcomed advertising capabilities like lookalike campaigns. The industry was truly booming. Marketers, how did 2015 go for you? Feel successful? Can you be even more successful?

PyCUDA is a great library if you want to use GPU computing with NVIDIA chips. If you want a more portable approach or if you have ATI chips instead of NVIDIA, then you might consider PyOpenCL instead of PyCUDA.

If your organization overlooks verbal communication as an analytics source, it's missing an opportunity for a competitive advantage.

Talk to any marketer with massive volumes of data to analyze, and there's a good chance you'll find Hadoop running somewhere in the background. With those users in mind, Manthan on Thursday introduced what it calls the first "bolt-on" customer-analytics tool designed specifically for the popular big-data framework. Customer360 is Manthan's longstanding customer-analytics offering, and it offers an array of packaged descriptive, predictive and prescriptive analytics capabilities. Algorithms are optimized for Apache Spark, giving marketers the speed of that data-processing tool as they analyze terabytes of data to do things like make campaign decisions or predict churn.

Thanks to MeriTalk for featuring this Q&A on their blog and allowing us to syndicate it here. By Dan Verton and Diana Manos, MeriTalk. Advancements in big data and analytics have given agencies an unprecedented ability to leverage information to drive actionable insight, to understand constituent needs, and to provide personalized services, from real-time traffic and weather updates to neighborhood crime rate analysis and preventative health care research.

Amazon's jump into the chip business won't change what's in Fire devices — for now — but it'll help the retailer drive more media delivery, file storage and cloud systems in homes and data centers. Annapurna Labs, an Amazon subsidiary, said it would start selling a line of ARM-based chips for hardware that handles 4K video delivery, storage, IoT, cloud and networking. The chips will be sold to makers of products for homes and data centers. The announcement surprised many, since selling chips is a radical shift from Amazon's bread-and-butter retail business. But the company has jumped outside its comfort zone before, dabbling in new businesses such as Web hosting with AWS (Amazon Web Services), which has become a runaway success.

The observable universe became a bit more manageable this week when international astronomers moved to get their arms around their massive datasets through a partnership with the developers of the integrated Rule-Oriented Data System, or iRODS. Data scientists who developed the open source iRODS framework said it would provide astronomers studying the origins of our galaxy and mysterious dark matter with a better way to query their huge data volumes, store and retrieve data and metadata as well as transport files and images.

An email marketing strategy can result in a 4300 percent return on investment (ROI). If you're not seeing that kind of return, give these tips a try.

Rather than build an online app store from the ground up, Wacom chose to build on top of application store technology from AppDirect.

Making the IT/business relationship work takes time and focus, but an investment in having the right conversations will pay great dividends.

Connected homes and appliances are rapidly increasing as consumers recognize the possibilities of how a connected home can simplify life. In the next four years, more than 470 million connected appliances are expected to be installed in homes worldwide. Take a look at this estimated growth, and learn more about how the connected world is poised to significantly change daily life.

Researchers at the University of Arizona's Artificial Intelligence Laboratory are trying to get into the minds of hackers to anticipate their plan of attack.

The more we know, the more seriously cybersecurity will be taken. It would be helpful if agencies like the SBA put their own house in order.

As time passes, what a smart city really is will become more clear, and start to bring real services to the people who live and work in those cities.

This is an exciting time to be in enterprise software. With the cloud and software as a service, we have a new chance to get it right.

By Matt Holzapfel, procurement/sourcing product lead, who previously held positions in Strategy at Sears Holdings and Strategic Sourcing at Dell. Best-in-class procurement performers are quietly becoming a competitive advantage for their organizations by using new technology to capture hard-to-find improvement opportunities. In this series of blog posts, we'll examine the value procurement organizations are gaining by accelerating their technology agenda. A comprehensive study done by AT Kearney in late 2014 showed that 75% of procurement organizations have not improved their productivity since 2011.

Apple has acquired Emotient, a startup with technology for recognizing emotions in people's faces in video content, according to a report today. This is Apple's latest in a series of recent acquisitions of companies well versed in artificial intelligence. In reporting the deal, the Wall Street Journal did not say how much Apple paid for Emotient.

If your company uses big data, be aware: the FTC is watching, and it's concerned. For all its potential benefits, big data can lead to discrimination and worsen economic disparity, the Federal Trade Commission warned in a new report that includes caveats and guidelines for businesses. Entitled "Big Data: A Tool for Inclusion or Exclusion?" the report stems from a 2014 FTC workshop by the same name and incorporates the public comments that followed. Among the report's conclusions is that big data can benefit under-served populations through better opportunities for education, credit, health care and employment. On the flip side, however, it can lead to reduced opportunities and the targeting of vulnerable consumers for fraud and higher prices.

There are a number of R packages devoted to sophisticated applications of Markov chains. These include msm and SemiMarkov for fitting multistate models to panel data, mstate for survival analysis applications, TPmsm for estimating transition probabilities for 3-state progressive disease models, heemod for applying Markov models to health care economic applications, HMM and depmixS4 for fitting Hidden Markov Models, and mcmc for working with Markov chain Monte Carlo.
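For readers new to the topic, the basic quantity behind these packages is the transition probability between states. The sketch below is a hypothetical Python illustration, not the API of any of the R packages named above. The function name `estimate_transition_matrix` and the three-state sequence are made up. It assumes a single fully observed sequence of states and estimates each transition probability as a simple proportion of observed moves.

```python
from collections import Counter, defaultdict


def estimate_transition_matrix(sequence):
    """Estimate transition probabilities from one observed state sequence.

    Hypothetical helper: counts each (current, next) pair and divides by
    the number of times the current state was left.
    """
    counts = defaultdict(Counter)
    for current, nxt in zip(sequence, sequence[1:]):
        counts[current][nxt] += 1
    return {
        state: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for state, row in counts.items()
    }


# Made-up observations from a 3-state progressive model.
observed = ["healthy", "healthy", "ill", "ill", "dead"]
matrix = estimate_transition_matrix(observed)
# matrix["healthy"] == {"healthy": 0.5, "ill": 0.5}
```

Real panel data is usually observed at irregular times and with censoring, which is where dedicated packages earn their keep. This sketch only covers the fully observed, discrete-time case.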

Discover how the most effective financial institutions are outthinking financial criminals and winning the fight against fraud.

SPONSORED: This sponsored post is produced in association with Born2Global. Twenty years ago, South Korea pulled out of the Asian Financial Crisis in only three years–and its economy has been thriving ever since. With that same drive, South Korea has begun to position itself as the Asian startup hub, with the express intention of not only boosting the economy, but of sparking the interest of global venture capitalists.

We take pride in producing valuable technical blogs and sharing them with a wider audience. Of all the blogs published in 2015 on our website, the following were most popular: Take a look at 5 techniques enabling Hive to support both batch and interactive workloads at speed and scale. 5 Ways to Make Your Hive Queries Run Faster  See how Apache Storm and Apache Kafka work together. 

While fraud instances are down from a few years ago, fraud continues to be a prominent problem. The following are five steps that banks can take to reduce fraud incidents and detect threats faster.

Join us at the first Predictive Analytics Times Executive Breakfast of the new year. Gather at the San Francisco Marriott Marquis on Tuesday, April 5, 2016 at 8:00 AM to discover how predictive analytics works, and the ways in which it delivers value to organizations across industry sectors.

How data is stored, analyzed and processed is transforming businesses. According to MapR Technologies' CEO and Cofounder John Schroeder, the industry is in the midst of the biggest change in enterprise computing in decades.

In this special guest feature, Delroy Cameron, Data Scientist at URX, explores how brands and publishing platforms can benefit from big data provided by machine-learning powered knowledge graphs.

VB EVENT: Modern marketing can sound like a marriage of bloated buzzwords ("hyperlocal synergies," anyone?) or a deep dive into big data, where pools of analytics never let you come up for air.

Cloudability is moving to extend visibility into the cost of cloud computing by acquiring a provider of business intelligence software.

Many companies are still in the early stages of IoT adoption. However, by looking at IoT strategies that are already working, business managers might reconsider adopting the technology to grow their business internally.

Microsoft is killing support for Internet Explorer 8, 9 and 10, finally. There'll be no more patches after January 12: It's a mercy killing. There are a few Windows versions where you need to keep running old versions of IE, so Redmond is making an exception. But Microsoft really really wants you to upgrade to Windows 10 and Edge, anyway. In IT Blogwatch, bloggers go check on grandma's PC. Your humble blogwatcher curated these bloggy bits for your entertainment.

Recently we announced how Impala provides a compelling new platform for Data Discovery and Analytics. Today, we are happy to host this guest blog from Shant Hovsepian, CTO and Co-founder of Arcadia Data, a contributor to the Apache Impala project.

Amazon has officially entered the semiconductor market with the news that it is selling its own brand of ARM-based chips to manufacturers. U.S. tech companies went transatlantic on more than a few occasions last year in their search for innovation, but one of the more curious cases was when Amazon headed to Israel and snapped up Annapurna Labs for a reported $350 million. The chip-design startup was founded in 2011, though it had largely operated in stealth mode in the years since.

News: The move will enable Whirlpool to further improve performance and optimise supply chain.

Starred articles are new additions or updated content, posted between Thursday and Sunday. The weekly digest has 6 sections: (1) Featured Articles and Case Studies, (2) Featured Resources and Technical Contributions, (3) From our Sponsors, (4) News, Events, Books, Training, Forum Questions, (5) Picture of the Week, and (6) Syndicated Content. The full version is always published Monday. The picture of the week is from the contribution marked with a +, where you will find the details.  Announcements Don't get left behind!

IBM is today announcing new partnerships with four companies that will be working with its Watson "cognitive computing" technology for analyzing data. Sports apparel company Under Armour will put its Connected Fitness user data and data from its other sources through Watson to provide "timely, evidence-based coaching about health and fitness-related issues, including outcomes achieved based on others 'like you.'"