Big Data News – 19 Jul 2016

Today's Infographic Link: The Chemistry of Planetary Atmospheres

Featured Article
Companies competing on data need the right skillsets and mindsets in place to succeed over the long term. While more individuals are analyzing data as part of their jobs, their ability to do so varies greatly, even among peers. We've identified 10 key traits of an analytical mind, and explain what to look for in your next hire and which skills to cultivate in your own career.

Top Stories
In this contributed article, Steve Kearns, Senior Director of Product Management at Elastic, looks at the idea of using relevance in graph exploration to open the door to asking more complex and valuable questions.

Sisense, which is disrupting the BI market by simplifying business analytics for complex data, announced significant advancements to its platform that simplify the consumption of BI insights using IoT devices and artificial intelligence (AI) to create innately human sensory experiences.

Retired Air Force Col. Lee Ellis says the one essential leadership characteristic that's most lacking in corporate America today is courage.

Data Science is everywhere. The explosive growth of the digital world requires professionals with not just strong skills, but also adaptability and a passion for staying at the forefront of technology. Data can be used by many people simultaneously, does not spoil, and often reveals new value and uses only after the numbers have been crunched.

It is important to know what your employees are doing. That much seems obvious, right? But in today's world, with its increase in telecommuting and mobile working, keeping up with everybody is rarely as easy as simply walking over to someone's desk and asking for a quick update on project so-and-so. Note, however, that we did not say that keeping up with everyone is impossible. It's complicated, sure, but here are a few tips that can help make it work for you.

Walmart operates in India through 21 cash-and-carry wholesale stores under the brand name Best Price Modern Wholesale.

Li-Fi claims to be 100 times faster than standard Wi-Fi. But what exactly is it and how does it work?

For a long time, I was not convinced about the power of the public cloud. Naturally! I, like many others, thought that it was a sideshow, one that would mostly cater to startups and some medium-sized companies. However, I discovered I could not have been further from the truth. The cloud, from the very beginning, has been a confusing term, at least to me, because it came from two different sources: Salesforce and Amazon Web Services. The original Salesforce.com CRM portal was what became the definition of Software as a Service (#SaaS).

by Lane Leskela, Global Business Development Principal, Governance, Risk and Compliance. Have you had the time and the good fortune to run across The Dishonesty Project (an endeavor of behavioral scientist Dan Ariely at the Department of Psychology at Duke University) that's studying the conditions and characteristics of lying, cheating and stealing? If not, I…

Splice Machine, the relational SQL database system that uses Hadoop and Spark to provide high-speed results, is now available in an open source edition. Version 2.0 of Splice Machine added Spark to speed up OLAP-style workloads while still processing conventional OLTP workloads with HBase. The open source version, distributed under the Apache 2.0 license, supplies both engines and most of Splice Machine's other features, including Apache Kafka streaming support. However, it omits a few enterprise-level options like encryption, Kerberos support, column-level access control, and backup/restore functionality.

C-level briefing: HeliOffshore CEO Gretchen Haskins explains how her company is using Jive and Tonic technology to boost safety for offshore aviation.

Genpact will serve as UpGrad's "knowledge partner," and the two companies will work closely together to create case studies, guest lectures, and other curriculum content.

Dave's Hadoop Summit San Jose 2016 Retrospective – Part 2: In this second part, we discuss the sessions that Dave attended at the San Jose Hadoop Summit and we go in depth on some related topics. Since we ran over an hour with the main topic, and we did not want to make this a three-parter, we decided to forgo the questions from the audience just this one time…

IBM's revenue continued to decline in the second quarter, but growth in some of its strategic initiatives like cloud computing and data analytics suggests that the company may be on track in its transition plans. The Armonk, New York, company said Monday that revenue from its new "strategic imperatives" like cloud, analytics and security increased by 12 percent year-on-year to $8.3 billion. That increase was, however, lower than the growth the company had reported in these businesses in the first quarter.

To improve their growth rates, some up-and-coming companies are turning to a pursuit known as growth hacking. Growth hacking brings together the ideas of hacking big data and driving business growth. By one common definition, growth hacking is a process that drives rapid experimentation across marketing channels and product development to identify the most effective, efficient ways to grow a business.[1] Growth hacking often leverages customer data in the experimentation process, in the form of A/B testing.
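To make the A/B-testing step concrete, here is a minimal sketch, assuming made-up visitor and conversion counts, of how two experiment variants might be compared with a standard two-proportion z-test; none of the numbers or names below come from the article itself.

```python
# A minimal sketch of the A/B-testing step in a growth-hacking experiment.
# The visitor and conversion counts below are invented illustration data.
from statistics import NormalDist

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test: is variant B's conversion rate different from A's?"""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = (p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided p-value
    return p_a, p_b, z, p_value

# Hypothetical experiment: 10,000 visitors per variant.
p_a, p_b, z, p = two_proportion_z(conv_a=420, n_a=10_000, conv_b=480, n_b=10_000)
print(f"A: {p_a:.2%}  B: {p_b:.2%}  z={z:.2f}  p={p:.3f}")
```

In practice a growth team would also fix the sample size in advance and guard against peeking, but the core idea of blending customer data with rapid experimentation is the same.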

While consulting on machine-learning and data-mining initiatives with companies around Australia and New Zealand, I commonly come across an objection to the proposals we are putting forward. This objection is that the model we are seeking will not be 100% accurate, that is to say, it will not be the perfect model. This objection is entirely valid and at the same time entirely irrelevant. It is true to say that even the data a company collects are never 'perfect'.

By now most people have either created their configuration management solution or are just embarking on this journey. In his session at @DevOpsSummit at 19th Cloud Expo, Marco Ceppi, a DevOps Engineer working at Canonical, will discuss how to take configuration management to the next level with modelling and orchestration. He will also discuss how and why people are moving from a machine-centric view to a service/application-oriented view of deployments, and how you can leverage the knowledge and tools used at the machine level to expand to the scale-out, service-oriented architecture.

"Julia is a great tool." That's what New York University professor of economics and Nobel laureate Thomas J. Sargent told 250 engineers, computer scientists, programmers, and data scientists at the third annual JuliaCon held at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL). If you have not yet heard of Julia, it is not a "who," but a "what." Developed at CSAIL, the MIT Department of Mathematics, and throughout the Julia community, it is a fast-maturing programming language developed to be simple to learn, highly dynamic, operational at the speed of C, and ranging in use from general programming to highly quantitative uses such as scientific computing, machine learning, data mining, large-scale linear algebra, and distributed and parallel computing.

Cloud technologies are constantly evolving, whether they're deployed in a public, private, or hybrid environment. While the innovation is exciting, the end mission of delivering business value and rapidly producing incremental product features is paramount. In his session at @DevOpsSummit at 19th Cloud Expo, Kiran Chitturi, CTO Architect at Sungard AS, will discuss DevOps culture, its evolution of frameworks and technologies, and how it is achieving maturity. He will also cover various styles and stacks in DevOps with examples and live demos on AWS, using tools and techniques for continuous integration, configuration management and delivery orchestration. Come prepared to have some fun and walk away with ideas to leverage and/or implement in your organization.

Object-based storage systems let IT build private clouds that scale using the same type of storage used by public cloud service providers.

Things are better between the two nations — but nowhere near perfect

It's summertime, which means you may not be keeping up with all the news in the busy cloud computing industry. Last week there were a handful of announcements that flew somewhat under the radar, but have important implications for this market. Congress considers moving federal IT to the cloud: there's a movement afoot in Congress to encourage more government workloads to migrate to cloud computing platforms, according to GovInfoSecurity.com. A bill named the Move IT Act aims to shore up cybersecurity defenses and upgrade legacy infrastructure systems while making it easier for federal agencies to use cloud computing services.

As the final moments of Rutger Hauer's tears in the rain monologue come to a close in Blade Runner, Netflix (or your streaming service of preference) has lined up some recommendations for your next viewing choice. From 2001: A Space Odyssey to The Matrix, the site's algorithms find you similarly cerebral films that you may enjoy…or you may not. The stakes are low in this situation.

Time is ticking: Data Driven Business super early bird registration ends THIS Friday. Be sure to register before the clock runs out at midnight on Friday, July 22nd. Use code PATIMES16 for an extra 15% off the lowest pricing. DDB is composed of 5 co-located events at the New York Javits Center, October 23-27, 2016.

Perhaps the time has come to change the dialogue a bit and admit that security threats are multi-layered.

Distributed Denial-of-Service (DDoS) attacks have become the primary threat to the availability of networks and online services, and peak attack sizes have grown by a factor of more than 50 over the last 10 years. Today, botnets and easy-to-use tools for launching DDoS attacks have enabled a big increase in the number of attacks and the traffic they generate. As the attacks have evolved, so have the tools and services to stop them. The best current practice is to implement a strong hybrid defense that tightly integrates on-premises equipment for always-on protection with a cloud-based mitigation solution capable of handling the largest and most complex attacks.

Digital transformation is unlike technology developments of the past, primarily because it involves much more than technology.

Here's a little puzzle that might shed some light on some apparently confusing behaviour by missing values (NAs) in R: What is NA^0 in R? You can get the answer easily by typing at the R command…
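The answer is easy to check at the R prompt, as the author says. For readers without R handy, the same convention can be seen in Python (a sketch, not the puzzle's own code), where NaN plays the role of the missing value and, as in R, anything raised to the power zero is one:

```python
# Python analogue of the R puzzle: IEEE-style pow() conventions define
# x ** 0 as 1 for every x, even when x is "missing" (NaN), which is the
# same reasoning behind R returning 1 for NA^0.
nan = float("nan")

print(nan ** 0)   # 1.0  -- "anything to the power zero is one"
print(nan ** 1)   # nan  -- otherwise the missing value propagates
print(0 ** 0)     # 1    -- Python, like R, defines 0^0 as 1
```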

When you Google "Kalman Filter AND Machine Learning", very few interesting references pop up! Perhaps my search terms are not the best, perhaps fintech guys keep such algorithms close to their vests, perhaps there is not much work done in bringing these two incredibly powerful tools together… In any case, Part II of my new book, "Systems Analytics…
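For readers who have not met the algorithm, here is a minimal one-dimensional Kalman filter sketch in Python; it is illustrative only and makes no claim about how the book itself combines filtering with machine learning.

```python
# A minimal 1-D Kalman filter: estimate a roughly constant signal from
# noisy measurements. Parameters q and r are assumed illustration values.
import random

def kalman_1d(measurements, q=1e-3, r=0.5**2):
    """q: process-noise variance, r: measurement-noise variance."""
    x, p = 0.0, 1.0            # initial state estimate and its variance
    estimates = []
    for z in measurements:
        p = p + q              # predict: variance grows by process noise
        k = p / (p + r)        # Kalman gain: how much to trust the new reading
        x = x + k * (z - x)    # update: blend prediction with measurement
        p = (1 - k) * p
        estimates.append(x)
    return estimates

true_value = 5.0
noisy = [true_value + random.gauss(0, 0.5) for _ in range(50)]
print(kalman_1d(noisy)[-1])    # should settle near 5.0
```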

Capital markets are the face of the financial industry to the general public and generate a large share of GDP for the world economy. Despite all the negative press they have garnered since the financial crisis of 2008, capital markets perform an important social function in that they contribute heavily to economic growth and are…

A BriefingsDirect business innovation thought leadership discussion on how companies are exploiting advances in procurement and finance services to produce new types of productivity benefits. We'll now hear from a leading industry analyst on how more data, process integration, and analysis efficiencies of cloud computing are helping companies to better manage their finances in tighter collaboration with procurement and supply-chain networks. This business-process innovation exchange comes in conjunction with the Tradeshift Innovation Day held in New York on June 22, 2016.

The new cluster templates feature in Cloudera Manager 5.7 makes creating clusters faster and easier. Often, after an Apache Hadoop cluster has been configured correctly, its admin will want to replicate the configuration in one or more clusters, whether for promoting a dev or staging cluster to production or for setting up a new production cluster with the same configuration as an existing one. For Cloudera customers, until recently the process for replicating cluster configurations was manual and error-prone.
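As a rough illustration of the replication workflow the feature supports, the sketch below exports a cluster's configuration as a template and re-imports it through the Cloudera Manager REST API. The host, credentials, cluster name, and exact endpoint paths are assumptions and should be checked against the API documentation for your CM release.

```python
# A hedged sketch of replicating a cluster configuration via the Cloudera
# Manager REST API. Endpoint paths and API version below are assumptions.
import requests

CM = "http://cm-host:7180/api/v12"   # assumed API version for CM 5.7
AUTH = ("admin", "admin")            # placeholder credentials

# 1. Export the source cluster's configuration as a template (assumed endpoint).
template = requests.get(f"{CM}/clusters/SourceCluster/export", auth=AUTH).json()

# 2. After editing the template's host and variable sections for the new
#    environment, import it to instantiate a new cluster (assumed endpoint).
resp = requests.post(f"{CM}/cm/importClusterTemplate", json=template, auth=AUTH)
print(resp.status_code)
```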

It's all about data distribution. As experts in Massively Parallel Processing (MPP), here at VLDB Solutions we talk regularly about 'Primary Indexes' (PI) in Teradata and 'Distribution Keys' (DK) in Greenplum. They are integral to the architecture of each platform, and identifying and employing them correctly is 'key' to maximised performance on both MPP systems. But how do they work? Before we examine each in detail, it is first important to understand how data is stored and accessed on an MPP system, and how the distribution of data helps a system achieve true parallelism. Within an MPP system, data is partitioned across multiple servers (referred to as AMPs in Teradata and Segments in Greenplum).
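A toy sketch of the idea (illustrative Python, not Teradata or Greenplum internals): hash the distribution-key value and use the result to pick the server that owns the row, so different rows can be scanned and joined in parallel. Column names and the segment count are made up for illustration.

```python
# Toy MPP-style data distribution: each row is assigned to a "segment"
# (Greenplum) or "AMP" (Teradata) by hashing its distribution key.
import zlib

NUM_SEGMENTS = 8

def segment_for(distribution_key: str) -> int:
    """Deterministically map a distribution-key value to a segment."""
    return zlib.crc32(distribution_key.encode()) % NUM_SEGMENTS

rows = [
    {"customer_id": "C1001", "amount": 250.0},
    {"customer_id": "C1002", "amount": 75.5},
    {"customer_id": "C1003", "amount": 310.2},
]

for row in rows:
    print(row["customer_id"], "-> segment", segment_for(row["customer_id"]))
```

A high-cardinality, evenly spread key keeps the segments balanced; a skewed key piles rows onto a few servers and erodes the parallelism the architecture is built for.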

Following the success of our sold-out 2015 Roadshow, we are pleased to announce our worldwide Future of Data Roadshow 2016! The Roadshow brings the innovators driving the future of data to you and offers insightful content for both business and technical attendees. This is an invaluable opportunity to network with leaders who are transforming their business…

It has been another exciting week on Hortonworks Community Connection (HCC). We continue to see great activity and recommend the following assets from last week. Top articles from HCC include "Horses for Courses: Apache Spark Streaming and Apache NiFi" by vvaks, which compares Apache NiFi and Apache Spark Streaming for different streaming and IoT use cases, and "Data Analysis…

Please join me for my TDWI presentation, "Enabling Self-Service Analytics." When: July 20th, 2016, 3:30 pm – 4:15 pm. Where: TDWI Accelerate, Boston, The Westin Copley Place. The Holy Grail of business intelligence has long been self-service analytics, where business people can access and analyze information without getting stuck in a lengthy IT backlog.

DigitalOcean is making available block storage in the form of solid-state disks (SSDs) at a cost of ten cents per gigabyte per month.

The key ingredient for predictive analytics is the right dataset, but preparing and blending data from multiple sources and formats can be a time-consuming and frustrating process. Download our new cookbook, "7 Steps to Data Blending for Predictive Analytics," and get a step-by-step recipe for using Alteryx to: access, cleanse, and join disparate data sources without coding or outside intervention; reduce data preparation time, leaving more time for building predictive models; and build datasets for predictive modeling in Alteryx, R, SAS, or SPSS. Learn how self-service data analytics from Alteryx delivers deeper business insight in hours, not weeks. Download now.
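As a generic illustration of the blend-then-model workflow (a pandas sketch with hypothetical file and column names, not Alteryx itself):

```python
# A generic data-blending sketch: pull two disparate sources, cleanse the
# join key, and produce one modeling-ready dataset. Names are hypothetical.
import pandas as pd

crm = pd.read_csv("crm_accounts.csv")       # e.g. account_id, region, segment
web = pd.read_json("web_sessions.json")     # e.g. account_id, sessions, last_visit

# Cleanse: normalise the join key so the blend doesn't silently drop rows.
for df in (crm, web):
    df["account_id"] = df["account_id"].astype(str).str.strip().str.upper()

# Blend: left join keeps every CRM account, filling missing web activity with 0.
dataset = crm.merge(web, on="account_id", how="left").fillna({"sessions": 0})

dataset.to_csv("predictive_modeling_dataset.csv", index=False)
```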

New SQL-on-Hadoop Independent Performance Benchmark: LIVE webinar, Wednesday, July 20th, 12 PM ET / 9 AM PT. Spark? Presto? Hive? Impala? With the number of new and rapidly evolving technologies, how do you know which SQL engine to use in order to optimize queries and workloads in your Hadoop environment? A new independent benchmark of leading SQL-on-Hadoop engines tackles this question, measuring capabilities and performance workloads in order to compare which tool is the best to use for different needs. Register now.

"RackN is a software company and we take how a hybrid infrastructure scenario, which consists of clouds, virtualization, traditional data center technologies – how to make them all work together seamlessly from an operational perspective," stated Dan Choquette, Founder of RackN, in this SYS-CON.tv interview at @DevOpsSummit at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York City, NY.

Inquidia Consulting announced the release of a new software component that allows developers of big data architectures to easily integrate new data sources into Snowflake's cloud-native data warehouse and analytic engine.

Above the Trend Line: machine learning industry rumor central is a recurring feature of insideBIGDATA. In this column, we present a variety of short, time-critical news items such as people movements, funding news, financial results, industry alignments, rumors, and general scuttlebutt floating around the big data, data science, and machine learning industries, including behind-the-scenes anecdotes and curious buzz.

While many government agencies have embraced the idea of employing cloud computing as a tool for increasing the efficiency and flexibility of IT, many still struggle with large scale adoption. The challenge is mainly attributed to the federated structure of these agencies as well as the immaturity of brokerage and governance tools and models. Initiatives like FedRAMP are a great first step toward solving many of these challenges but there are a lot of unknowns that are yet to be tackled.

It wasn't long ago that the entertainment industry had virtually no information on the end consumers of its products. A studio would create a television show, which would then be transmitted over the airwaves or through a satellite or cable feed. Even the broadcasters had virtually no information on which consumers watched what content. Outside of high level ratings and survey data, the industry was in the dark ages when it came to customer analytics and insight and there was little opportunity for customer engagement of any kind.

The key to operating a thriving enterprise is continuously managing performance. Measuring and monitoring business performance relies on a process of defining clear objectives, planning and budgeting, and evaluating progress at established intervals to ensure the company is keeping pace with its outlined objectives.

RDBMS-on-Hadoop database Splice Machine onboards Apache Spark and goes open source. Is it trying to be all things to all people, or is it just combining a set of raw technologies and making them useful and readily available?

In this special guest feature, Rob Whiteley, VP of Marketing at Hedvig, discusses the challenges businesses face with IoT and data storage, such as figuring out where all of the IoT data will go and how it will be stored and protected.
