Big Data News – 08 Jun 2016

Today's Infographic Link: Rockets of the World

Featured Article
The Apache Software Foundation recently released its 28-page annual report for its 2015-2016 year, but here's the TL;DR in one word: amazing. What started as a simple HTTP server supported by a handful of developers in 1995 has become an army of 3,425 ASF committers and 5,922 Apache code contributors building 291 top-level projects.

Top Stories
Dataiku has announced the release of DSS 2.3, improving the user experience and efficiency of their data analysis software suite — with a strong focus on enhancing and facilitating data preparation features for both expert and beginner analysts.

Salesforce Lightning is a framework designed to enable end users to both add functions and customize their Salesforce applications.

At TrailheaDX, Salesforce's first developer conference, the company is encouraging people without programming experience to create apps using Lightning.

In this special technology white paper by Bit Stew Systems, "Why Machine Intelligence is the Key to Solving the Data Integration Problem for the IIoT," you'll uncover why machine intelligence is the key to solving the data integration challenge for the Industrial Internet of Things.

In tandem with its RecoverX™ launch, Datos IO, a provider of next-generation data protection solutions, announced a new strategic partnership with DataStax, the company that delivers enterprise Apache Cassandra™.

At Spark Summit in San Francisco, Calif., this week, Hadoop distribution vendor MapR Technologies announced a new enterprise-grade Apache Spark distribution. The new distribution, available now in both MapR Converged Community Edition and MapR Converged Enterprise Edition, includes the complete Spark stack, patented features from MapR and key open source projects that complement Spark.

Apple Siri "helped save" a toddler in Australia, who'd stopped breathing. Her mom yelled at her iPhone to call an ambulance. Stacey Gleeson's daughter Giana Gleeson is well again. Apple PR is celebrating a good-news story for a change. In IT Blogwatch, bloggers unpick what actually happened. Your humble blogwatcher curated these bloggy bits for your entertainment.

Analysis: Containers, IoT, servers, and big data were all put in the spotlight for the Discover conference keynote.

If there's an overriding trend in the world of enterprise software lately, it's democratization, as tools previously reserved for experts are put in the hands of average users. On Tuesday, Salesforce.com climbed on board with new software, training and support services that aim to help more users — not just professional developers — build applications for the Salesforce platform. There aren't enough trained developers to create apps for the business world, the company says, so it wants to help users in all parts of the organization make their own.

Here's this week's news on Big Data. Don't forget to subscribe if you find this useful! Interesting Data Science Articles and News: Computer science class fails to notice their TA was actually an AI chatbot — "Jill Watson," a chatbot powered by IBM Watson, was added to the list of teaching assistants for Ashok Goel's online course. The chatbot was so good at answering questions that students did not notice their TA was not human.

LinkedIn is releasing to the open source community its machine-learning tool used to train the ranking algorithm for its newsfeed, advertising and customer recommendations. The world's largest professional network (NYSE: LNKD) said Tuesday (June 7) its Apache Spark-based machine learning library dubbed Photon ML would give data scientists a more accurate picture of underlying datasets as they train algorithms to parse the backgrounds of individual users.
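
Photon ML's own API isn't shown in the announcement, so purely as a rough illustration of the kind of workflow it targets (training a model on Spark to score and rank items for a feed), here is a minimal sketch written against the stock spark.ml API rather than Photon ML itself; the data path and column names are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("feed-ranking-sketch").getOrCreate()

# Hypothetical training data: one row per (member, item) impression with a click label.
impressions = spark.read.parquet("hdfs:///data/feed_impressions.parquet")

assembler = VectorAssembler(
    inputCols=["member_affinity", "item_age_hours", "connection_strength"],
    outputCol="features")
train = assembler.transform(impressions)

# A plain logistic regression stands in here for Photon ML's ranking models.
model = LogisticRegression(featuresCol="features", labelCol="clicked").fit(train)

# Score items: a higher click probability ranks the item higher in the feed.
model.transform(train).select("member_id", "item_id", "probability").show(5)
```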

It's been a tumultuous past year for Hewlett Packard Enterprise but this week the company is unveiling a series of new offerings intended to solidify its standing in the private and hybrid cloud computing market. Key themes for HPE's new cloud products are bundling software to make it more easily consumable, packaging that software with optional hardware to create an all-in-one cloud and being able to manage not only new, cloud-native workloads and technologies — such as application containers — but legacy and traditional workloads too.

Take a closer look at the explosion of zero-day threats and how deep learning can help organizations better protect their valuable cyber assets.

Five questions CFOs and CIOs need to ask themselves to better prepare their infrastructure to support an IoT-centric model.

Download our "Clean Your Data Faster for Tableau" whitepaper to see how Alteryx enables you to spend less time preparing data and more time analyzing and visualizing it. Alteryx easily solves three commonly encountered problems when preparing your data for Tableau, including data cleansing: removing and correcting bad data.
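
As a code-level illustration only (this is not Alteryx's drag-and-drop workflow), here is a minimal pandas sketch of the same "removing and correcting bad data" step; the file and column names are hypothetical.

```python
import pandas as pd

# Hypothetical sales extract destined for a Tableau dashboard.
df = pd.read_csv("sales_extract.csv")

# Remove bad data: drop exact duplicates and rows missing a key field.
df = df.drop_duplicates().dropna(subset=["order_id"])

# Correct bad data: coerce types and fix obviously invalid values.
df["order_date"] = pd.to_datetime(df["order_date"], errors="coerce")
df["amount"] = pd.to_numeric(df["amount"], errors="coerce").clip(lower=0)

# Write a clean extract that Tableau can consume directly.
df.to_csv("sales_extract_clean.csv", index=False)
```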

IBM (NYSE:IBM) today announced the first cloud-based development environment for near real-time, high performance analytics, giving data scientists the ability to access and ingest data and deliver insight-driven models to developers.

When it comes to cloud computing, the ability to turn massive amounts of compute cores on and off on demand sounds attractive to IT staff, who need to manage peaks and valleys in user activity. With cloud bursting, the majority of the data can stay on premises while tapping into compute from public cloud providers, reducing risk and minimizing the need to move large files. In his session at 18th Cloud Expo, Scott Jeschonek, Director of Product Management at Avere Systems, will discuss the IT and business benefits that cloud bursting provides, including increased compute capacity, lower IT investment, financial agility, and, ultimately, faster time-to-market.

Whoever crunches more data faster wins. It's this drive that cuts through and clarifies the essence of the evolutionary spirit in the computer industry: the desire to get to real time with bigger and bigger chunks of data. The locomotive: HPC technologies adapted to enterprise mission-critical data analytics. With a memory capacity of up to 24TB in an eight-socket system, the largest in the industry according to Intel, the new Xeon processor E7-8800/4800 v4 allows massive datasets to be stored in memory in support of real-time data analytics applications.

The cloud is not a requisite for the digital transformation, but it sure makes it a whole lot easier.

Self-service, business-led BI is big and getting bigger. Gartner predicts the market will be worth billions and that the number of vendors offering these products will keep growing. Gartner's report correctly points out that we are in a state of more data from more sources, where enterprises want more users enabled to explore, search and discover insights. At the same time, enterprises have more data that needs to be converted into business insights and strategic action.

The Affordable Care Act expanded eligibility for the Medicaid program with the hope of enrolling millions of low-income U.S. residents who could not access insurance. California was one of the states that chose to expand. Researchers at the University of California-San Francisco conducted a study that investigated the trends in the association between insurance coverage and usage of emergency departments among adults 18-64 from 2005 through 2010. They found that ED utilization in California increased at a faster rate for adult Medicaid enrollees than for the privately insured or the uninsured.

The system combines satellite and air traffic data to ensure that a drone is flying in the clear.

Kcore Analytics has developed an end-to-end framework for finding influencers in complex social networks like Twitter. The big innovation in their approach is extracting summarized data solely from influencers to forecast global trends, from markets to consumer products to social movements and revolutions. Here is their list of the top 10 influencers in data science.
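
Kcore Analytics' exact method isn't published in this item, but the name points to k-core decomposition, a graph measure commonly used to locate influencers. Below is a minimal sketch with networkx, assuming you already have an edge list of Twitter follower links (the file name is hypothetical, and the graph is assumed to have no self-loops).

```python
import networkx as nx

# Hypothetical edge list: one "follower followee" pair per line.
G = nx.read_edgelist("twitter_follows.txt")

# k-core index of every node: how deep it sits in the densest part of the graph.
core = nx.core_number(G)

# Nodes in the innermost shell (maximum k) are the candidate influencers.
k_max = max(core.values())
influencers = [node for node, k in core.items() if k == k_max]
print("{} users in the {}-core".format(len(influencers), k_max))
```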

As Spark Summit continues in San Francisco so too do the connector announcements. Pretty soon connecting with Spark will be as common as connecting to Excel.

Data storytelling is just the latest phase in the long history of passing on knowledge to people, expert David A. Teich explains. It's just the tools that have changed.

At Hadoop Summit San Jose we are excited to be joined by industry experts. Here are just a few of the business-focused sessions, but you need to register to attend Hadoop Summit: "What is Data? And What Are You Doing?"

June has got off to a great start – and not only because it seems like summer has arrived in London! Yesterday, our team gathered in our International HQ a stone's throw from Liverpool Street Station for a session with Mike Schiebel, our cyber security strategist, who is visiting from the west coast. We have…

Policies can't be treated as "just another piece of paper": they're really the foundation of your governance, risk and compliance (GRC) program. They define the scope in which your GRC program operates and embody the G in GRC! They can be preventive, like a Code of Conduct to try and reduce business risks; detective,…

We are proud to announce the technical preview of the Spark-HBase Connector, developed by Hortonworks working with Bloomberg. The Spark-HBase connector leverages the Data Source API (SPARK-3247) introduced in Spark 1.2.0. It bridges the gap between the simple HBase key-value store and complex relational SQL queries, and enables users to perform complex data analytics on top of…
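
For a sense of what a DataFrame-based HBase connector looks like in practice, here is a rough PySpark sketch. The catalog JSON, data source name, and option key follow the conventions this connector documented at the time as best as remembered, so treat them as assumptions rather than a verified API; the table and column names are hypothetical.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("hbase-connector-sketch").getOrCreate()

# Catalog mapping an HBase table ("Contacts") to DataFrame columns (assumed schema).
catalog = """{
  "table":   {"namespace": "default", "name": "Contacts"},
  "rowkey":  "key",
  "columns": {
    "id":    {"cf": "rowkey",   "col": "key",   "type": "string"},
    "name":  {"cf": "Personal", "col": "name",  "type": "string"},
    "phone": {"cf": "Personal", "col": "phone", "type": "string"}
  }
}"""

# The data source name below is an assumption based on the connector's package name.
df = (spark.read
      .options(catalog=catalog)
      .format("org.apache.spark.sql.execution.datasources.hbase")
      .load())

# Once loaded, it is an ordinary DataFrame, so Spark SQL works on top of HBase data.
df.filter(df.phone.startswith("+44")).select("id", "name").show()
```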

Hewlett Packard Enterprise on Tuesday stepped up its efforts to develop a brand-new computer architecture by inviting open-source developers to collaborate on the futuristic device it calls "The Machine." Originally announced in 2014, The Machine promises a number of radical innovations, including a core design focus on memory rather than processors. It will also use light instead of electricity to connect memory and processing power efficiently, HPE says. A finished product won't be ready for years still, but HPE wants to get open-source developers involved early in making software for it. Toward that end, it has released four developer tools.

Many predictive and machine learning models have structural or tuning parameters that cannot be directly estimated from the data. For example,…
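
The excerpt is cut off, but a classic example of such a parameter is the regularization strength of a penalized model, which is typically chosen by resampling rather than estimated directly from the data. Here is a minimal, illustrative sketch (not taken from the original article) using cross-validated grid search in scikit-learn.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = load_breast_cancer(return_X_y=True)

# C (inverse regularization strength) cannot be estimated from the data directly,
# so try several candidate values and keep the best cross-validated score.
grid = GridSearchCV(
    LogisticRegression(solver="liblinear"),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    cv=5)
grid.fit(X, y)

print(grid.best_params_, grid.best_score_)
```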

It has been a couple of weeks since I joined Hortonworks through its Intern Program and wow, what a time it has been! For all the students out there who are interested in beginning a career in technology and data, I will be periodically blogging about my experiences as a Hortonworks intern. First up, I…

RecoverX, now generally available, can be configured to enable IT organizations to capture data in a variety of NoSQL databases every few minutes.

IBM today launched what it's calling the first enterprise application for data science collaboration. Called the Data Science Experience, the free, cloud-based offering is aimed at enabling data scientists to perform tasks like prepping data and building machine learning models in an open and shared environment. IBM likens the Data Science Experience, which was developed on Apache Spark, to an integrated development environment (IDE) where data scientists have a place to do their work. Data scientists can work with the software, which is largely based on the open source Jupyter notebook, using a variety of languages, including R, Python, and Scala.
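
For context on what a Jupyter-style workspace on Spark means in practice, here is a minimal notebook-style PySpark cell of the kind such an environment runs. This is generic Spark code, not a Data Science Experience-specific API, and the data path and column names are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.clustering import KMeans

spark = SparkSession.builder.appName("dsx-style-notebook").getOrCreate()

# Ingest data (hypothetical path), explore it, and build a quick model in one place.
customers = spark.read.csv("customers.csv", header=True, inferSchema=True)
customers.describe().show()

features = VectorAssembler(inputCols=["recency", "frequency", "monetary"],
                           outputCol="features").transform(customers)

# Cluster customers into four segments and look at the segment sizes.
model = KMeans(k=4, featuresCol="features").fit(features)
model.transform(features).groupBy("prediction").count().show()
```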

SYS-CON Events announced today that Violin Memory®, Inc., will exhibit at SYS-CON's 18th International Cloud Expo®, which will take place on June 7-9, 2016, at the Javits Center in New York City, NY. Violin Memory, the industry pioneer in All Flash Arrays, is the agile innovator, transforming the speed of business with enterprise-grade data services software on its leadership Flash Storage Platform™. Violin Concerto™ OS 7 delivers complete data protection and data reduction services and consistent high performance in a storage operating system fully integrated with Violin's patented Flash Fabric Architecture™ for cloud, enterprise and virtualized business and mission-critical storage applications.
