Big Data News – 28 Apr 2016

Today's Infographic Link: Where did the Money Go?

Top Stories
There has never been a better time for innovation in financial services, yet most financial institutions struggle to build lasting, profitable relationships with their customer base. New ideas abound, yet actually implementing those new ideas, let alone realizing their financial promise, remains elusive. It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it was the season of Light, it was the season of Darkness, it was the spring of hope, it was the winter of despair, we had everything before us, we had nothing before us, we were all going direct to Heaven, we were all going direct the other way. – Charles Dickens.

Olivier Bonsignour on what "X-Raying" software means, how it can help prevent software disasters and why CIOs should care. By Olivier Bonsignour

ServiceNow is one of the rare cloud companies to have reached the $1 billion annual revenue milestone — a success story built on delivering next-generation IT service management capabilities. Now, the Santa Clara, Calif., company is aiming far beyond IT shops and wants to change how virtually every other corporate department does business.

Rachel Reese on the challenges and benefits of using microservices at Jet, in particular how F# made it easier to refactor and maintain hundreds of microservices. The hard bit is the infrastructure. By Rachel Reese

Microsoft has partnered with a San Francisco-based company to encode information on synthetic DNA to test its potential as a new medium for data storage. Twist Bioscience will provide Microsoft with 10 million DNA strands for the purpose of encoding digital data. In other words, Microsoft is trying to figure out how the same molecules that make up humans' genetic code can be used to encode digital information.

Operational Data Stream and Batch Processing at Netflix with Mantis By Dylan Raithel

Oh, the poor, maligned pie chart. The chart type that gets pushed around and bullied on the data-viz playground more than any other. Randal Olson of /r/dataisbeautiful ran a Twitter poll asking, "Do you think pie charts should be banned from #dataviz?" Scientific or not, nearly two in five responded affirmatively. That's amazing if you stop and think about it. Almost 40 percent of respondents, likely mostly data-viz enthusiasts who follow Olson, think that pie charts should never, ever, ever be used.

Today OpenAI, a non-profit artificial intelligence research company whose backers include Infosys and Amazon Web Services, announced a beta for OpenAI Gym. Gym is a Python-based toolkit for developing and comparing reinforcement learning (RL) algorithms, offered under the MIT license.
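
The interaction pattern that an RL toolkit like Gym standardizes can be illustrated without the library itself. The sketch below is a toy under stated assumptions: `CoinFlipEnv`, `run_episode`, and `random_policy` are hypothetical names invented for this example, and only the general reset/step shape (observation, reward, done flag) is modeled loosely on the usual RL environment convention. It is not Gym's own code.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment: guess the outcome of a coin flip."""

    def __init__(self, episode_length=10, seed=0):
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.steps_taken = 0

    def reset(self):
        # Start a new episode and return the initial observation.
        self.steps_taken = 0
        return 0

    def step(self, action):
        # Flip the coin, reward a correct guess, and report whether the episode ended.
        outcome = self.rng.randint(0, 1)
        reward = 1.0 if action == outcome else 0.0
        self.steps_taken += 1
        done = self.steps_taken >= self.episode_length
        return outcome, reward, done


def run_episode(env, policy):
    """Run one episode and return the total reward collected by the policy."""
    observation = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = policy(observation)
        observation, reward, done = env.step(action)
        total_reward += reward
    return total_reward


def random_policy(observation, _rng=random.Random(1)):
    # Baseline agent: ignores the observation and guesses at random.
    return _rng.randint(0, 1)


if __name__ == "__main__":
    print(run_episode(CoinFlipEnv(), random_policy))
```

Comparing algorithms then amounts to swapping the policy function while keeping the environment fixed, which is the kind of comparison a shared toolkit is meant to make easy.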

Guest blog post by Laetitia Van Cauwenberge. Very interesting compilation published here, with a strong machine learning flavor (maybe machine learning book authors, usually academics, are more prone to making their books available for free). Many are O'Reilly books freely available. Here we display those most relevant to data science. I haven't checked all the sources, but they seem legit.

Guest blog post by SupStat. Contributed by the neuroscientist Sricharan Maddineni, who has a strong passion and talent for data science. He took the NYC Data Science Academy 12-week boot camp program from Jan 11th to Apr 1st, 2016. The post is based on his second project, which was posted on February 16th (due in the 4th week of the program). He acquired publicly available transportation data and drew on social media. Working through the data, he visualized the economic and business insights. Why Are Airports Important?

Qubole, the big data-as-a-service company, and Looker, the company that is powering data-driven businesses, today announced that they are integrating Looker's business analytics with Qubole's cloud-based big data platform, giving line of business users across organizations access to powerful, yet easy-to-use big data analytics.

Mozilla has released Firefox 46, which improves the security of the JavaScript JIT compiler and delays add-on signing.

The embrace of stream processing and real-time data access is driving enterprise adoption of the Apache Kafka distributed messaging system, according to an industry survey released at a Kafka summit this week. Confluent Inc., the Palo Alto, Calif., company founded by the creators of Apache Kafka, reported that its industry survey revealed that 88 percent of companies polled said they expect to adopt the tool for their data and application infrastructures by 2017. Moreover, nearly one-third of respondents said they work for large companies with annual sales of more than $1 billion.

Does enterprise IT in the government sector need to be so complex? Maybe not. Find out how advancements in cloud computing, open data initiatives and valuable lessons learned from smartphones and the app store model can transform analytics workloads in enterprise IT for government agencies.

Online harassment is a serious issue, one that the engineers and designers behind the keyboard don't always think about when building software. Machine learning is becoming more prevalent, but as more technology companies take advantage of it, they risk alienating their users even more by presenting content that isn't actually relevant.

Concourse, an open source CI pipeline tool that uses YAML files to configure pipelines and offers configuration-free setup, recently bumped its major release and is currently available in version 1.1.0. The major conceptual benefits of Concourse are explicit, first-class support for pipelines, isolated builds running in containers, avoidance of snowflake build servers, and easy access to build logs. By Grischa Ekart

If IT can learn to use and value analytics tools, it can better perform a critical bridge role: helping users learn to properly use the tools.

Six key issues that CIOs and CMOs are collaborating on to preserve positive online customer experiences and protect their businesses.

Data Warehouse Environments (DWEs) need to undergo constant change to keep up with new business requirements and technologies. Server upgrades and new platforms cobbled into existing environments indicate a clear trend toward continuous modernization. But this trend poses a unique challenge to the data management professionals responsible for ensuring a smooth evolution. To help them along their path, we looked over more than 400 survey responses from like-minded professionals and found some interesting patterns.

Graph database technology powered by open source initiatives is helping fraud detection units catch intruders in the act of breaching data security. Tune in for an enlightening discussion of how modern approaches to analytics are bringing descriptive and predictive analytics together to help stop fraud as it occurs.

The U.S. House of Representatives, in a rare unanimous vote, has approved a bill to strengthen privacy protections for email and other data stored in the cloud. The Email Privacy Act would require law enforcement agencies to get court-ordered warrants to search email and other data stored with third parties for longer than six months. The House on Wednesday voted 419-0 to pass the legislation and send it to the Senate. The bill, with 314 cosponsors in the House, would update a 30-year-old law called the Electronic Communications Privacy Act (ECPA). Some privacy advocates and tech companies have been pushing Congress to update ECPA since 2011.

Previous relevant posts: "Single regression with R to identify relationship between WTI and stock price of Exxon"; "Getting stock volatility in R & Getting Histogram of returns". What is CAPM? According to Investopedia, the capital asset pricing model (CAPM) is a model that describes the relationship between risk and expected return and that is used in pricing of risky securities. (Generally, you can understand that there is a linear relationship between risk (stock volatility) and the stock return.)
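
The CAPM relationship described above is usually written as E[R_i] = R_f + beta_i * (E[R_m] - R_f), where beta_i = Cov(R_i, R_m) / Var(R_m); the risk measure in the formula is beta, the stock's sensitivity to the market, rather than its total volatility. A minimal sketch in Python, assuming made-up return series (the numbers and the function names `beta` and `capm_expected_return` are illustrative and do not come from the referenced posts):

```python
from statistics import mean


def beta(asset_returns, market_returns):
    """Estimate beta as Cov(asset, market) / Var(market) from paired samples."""
    if len(asset_returns) != len(market_returns) or len(asset_returns) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_a = mean(asset_returns)
    mean_m = mean(market_returns)
    # The 1/(n-1) factors of covariance and variance cancel in the ratio.
    cov = sum((a - mean_a) * (m - mean_m)
              for a, m in zip(asset_returns, market_returns))
    var = sum((m - mean_m) ** 2 for m in market_returns)
    return cov / var


def capm_expected_return(risk_free, beta_value, market_expected):
    """CAPM: E[R_i] = R_f + beta * (E[R_m] - R_f)."""
    return risk_free + beta_value * (market_expected - risk_free)


if __name__ == "__main__":
    # Hypothetical periodic returns; the stock moves exactly twice as much as the market.
    market = [0.01, -0.02, 0.03, 0.00, 0.02]
    stock = [0.02, -0.04, 0.06, 0.00, 0.04]
    b = beta(stock, market)
    print(round(b, 6))
    print(round(capm_expected_return(0.02, b, 0.08), 6))
```

In practice the same beta would typically be obtained as the slope of a linear regression of the stock's returns on the market's returns.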

SnappyData today announced it has received $3.65 million in Series A funding to build a business around its real-time analytics platform that combines Apache Spark, Pivotal's GemFire data grid, and an innovative data approximation method that makes big data more manageable. SnappyData was spun out of Pivotal earlier this year to pursue development of its technology platform, which essentially combines the data manipulation capabilities of Apache Spark and Spark Streaming, the in-memory persistence of GemFire's data grid, and an approximate query processing (AQP) method that makes huge gobs of data manageable and queryable.
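
Approximate query processing in general trades exactness for speed, for example by answering an aggregate from a sample instead of a full scan. The sketch below illustrates only that general idea, using uniform random sampling and a rough normal-approximation error bound. It is not SnappyData's method, and `approximate_mean` and the synthetic data are hypothetical.

```python
import math
import random
from statistics import mean, stdev


def approximate_mean(values, sample_size, seed=0):
    """Estimate the mean of `values` from a uniform random sample.

    Returns (estimate, margin), where margin is an approximate 95% error
    bound based on the normal approximation.
    """
    if not 2 <= sample_size <= len(values):
        raise ValueError("sample_size must be between 2 and len(values)")
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)
    estimate = mean(sample)
    margin = 1.96 * stdev(sample) / math.sqrt(sample_size)
    return estimate, margin


if __name__ == "__main__":
    # Synthetic data standing in for a large table column.
    data_rng = random.Random(42)
    data = [data_rng.uniform(0, 100) for _ in range(100_000)]
    estimate, margin = approximate_mean(data, sample_size=1_000)
    print(f"approx mean = {estimate:.2f} +/- {margin:.2f}")
    print(f"exact mean  = {mean(data):.2f}")
```

The sample here is 1 percent of the rows, so the aggregate touches far less data at the cost of an answer that carries an error bar.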

One of the findings in the 2016 DBIR is that old vulnerabilities continue to be leveraged.

Intel is making it easier to create smarter and more functional gadgets, robots, drones, and wearables using its Edison developer board. The company has made a series of improvements to its latest IoT Developer Kit 3.0, which is used to program functionality into devices. The developer kit adds support for a wider range of sensors and adds connectivity to IBM's Bluemix cloud service. The kit also has improved programming tools and integration with Google's Brillo and Android. Edison has been used as a developer board to prototype and test devices. The new features provide a springboard to make Edison a viable platform for end products. The board could be used in products such as smart helmets, but it is too big for small electronics and some wearables.

Redis Labs today announced the general availability of Redis on Flash with standard x86 servers, including standard SATA-based SSD instances available on public clouds and more advanced NVMe-based SSDs like the Samsung PM1725.

Migration will continue to draw increased development activity aimed at making cloud adoption less burdensome for the enterprise.

Sales Cloud targets sales reps focused on action, not details. While the app has dominated CRM, there are some areas of needed improvement, such as mobile.

Today at the Samsung Developers Conference, Codenvy announced the first public release of the Samsung Artik IDE which allows building applications for the Samsung Artik IoT devices.

When people think of "data science" they probably think of algorithms that scan large datasets to predict a customer's next move or interpret unstructured text. But what about models that utilize small, time-stamped datasets to forecast dry metrics such as demand and sales? Yes, I'm talking about good old time series analysis, an ancient discipline… The post Sorry ARIMA, but I'm Going Bayesian appeared first on Predictive Analytics Times.

Samsung and Microsoft have crossed paths in the smartphone and tablet markets, and will now do battle in the cloud. Samsung on Wednesday announced the Artik Cloud service for businesses, which the company hopes will give it a strong position in the emerging Internet of Things market. In IoT, it will take on cloud services like Microsoft's Azure and IBM's Bluemix. Simply put, the Artik Cloud provides the tools needed for companies to securely collect, store and analyze telemetry data collected from a wide range of sensors.

Metadata and governance might not have a long history of setting hearts ablaze, but those who are recognizing their importance to self-service are looking to metadata to help organizations fully leverage a wide range of assets across the business. Discover why metadata and governance are taking a central role in the information lifecycle.

By Andrie de Vries. A few weeks ago I wrote about the growth of CRAN packages, where I demonstrated how to scrape CRAN archives to get an estimate of the number of packages over time. In this post I…

The team behind Pivotal's GemFire in-memory transactional data store recently unveiled a new database solution powered by GemFire and Apache Spark, called SnappyData. SnappyData is another example of the way Spark has recently been employed as a component in a larger database solution, with or without other components from Apache Hadoop.
