Big Data News – 06 Mar 2017

Today's Infographic Link: Codebases: Millions of Lines of Code

Featured Article
Big data is much maligned, misused and misunderstood, or at least this is what a raft of the industry’s best and brightest from the DataIQ100 index believe. There is hope, though. While Facebook’s position is clear, not everyone has the luxury of billions of users (or dollars) to interrogate. What are businesses that deal with the likes of you and me doing with the data we give them, and where is it all heading? Here is what the leaders of the industry are saying:

Top Stories
March 4, 2017 is Open Data Day. Open Data Day is an annual celebration across the globe. Over 300 groups around the world schedule activities to use open data for their communities. See if there is a gathering in your area. Also, the focus this year is on open research data, tracking public money flows, open data for the environment, and open data for human rights. Good luck!

Modernize your payment platform with IBM Financial Transaction Manager, the optimal payment platform for a bank's path to business growth in the digital age. Financial institutions can introduce new real-time schemes and payment types continually, support existing and emerging transaction processing, enhance cognitive fraud detection and payment insights, and foster emerging blockchain initiatives.

I don't play music. I can barely sing. (Don't accept a Rock Band party invitation from me unless you have no further need of your eardrums.) And I certainly can't read music. So I'm sure I'm missing a lot of in-jokes here, but it was still informative and funny. (Via Sploid.)   Have a great weekend! We'll be back on Monday with more blog goodness.

Last year, Drexel University and CIO.com teamed up to present the first Analytics 50 Awards program. We're proud of our partnership with our friends at Drexel's LeBow College of Business and proud of the 50 innovative winners from our debut program, which included such impressive organizations as AstraZeneca, Farmers Insurance, GE, Children's Hospital of Philadelphia, Major League Soccer and UPS, to name just a few. We may be a bit biased, but we think the Analytics 50 is a great example of what can happen when media and academia collaborate. Drexel University's Decision Sciences department and CIO.com both remain committed to recognizing excellence in analytics and its real-world applications. We love the Analytics 50, but the year-long coverage of analytics on CIO.com and the range of business analytics programs offered by Drexel show we share the passion exemplified by last year's winners.

The machines indeed may be able to take over, as machines are abetted by hackers.

While R's base graphics library is almost limitlessly flexible when it comes to creating static graphics and data visualizations, new Web-based technologies like d3 and WebGL open up new horizons in high-resolution, rescalable and interactive charts. Graphics built with these libraries can easily be embedded in a webpage, can be dynamically resized while maintaining readable fonts and clear lines, and can include interactive features like hover-over data tips or movable components. And thanks to htmlwidgets for R, you can easily create a variety of such charts using R data and functions, explore them in an interactive R session, and include them in Web-based applications for others to experience. This htmlwidgets showcase allows you to try out a few such charts, but to see the full diversity of charts available, check out the new htmlwidgets gallery. It's an index of packages on CRAN and on GitHub that create htmlwidgets visualizations.

While AI, IoT and anomaly detection are hot, mere capability has been championed so far. The data volumes involved require optimized approaches, though, and few have emerged. ExtraHop's new "Addy" product changes all that.

Post-cloud marks a decades-long transition away from centralized infrastructure to distributed architectures that will inhabit virtually everything.

Bob Laurent, VP of product marketing at Alteryx, says that as LOB execs rely more on advanced analytics, their tolerance for poor data is dropping.

Thales CTO for e-Security Jon Geater says both Accenture and Thales expect to see blockchain software applied across a range of vertical industries.

Is your organization ready to deploy analytic models at scale? Are your existing systems connected in the right ways to leverage the latest analytics capabilities? Join us for a live webinar detailing the creation and deployment of gradient boosting machine models using Python, Kafka and FastScore. This webinar, led by Open Data Group's Matthew Mahowald, will increase your understanding of the benefits of gradient boosting as well as the easiest way to deploy and maintain a live streaming gradient boosting machine model in production systems. Our webinar will focus on providing 3 key takeaways: learn how to create a gradient boosting machine (GBM) using scikit-learn and Python; understand the steps required to transform features, train, and deploy a GBM using FastScore, a language-agnostic analytic engine.


DataRefuge is a public, collaborative, grassroots effort around the United States in which scientists, researchers, computer scientists, librarians and other volunteers are working to download, save, and re-upload government data. The DataRefuge Project, which is led by the UPenn Program in Environmental Humanities and the Penn Libraries group at the University of Pennsylvania, aims to foster resilience in an era of anthropogenic global climate change and raise awareness of how social and political events affect transparency.

In anticipation of her upcoming conference presentation, Our Success with Agile Analytics at Predictive Analytics World for Business Chicago, June 19-22, 2017, we asked Afsheen Alam, Program Manager Marketing Analytics and Big Data at Allstate Insurance, a few questions about her work in predictive analytics. Q: In your work with predictive analytics, what behavior or outcome do your models predict?

The Data Incubator, a data science fellowship program, is currently running a Data Science in 30 Minutes webinar series. Next week features a free webinar with Dr. Becky Tucker of Netflix. Dr. Tucker is a Senior Data Scientist at Netflix, where she specializes in predictive modeling for content demand (think: what do people want to watch?).

The Machine research project from HPE may usher in a new era for IT and big data processing. This guide covers what business professionals need to know about The Machine.

Taken together, however, several changes at the FCC show how quickly and decisively the new administration is setting a new course.

Amazon Web Services said today its outage earlier this week that affected major websites and apps was caused by human error. Sites including Netflix, Reddit and the Associated Press struggled for hours on Tuesday — all because of a simple typo. "While we are proud of our long track record of availability with Amazon S3, we know how critical this service is to our customers, their applications and end users, and their businesses," the company wrote in an online message. "We will do everything we can to learn from this event and use it to improve our availability even further."

One of the reasons cybercriminals have the advantage is that the incentives of the attackers and the defenders are mismatched.

A large majority of security operations centers (SOCs) have not attained the requisite level of maturity to adequately protect against cyberattacks.

I haven't been admitted to hospital many times in my life, but every time the only thing I really cared about was: when am I going to get out? It's also a question that weighs heavily on hospital managers: by knowing ahead of time how long each patient's stay is likely to be, they can better manage facilities and staff, and know whether the hospital is likely to reach maximum capacity in the near future. To help hospital administrators better predict how long patients are likely to stay, Microsoft has published the Predicting Length of Stay in Hospitals solution on the Cortana Intelligence Gallery. Clicking on "deploy" creates an instance of the Data Science Virtual Machine with simulated patient data in SQL Server, and a model implemented with R Services to predict the length of stay. The predictions are then presented as a Power BI dashboard to a Care Line Manager or a Chief Medical Information Officer. (Click the Try It Now button on this page to interact with the dashboard.)

A few weeks ago I opined, as is my wont, about what I saw happening in the technology space with regard to Platform as a Service (PaaS). As I saw it, PaaS was pretty much a concept that had been superseded by newer approaches to application and infrastructure creation and management. You'd be forgiven for thinking that two of the best-known PaaS offerings, Cloud Foundry and OpenShift, would be pretty antsy about such a bold claim. Surprisingly, that doesn't seem to be the case. In fact, Abby Kearns, the chief executive of the Cloud Foundry Foundation, asked to jump on a call with me to explain exactly why she, who runs a foundation which itself shepherds an initiative that most concur is PaaS, agrees wholeheartedly with my view.

There is a growing need for versatile, hybrid architectures that can combine the best of both data warehousing and big data analytics. The cloud is the perfect solution, because it makes it easier to build a robust data warehouse as a central "hub", and then add other environments that can be scaled up or down to meet the specific needs of different datasets. Nevertheless, it is important to think carefully about the design of the entire hybrid architecture, and avoid a number of common pitfalls. We spoke to Jim Kobielus at IBM for his top tips on strategizing and optimizing cloud data warehouses.

Why would you want to use blockchain to build a database solution? And how would you actually do that? BigchainDB has answers.

Hewlett Packard Enterprise has revamped its existing technology services unit to focus on helping customers adopt emerging technologies, including cloud computing, the internet of things, and big data. HPE's new Pointnext technology services division, announced Thursday, is designed to help businesses speed up their adoption of several technologies, also including hybrid IT services and analytics, the company said. HPE announced the rebranded services unit with an "unboxing" video.