Big Data News – 03 Aug 2016

Featured Article
Summary: Deep learning and Big Data are being adopted in law enforcement and criminal justice at an unprecedented rate. Does this scare you or make you feel safe? When you read the title, whether your mind immediately went for the upstairs "H" or the downstairs "H" probably says something about whether the new applications of Big Data in law enforcement let you sleep like a baby or keep you up at night. You might have thought your choice of "H" related to whether you've been on the receiving end of Big Data in law enforcement, but the fact is that practically all of us have, and for those who haven't, it won't take much longer to reach you.

Top Stories
News: Brain-mimicking technology builds on the idea of neuromorphic computing.

In anticipation of his upcoming conference co-presentation, Predictive Analytics for Stress Testing — Industry Challenges at Predictive Analytics World Financial in New York City, October 23-27, 2016, we asked Sanjay Gupta, Executive Vice President and Head of Model Development at PNC Bank, a few questions about his work in predictive analytics. Q: In your work with… The post Wise Practitioner – Predictive Analytics Interview Series: Sanjay Gupta at PNC Bank appeared first on Predictive Analytics Times.

Zaloni, the data lake company, announced a new solution, Zaloni's Data Lake 360°, to meet the needs of a growing number of enterprises that understand that the data lake is key to the future of the enterprise data ecosystem.

Did you know that there are 4.4 zettabytes of data on the internet? This enormous amount of data is still being handled by IT workers. By 2020 this amount will increase tenfold and traditional means of handling digital information will no longer suffice. Find out how artificial intelligence will help big data.

The term BIM, which comes from "Building Information Modeling," can be a little misleading. It's not just about buildings; BIM is the process that enables the efficient and quality design, construction and operation of a structure such as a building, bridge or highway. The core BIM principles and concepts, which were pioneered for the construction and management of buildings, are being applied in heavy civil construction, utilities, energy and more. Today, BIM is an emerging technology that is transforming the way we build. The use of 3-dimensional models and real-time management of processes and workflows delivers the efficiency that today's clients demand.

The presentation below from Hadoop Summit 2016 shares how Macy's successfully made BI work on Hadoop. You'll see how the company kept the top spot as the largest US department store by innovating with interactive, self-service Business Intelligence directly on Hadoop.

In this special guest feature, James Fisher, VP of Global Product Marketing at Qlik, advises that the good ol' days of Excel and outdated spreadsheets are long gone, and savvy businesses need to embrace a culture of analytics in order to realize new insights.

This is part one of a two-part series of posts about using some common server storage I/O benchmark tools and workload scripts. View part II here which includes the workload scripts and where to view sample results. There are various tools and workloads for server I/O benchmark testing, validation and exercising different storage devices (or systems and appliances) such as Non-Volatile Memory (NVM) flash Solid State Devices (SSDs) or Hard Disk Drives (HDD) among others.

"If (there) was one thing all people took for granted, (it) was conviction that if you feed honest figures into a computer, honest figures (will) come…

Big Data, cloud, analytics, contextual information, wearable tech, sensors, mobility, and WebRTC: together, these advances have created a perfect storm of technologies that are disrupting and transforming classic communications models and ecosystems. In his session at Things Expo, Erik Perotti, Senior Manager of New Ventures on Plantronics' Innovation team, provided an overview of this technological shift, including associated business and consumer communications impacts, and opportunities it may enable, complement or entirely transform.

Even as they aspire to be data driven, organizations are failing to align their vision with execution. Pitfalls lurk everywhere. We've uncovered 10 of the most common culprits.

For this blog post we decided to jump on the PokemonGO hype and add a bit of science into the craze. Our goal is to give you the optimal portfolio of Pokemon to train, so you can be as effective as possible against a wide variety of opponents. As each Pokemon has its strengths and weaknesses, we created clusters of Pokemon with similar characteristics and looked at the few selected ones allowing the player to compete against as many different enemies as possible. Data: We used the Pokemon API fan service available on the internet to find all the information about the little creatures.

Probably the most innovative analytics capability of the last couple of years is SAP BusinessObjects Digital Boardroom. I think I have mentioned many times here that "your meeting will never be the same," for the very simple reason that your meeting will never be the same. The SAP BusinessObjects Digital Boardroom is the ultimate facility…

FinTech may, or may not, be the one-horse race the financial commentators are claiming. But it's certainly where the smart money's going. I mean, new startups are disrupting traditional financial working practices and driving a rapid global expansion. And a long line of banks are hoisting themselves onto the bandwagon as FinTech initiatives become an integral part of corporate strategy.

Many organizations have invested heavily in analytics, but few have seen their investment have a positive impact on the performance of sales and marketing.

Extreme Computing is the ability to leverage highly performant infrastructure and software to accelerate Big Data, machine learning, HPC, and Enterprise applications. High IOPS Storage, low-latency networks, in-memory databases, GPUs and other parallel accelerators are being used to achieve faster results and help businesses make better decisions. In his session at 18th Cloud Expo, Michael O'Neill, Strategic Business Development at NVIDIA, focused on some of the unique ways extreme computing is being used on IBM Cloud, Amazon, and Microsoft Azure and how to gain access to these resources in the cloud… for FREE!

Documents and communications are associated with metadata.

I'm often guilty of sticking my head in the sand, ignoring very small details that could turn into big problems later on. Like when Netflix told me my credit card was about to expire, but it seemed like such a boring task, so I ignored them…till the day Netflix stopped working. I grumbled at the TV for a while, then wasted another half hour remembering my login details and finding the credit card update page.

Fleet management companies have developed strong software suites. The GPS-type services with which consumers are familiar are just the beginning.

The adage, 'change is the only constant,' holds true in the IT landscape like no other, forcing IT admins to constantly learn new skills and make strategic decisions.

Growing deployment of distributed applications on scale-out and cloud databases has, vendors claim, fueled the need to protect critical application data, resulting in data governance standards that mandate backup and recovery capabilities as part of the application stack. Those requirements prompted database recovery software specialist Datos IO to forge a partnership with public cloud giant Amazon Web Services (NASDAQ:AMZN) designed to help enterprises back up and recover multiple data stores along with cloud-native workloads on the Amazon cloud.

Become Your Office's Data Analytics Hero with These 4 Interactive Courses, discounted 85% for a limited time, from $276 down to just $39. Successful businesspeople make the most out of the numbers available to them. If you're to succeed in the business world, you're going to need to know how to crunch and effectively analyze data. This course bundle teaches you how to interpret critical data, walks you through exercises, shows you how to present complex data to business and customer stakeholders, gives you access to professional spreadsheet models, and much more.

General wisdom suggests that resumes be as concise as possible, but job seekers must be careful that they don't leave out important details.

Attic Labs is working on a beta version of a Noms database that makes use of a lighter approach to sharing large amounts of structured data.

Choosing the right cloud for your workloads is a balancing act that can cost your organization time, money and aggravation – unless you get it right the first time. Economics, speed, performance, accessibility, administrative needs and security all play a vital role in dictating your approach to the cloud. Without knowing the right questions to ask, you could wind up paying for capacity you'll never need or underestimating the resources required to run your applications.

Via the cloud, Particle wants to reduce the cost of building and then managing IoT endpoints.

To be published on September 13, 2016; you can pre-order here ($17). By Philip Tetlock and Dan Gardner. 352 pages. Everyone would benefit from seeing further into the future, whether buying stocks, crafting policy, launching a new product, or simply planning the week's meals. Unfortunately, people tend to be terrible forecasters. As Wharton professor Philip Tetlock showed in a landmark 2005 study, even experts' predictions are only slightly better than chance.

Jonathan Fritz is a Senior Product Manager for Amazon EMR. We are excited to launch Amazon EMR release 5.0 today, giving customers the latest versions of 16 supported open-source applications in the big data ecosystem, including new major versions of Spark and Hive. Almost exactly a year ago, we shipped release 4.0, which brought significant improvements to EMR. We based our build and packaging system on Apache Bigtop, moved to standard ports and paths, and streamlined application configuration with configuration objects.

When all is said and done, it won't be the lake itself that delivers the crucial insight that fulfills the ROI, but the people who use it.

With the 2.0 release, the open-source Apache Spark compute engine has entered adolescence. It has consolidated several APIs in the name of simplification, while adding a few for promoting extensibility and improved performance. And by eroding the wall between real-time and batch, it could push streaming into the core of analytics applications.

Has HPE put itself or its software business up for sale? Multiple reports note that a group of private equity companies are interested in buying the software business, or taking the entire company private. What would either deal mean for IT?

Cloud computing has paved the way for programmable infrastructure, which brought extreme automation into the software development lifecycle. The ability to provision resources, configure them on the fly, deploy applications, and monitor the entire process led to the DevOps culture, where developers and operators collaborate throughout the application lifecycle. While provisioning and configuration are best left to tools such as Chef, Puppet, and Ansible, one piece of open source software that became the cornerstone of DevOps is Jenkins.

Released with CDH 5.8, Impala 2.6 brings solid performance improvements, particularly for clusters secured by Kerberos running BI workloads on Apache Hadoop. Just a few months back, we showed you how Impala 2.5 delivered a 4x performance boost compared to Impala 2.3 for BI workloads on Hadoop via the introduction of several features like runtime filters. Here's an update: Compared to two releases ago, Impala 2.6 delivers 12x better performance on secure workloads and continues this drumbeat of consistent performance improvement.




Microsoft Data Scientist ROC curves are commonly used to characterize the sensitivity/specificity tradeoffs for a binary classifier. Most machine learning classifiers produce…
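As a rough illustration of the sensitivity/specificity tradeoff mentioned above, the sketch below traces an ROC curve by sweeping a decision threshold over classifier scores. It assumes the classifier emits a numeric score per example and that labels are 1 (positive) or 0 (negative); the function names `roc_points` and `auc` and the sample data are hypothetical and are not taken from the linked article.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per distinct threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    # Rank examples from highest score to lowest; lowering the threshold
    # admits them as predicted positives one score level at a time.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every example tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up scores and labels, purely for illustration.
example_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
example_labels = [1, 1, 0, 1, 0, 0]
example_curve = roc_points(example_scores, example_labels)
example_auc = auc(example_curve)
```

The true positive rate is the sensitivity and the false positive rate is one minus the specificity, so each point on the curve corresponds to one choice of threshold.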

As IoT investment grows, with billions of dollars flowing into new enterprises, IT departments, as well as other parts of the business, are expressing concerns over the security risks the technology poses.

User Interface and User Experience are some of the most important aspects of developing a product. No matter how many amazing features something has, a user must be able to access them in order to reap the full benefits of the product. For example, in the Apache Ambari Web UI, add-on apps called Views have,… The post My Summer at Hortonworks – Embedding Views in the Apache Ambari UI appeared first on Hortonworks.

There is a lot of confusion about the role of test data in machine learning. The typical outcome is overfitting, a plague that must be avoided at all reasonable cost. The confusion comes from blurring two fundamentally different roles for test data. The first is model selection: candidate machine learning models are applied to data that was not used to train them, and the model leading to the best predictions is selected. The data used for selecting models is often called validation data.
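The model selection role described above might be sketched as follows. This is a minimal illustration under stated assumptions, not the article's method: the two candidate models (a constant predictor and a least-squares line), the 25% validation fraction, the helper names, and the synthetic data are all hypothetical.

```python
import random


def fit_constant(train):
    """Candidate 1: always predict the mean of the training targets."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y


def fit_line(train):
    """Candidate 2: ordinary least-squares line through the training points."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in train)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in train) / var_x
    return lambda x: mean_y + slope * (x - mean_x)


def mean_squared_error(model, rows):
    return sum((model(x) - y) ** 2 for x, y in rows) / len(rows)


def select_model(rows, candidates, validation_fraction=0.25, seed=0):
    """Fit each candidate on the training part, score it on the validation part."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - validation_fraction))
    train, validation = shuffled[:cut], shuffled[cut:]
    # Validation rows are never seen during fitting.
    scores = {
        name: mean_squared_error(fit(train), validation)
        for name, fit in candidates.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Synthetic data on an exact line, purely for illustration.
rows = [(x, 2 * x + 1) for x in range(20)]
best, scores = select_model(rows, {"constant": fit_constant, "line": fit_line})
```

Because the validation scores were used to pick the winner, they are no longer an unbiased measure of how the chosen model will perform on new data.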

Boeing's decision to run its aviation analytics applications on the Azure cloud computing software is a big win for Microsoft, which is chasing Amazon Web Services (AWS) in the high-stakes race to sell computing, storage and other infrastructure software over the internet. The aerospace giant based its choice largely on Microsoft's willingness to help it develop applications to serve its 300 airline customers, which are starved for ways to optimize fuel efficiency and better manage fleets.

Here's this week's news in Data Science and Big Data. Don't forget to subscribe if you find this useful!

Interesting Data Science Articles and News

How To Ace A Data Science Interview: Here's what to expect at each stage of a data science interview.

What Has Pokemon Got To Do With Big Data?: When there is participation at the levels of Pokemon, many behavioral insights can be gained from this real-time Big Data.

How I built a Slack bot to help me find an apartment in San Francisco: Vik Paruchuri walks us through how he built a Slack bot that scrapes housing listings from Craigslist, filters them, and posts them on Slack, and how he deployed it to a server.
