Big Data News – 13 May 2016

Today's Infographic Link: Common MythConceptions

Top Stories
MarkLogic 9 makes it possible to semantically define specific sets of data within the database that can then be logically invoked by an API.

The 19th International Cloud Expo has announced that its Call for Papers is open. Cloud Expo, to be held November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA, brings together Cloud Computing, Big Data, Internet of Things, DevOps, Containers, Microservices and WebRTC in one location. With cloud computing driving a higher percentage of enterprise IT budgets every year, it becomes increasingly important to plant your flag in this fast-expanding business opportunity. Submit your speaking proposal today!

You might already know them from theagileadmin.com, but let me introduce you to two of the leading minds in the Rugged DevOps movement: James Wickett and Ernest Mueller. Both James and Ernest are active leaders in the DevOps space, in addition to helping organize events such as DevOpsDays Austin and LASCON. Our conversation covered a lot of bases, from the founding of Rugged DevOps to aligning organizational silos to lessons learned from W. Edwards Deming.

The third Internet of Things World Conference took place in Santa Clara, California, USA on May 10-12. The star of the exhibition hall may have been the Australian start-up Buddy's booth, which presented a demo of Buddyville, a smart city built with 13,000 Lego pieces, its power monitored by IoT smart sensors and managed by the Buddy Platform's scalable backend data services technology. By Kevin Farnham

Precise positioning in a store or resort, or anywhere with WiFi, leads to fascinating new mobile business app development and interactive user experience benefits. To learn how, we're joined by Alvaro Garcia-Hoz, Founder and General Manager of Mobile Experience in Madrid.

Roughly half of all Web traffic comes from bots and crawlers, and that's costing companies a boatload of money. That's one finding from a report released Thursday by DeviceAtlas, which makes software to help companies detect the devices being used by visitors to their websites. Non-human sources accounted for 48 percent of traffic to the sites analyzed for DeviceAtlas's Q1 Mobile Web Intelligence Report, including legitimate search-engine crawlers as well as automated scrapers and bots generated by hackers, click fraudsters and spammers, the company said.

VANCOUVER, BC — Last year's foundation of the Open Data Platform Initiative (ODPi), a collaborative project of The Linux Foundation that aims to reduce complexity surrounding the Hadoop ecosystem, made waves in certain parts of the Apache Software Foundation (ASF) concerned by the creation of an external organization that could exert influence over Apache projects. At the Apache: Big Data North America conference in Vancouver, BC this week, the ODPi moved to ease those concerns through dialog and sponsorship of the ASF.

Guest blog post by Laetitia Van Cauwenberge. According to Wikipedia, MongoDB is a cross-platform document-oriented database. Classified as a NoSQL database, MongoDB avoids the traditional table-based relational database structure in favor of JSON-like documents with dynamic schemas (MongoDB calls the format BSON), making the integration of data in certain types of applications easier and faster.

Cloud computing changed data analytics for good. It enabled companies to drastically decrease the resources and architecture previously assigned to business intelligence departments. It also enabled laymen to run advanced business analytics. Cloud was also the architecture of choice for storing and processing big data. Data accumulation is a continuous process, and it is going to explode with the emerging Internet of Things. Developers found an answer to this issue in a new concept called fog computing. As opposed to the cloud, a fog computing architecture is capable of conducting all required computations and analytics directly at the data source.

Some experts have it that tens of thousands of devices are running buildings, power plants and other infrastructure in the US, and they are all connected to and part of the Internet of Things (IoT). This connectivity produces many efficiencies, yet it brings the risk and fear of cybersecurity breaches. Can we find an answer to IoT challenges?

Opera has this new 'power-saving' mode. The eponymous Web browser company claims it gives you (ahem) 'up to' 50% more battery life — but is that likely? Uh, NO! [Developing story: Updated 8:42 am and 2:34 pm PT with Opera response and more comment] Yes, the actual software tweaks will make a difference, but the tests Opera's quoting are skewed, unscientific, and compare apples to oranges. But what do you expect from a company that's trying to get bought by a Chinese consortium for more than $1.2 billion? Here we go again: A few weeks ago it was the VPN that turned out to be a proxy, and now this. Your humble blogwatcher curated these bloggy bits for your entertainment.

Across the rail and freight logistics industries, traditional approaches to asset utilization are shifting to accommodate a data-driven and proactive future where analytics and data insights provide companies with greater returns.

One Big Bang digital transformation was enough for Tom Keiser. After overhauling legacy IT systems and updating ecommerce platforms, the former Gap CIO has joined SaaS provider Zendesk as its first CIO. The move comes more than a year after he left the apparel retailer. Now Keiser will build out IT, security and data analytics capabilities for a cloud service provider seeking to top $1 billion in revenues by 2020. Zendesk founder and CEO Mikkel Svane told CIO.com via email that he hired Keiser for the wealth of "enterprise technology and operations experience" he brings to a company that is entering its next stage of growth.

Looker, the company that is powering data-driven businesses, announced Looker Data Apps, analytical applications for business units and departments. Looker Data Apps have been developed from years of experience building custom applications for data-leading companies.

When Doug Cutting set out to develop an open source Web search engine in the late 1990s, he initially chose the GPL license to distribute his wares. When that failed, he decided to give the Apache Software Foundation a shot, and in the process may have changed the course of open source software development for the next 20 years. Cutting initially developed the Lucene search engine with the idea of building a business around it, but later decided to give the technology away for free. "I just wanted people to use it. I wanted somebody to get some value from the software," he said during a keynote address at yesterday's Apache: Big Data 2016 conference in Vancouver, British Columbia.

You may be on vacation but hackers are not. Ensure devices and networks are protected while away from the office.

Project management offices (PMOs) must operate in such a way that they demonstrate that they can support and add to the wider organization and its strategic goals.

The Corsa DP2000 series is designed primarily for metro and WAN deployments that make use of network virtualization techniques to maximize investments.

Next week, thousands of people in the SAP community will head down to Orlando for SAPPHIRE NOW, SAP's largest customer event of the year. Among the many activities planned for the event, you'll find great sessions and lectures on how SAP envisions predictive analytics. This is truly an important moment in our quest to bring "predictive…

Production demands of the 21st century change at an extraordinary pace. Industrial markets, such as energy and oil & gas, face challenges going forward, including the reliable monitoring of assets in the field, dealing with 24-7 production demands, and managing high costs in terms of both time and resources to manage assets in remote locations. These market forces have naturally led to the emergence of the industrial internet of things (IIoT) and wireless communications technology.

A guest blog post from Scott Schlesinger, Principal, Ernst & Young LLP. In July 2015, EY announced its EY Warranty Analytics service offering for the SAP HANA® platform. The service includes EY's advanced analytics for use with SAP® technology to monitor warranty claims, with the goals of identifying fraudulent activity, reducing costs and improving quality. Automobile… The post EY, SAP and Hortonworks Joint Solution for Manufacturing Warranty Analytics appeared first on Hortonworks.

By using predictive analytics, providers can use real-time data to see risk factors that previously went undetected. Armed with this information, healthcare systems can then intervene and hopefully change the course of the patient's future health.

AT&T and Sprint announce enhancements to their business services; the strike at Verizon is creating problems for its business customers.

Google is preparing a competitor to Amazon's Echo, reportedly called "Chirp," which will put voice-activated Google search into a device that's similar to Google's OnHub WiFi hotspot.

List: If Hadoop isn't quite the right fit for the business use case, then maybe one of these will work better.

Do you want to win the race to insight and beat your competition? If so, it's time to rev up your analytics strategy. Explore how analytics platforms fueled by trusted information, designed for hybrid environments, and built on open technology can put you in the winner's circle.

I'm joined by Wes McKinney and Hadley Wickham on this episode to discuss their joint project Feather. Feather is a file format for storing data frames along with some metadata, to help with interoperability between languages. At the time of recording, libraries are available for R and Python, making it easy for data scientists working in these languages to quickly and effectively share datasets and collaborate.

Thanks to Richard Williamson of Silicon Valley Data Science for allowing us to republish the following post about his sample application based on Apache Spark, Apache Kudu (incubating), and Apache Impala (incubating). Why should your infrastructure maintain a linear growth pattern when your business scales up and down during the day based on natural human cycles? There is an obvious need to maintain a steady baseline infrastructure to keep the lights on for your business.

The World Trademark Review recently published a startling commentary on a study in an article titled "We are failing": study reveals $461 billion international trade in counterfeit and pirated goods. The article details the failings of companies when it comes to combatting counterfeiting online. In this post, we hope to cover how harvesting web data… The post How Web Data Harvesting Can Be Used to Combat Counterfeiting appeared first on BrightPlanet.

Developers can now use Google's SyntaxNet natural language parsing system to create apps that understand written text.

In this week's podcast, QCon chair Wesley Reisz talks to Matt Ranney, who is the Chief Systems Architect at Uber, where he's helping build and scale everything he can. Previously, Matt was a founder and CTO of Voxer, probably the largest and busiest deployment of Node.js. By Wesley Reisz

Being a Java Champion has its perks, and thanks to the generosity of JetBrains, a free license for IntelliJ IDEA is now one of them. The Champions are the latest in the list of groups earning this special JetBrains premium, which also includes approved open source projects, students, and teachers. By Matt Raible

With $20 billion in annual revenue in the United States alone, the gaming industry is a behemoth in the way people spend their free time. Technology that makes gaming even more inviting and social continues to evolve, and the way developers implement big data has plenty to do with the success the industry is seeing.

If you work in finance, sales, marketing or operations, then you must have noticed that massive amounts of data are creeping into your everyday life. No doubt the volume will keep growing, and with good reason. The only problem is how to get results out of the data before making any decisions.

Corporate reporting is a topic that virtually any organization deals with as chief executives (both CEO and CFO) expect to receive regular updates on corporate performance. Delivering on this necessity in some cases is a cumbersome exercise — but it doesn't have to be. We identified some key causes of cumbersome corporate reporting.

Pitching itself as the NoSQL database for the enterprise, MarkLogic unveiled version 9 of its platform at its user conference. It adds a host of new security features, plus new data management capabilities to help organizations wrangle structured and unstructured data.

MapR Technologies, Inc., provider of the Converged Data Platform, announced the immediate availability of Apache Spark 1.6.1 on the MapR Converged Data Platform, making it the eighth release of the full Spark stack available to MapR customers.

The presentation below is an educational resource that sets the stage for parallel programming with GPUs (graphics processing units) and was sponsored by the Center for Astrophysics and Supercomputing at Swinburne University of Technology. GPUs are becoming quite popular for the implementation of deep learning solutions.

In anticipation of his upcoming conference presentation, Tips and Tricks on Developing High-performance Fuzzy Name Search Engine to Prevent Terrorism Financing at Text Analytics World Chicago, June 21-22, 2016, we asked Emrah Budur, Senior Software Engineer at Garanti Technology, a few questions about his work in text analytics. Q: In your work with text analytics, what…

More than 80 percent of enterprise data is considered "dark," defined as data that is captured and stored, yet never used.

OpenStack in Action is a new release from Manning that aims to introduce readers to the OpenStack platform for cloud computing (IaaS). InfoQ has interviewed V. K. Cody Bumgardner, author of the book.

"We are in the business of building [FILL IN THE BLANK], why would we build an insights platform out ourselves?" That sentiment will drive more and more companies to explore the insights services…

Advanced pattern matching features that were originally expected to be present in C# 7 have been recently excluded from the future branch and will not make it into the next version of the language.
