Big Data News – 2 Sep 2015

Top Stories
There are many different team topologies that can be effective for DevOps. Each topology comes with a slightly different culture, and a team topology suitable for one organisation may not be suited to another organisation, even in a similar sector. This article explores the cultural differences between team topologies for DevOps, to help you choose a suitable DevOps topology for your organisation.

The way businesses present their products in the market usually relies on the trend showcased by the mainstream media. Different inventions have been introduced either as a means of advancing humanity or, in some cases, possibly as a trap that makes our lives dependent on those things.

In anticipation of his upcoming conference presentation, The Changing Face of Analytics at Federal Agencies: A View from the IRS at Predictive Analytics World for Government, Oct 13-16, 2015, we asked Jeff Butler, Associate Director of Data Management, IRS Research, Analysis and Statistics organization, a few questions about his work in predictive analytics. Q: How […]


So, your organisation just went through yet another restructure! You notice that the new structure does not look very different to the last one 6 months ago, and again not vastly different to how it was 20+ years ago or how it is likely to remain for the foreseeable future, with the exception of the Heads that change! By keeping the general organisation design this way, top-level management can maintain span of control by establishing boundaries and rules of behaviour, to ensure that the organisation's resources are efficiently managed to provide the best return on investment. This sort of organisation structure, generally recognised as mechanistic or bureaucratic, is commensurate with a view that strategy is formed at the top of the organisation and the rest of the organisation is seen as a means of implementing the strategy. While generally not visible in the organisation chart, other forms of design (e.g. matrix structures) co-exist in most organisations to enable new product development, geographic integration and cross-functional coordination. In this resource-driven paradigm, enterprise information is considered a corporate resource and is centrally managed.

Recently I participated in a webinar panel with IT and marketing leaders on building alignment between the CIO and CMO, and, surprisingly, every speaker agreed: most of us are there or well on our way. The next frontier is building a collaboration-centric culture across the company, and data is the place to start. Just about every business function now needs to collect, connect and visualize data in order to better measure outcomes and prove results. Now one of the defining factors in sustaining a competitive advantage, data is also the most promising area your company can focus on to build bridges, dissolve silos and see better business results.

Here is my top 7 list of daft things that some people say about Big Data. I think that Big Data does play a role in some businesses. I also think that some of the basic distributed file store and text search technologies can be usefully employed, in non-traditional indexing, counting and correlation. However, there is an awful lot of nonsense said about Big Data. So, onwards and upwards. Big Data is like currency. If Big Data is currency, and for most of us, it isn't, then it's more like the hyperinflationary money of the Weimar Republic, rather than something you would take to the bank or try and buy the weekly grocery with. Big Data might have value, no doubt some of it does…

The updated Google Nest thermostat can now sense your presence from across the room.

Amazon has introduced a new mobile app monetization model dubbed Amazon Underground and linked with their own Amazon app store. The new model provides "actually free" apps to customers while developers are paid based on how long their apps are used. By Sergio De Simone

For Bloggers. Additional Reading: Data Scientist Reveals his Growth Hacking Techniques; 10 Modern Statistical Concepts Discovered by Data Scientists; Top data science keywords on DSC; 4 easy steps to becoming a data scientist; 13 New Trends in Big Data and Data Science; 22 tips for better data science; Data Science Compared to 16 Analytic Disciplines; How to detect spurious correlations, and how to find the real ones; 17 short tutorials all data scientists should read (and practice); 10 types of data scientists; 66 job interview questions for data scientists; High versus low-level data science.

Save your office of finance from crisis by preparing your financial performance management for the future. Learn how to help your finance function reach its full potential by adopting best practices and avoiding pitfalls along the way.

A new tool from SAP will allow companies to analyze distributed Hadoop data alongside corporate data using the ERP giant's Hana in-memory computing platform. Announced on Tuesday, SAP Hana Vora is an in-memory query engine that taps the Apache Spark execution framework to deliver interactive analytics on Hadoop. By extending Hana's reach to include distributed data in the Hadoop ecosystem, the tool is designed to help data scientists and developers combine corporate and external data in their analyses. That, in turn, means incoming data from customers, partners, and smart devices can be integrated with that from internal enterprise processes, giving companies better context with which to make decisions, SAP said.

IBM's recent embrace of Apache Spark is beginning to generate dividends in the form of open source contributions for a mainframe big data link to Spark. Big data software vendor Syncsort, Woodcliff Lake, N.J., said Tuesday (Sept. 1) it is contributing an IBM z System mainframe connector for Apache Spark that would allow easier access to mainframe data using Spark's analytics and Spark SQL. The company described its latest mainframe connector as being similar to the Apache Sqoop link it released as open source software last year. That connector allows Hadoop users to import and analyze data coming from the z System mainframe environment. The new Spark connector is designed to ease specifying the location of multiple datasets and associated metadata.

If you offer a great product at a great price, it stands to reason that your customers will come back for more of the same. But with so many brands competing for eyeballs and wallets (both online and offline), you can't simply sit back and wait for those customers to return. CLOGGS, an award-winning shoe retailer in the United Kingdom, provides shoes for the whole family under the company motto, "Happy feet make for a happy you!" Using one original brick-and-mortar store and an extensive online selection since 1998, CLOGGS offered a product consumers loved, and yet somehow, those customers were getting lost on the way back after their initial purchases.

Illegal drugs. Stolen credit card numbers. Hitmen for hire. These are some of the things you can find on the Dark Web, a part of the Internet that's not indexed by traditional search engines and where people browse in complete anonymity. Now a group of security researchers in Maryland is using big data technology to shine a light on some of the Dark Web's seedier neighborhoods. The public Internet can be a creepy place, but it's nothing compared to the Dark Web, where all manner of human depravity is put on display, and some of it is traded as commerce. Not everything on the Dark Web is sinister or illegal, but enough of it is that it's gained the attention of the FBI, which put the Dark Web on the public's map two years ago when it took down the Silk Road marketplace. A big data startup called Terbium Labs is now looking to shed a little light on the Dark Web.

It looks like even hackers take vacations. The month of August had only one significant data breach. Carphone Warehouse (2.4M exposed, malicious outsider): Carphone Warehouse, an independent mobile phone retailer with over 1,700 stores across Europe, discovered on August 5th that someone had unauthorized access to its network and stopped it right away. The company believes the attack happened within the previous 2 weeks. 2.4M customers were affected, and up to 90,000 of them could have had their encrypted credit card details accessed. The company's investigation found that the data could have included names, addresses, dates of birth and bank details.

By Antony Unwin, University of Augsburg, Germany. David Moore's definition of data: numbers that have been given a context. Here is some context for the finch dataset: Fig 1: Illustrations of the beaks…

The Angular team has announced a new component, dubbed ng-upgrade, that will allow Angular 1 and Angular 2 projects to coexist. This will allow developers to migrate an application one piece at a time without losing the fidelity of either engine.

See what happens when data scientist David Chudzicki of Kaggle sits down with Jackie Woods, director at UPS, to talk about the intersection of data science and IT at the InformationWeek Conference 2015.

Big data analysis is helping organizations better analyze their customers, predict the competitive landscape and suss out emerging trends before they go mainstream — all of which helps companies maintain a competitive edge. But turn the lens inward, and big data can also be a competitive advantage by helping managers sharpen focus on hiring, retention, compensation and developing top talent. Which candidate should I hire? Who should I promote? Who's going to leave the company within the next few months? Does John or Jane deserve that raise they asked for?

Advanced analytics can boost organizational decision-making and offer multiple benefits including increased revenue and decreased costs. However, achieving these goals requires an emphasis on three strategic pillars: focusing on business decisions, embracing an agile culture and investing in an ecosystem of talent. This analysis looks at how companies can focus on business decisions.

As my colleague Bill Franks recently pointed out on his blog, there is often the perception that being data-driven is all about technology. While technology is indeed important, being data-driven actually spans a lot of different areas, including people, big data processes, access, a data-driven culture and more. In order to be successful with big data and analytics, companies need to fundamentally embed it into their DNA.

Spark has overtaken Hadoop as the most active open source Big Data project. While they are not directly comparable products, they share many of the same uses. In order to shed some light on the issue of "Spark versus Hadoop", I thought an article explaining the essential differences and similarities between them might be useful.

There are always two aspects to data quality improvement. Data cleansing is the one-off process of tackling the errors within the database, ensuring retrospective anomalies are automatically located and removed. Another term, data maintenance, describes ongoing correction and verification — the process of continual improvement and regular checks.
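
The distinction between the two aspects can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a method taken from the article: the `Record` type, the email-based `is_valid` rule, and the choice to treat duplicate customer IDs as anomalies are all hypothetical. `cleanse` stands for the one-off pass over existing data, and `accept` for the ongoing check applied to each new record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    # Hypothetical record layout, used only for illustration.
    customer_id: int
    email: str


def is_valid(record: Record) -> bool:
    """Assumed validity rule: exactly one '@' with text on both sides."""
    local, sep, domain = record.email.partition("@")
    return bool(local) and sep == "@" and bool(domain) and "@" not in domain


def cleanse(records: list[Record]) -> list[Record]:
    """One-off cleansing: drop invalid and duplicate records already stored."""
    seen: set[int] = set()
    cleaned = []
    for record in records:
        if is_valid(record) and record.customer_id not in seen:
            seen.add(record.customer_id)
            cleaned.append(record)
    return cleaned


def accept(existing: list[Record], incoming: Record) -> bool:
    """Ongoing maintenance: verify a new record before storing it."""
    if not is_valid(incoming):
        return False
    if any(r.customer_id == incoming.customer_id for r in existing):
        return False
    existing.append(incoming)
    return True
```

In this sketch, `cleanse` would run once against the historical database, while `accept` would sit in front of every later write so that new errors are rejected instead of accumulating.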

Now that big data is in the mix, there's potential for a ton of noise. The business best able to cull what's vital from the deluge of data, turn it into information and communicate it loud and clear will ride the landslide of data to success.
