Big Data News – 10 Aug 2016

Today's Infographic Link: My CV/Resume

Featured Article
Hitachi, Ltd. announced the development of a database management system optimized for the high-speed embedded memory in the hardware and technology for high performance parallel data processing in FPGAs.

Top Stories
IT departments take note: The internet of things market is picking up speed, with new technologies like LPWA paving the way for broader adoption. Even though a new study shows that spending on IoT could reach $3 trillion by 2025, security remains a concern.

Conjecture: an opinion or conclusion formed on the basis of incomplete information. All digital transformation initiatives introduce new problems, software bugs, guaranteed network vulnerabilities, new competitors, new business challenges and new stresses. Eliminating all negative consequences and vulnerabilities is impossible, so our focus should be on limiting and containing them, not eliminating them.

This is the second in a two-part series of posts on using some common server storage I/O benchmark tools and workload scripts. View part I here, which includes an overview, background and information about the tools used and related topics.

The chipmaker says it will use Nervana Systems' expertise in accelerating deep learning algorithms to expand Intel's capabilities in the field of artificial intelligence.

An open source tool for writing queries and modeling data designed for use with the RethinkDB query language is being positioned as an alternative to developing applications using the ReQL query language. Compose, a provider of hosted databases founded in 2010, acquired by IBM (NYSE: IBM) last year and incorporated into its Cloud Data Services unit, is pitching the ReQL alternative dubbed "Thinky."

In anticipation of his upcoming conference keynote presentation, Implementing Predictive Analytics at CMS: Lessons Learned and Future Directions at Predictive Analytics World for Government, October 17-20, 2016, we asked Dr. Shantanu Agrawal, Deputy Administrator for Program Integrity and Director of the Center for Program Integrity at the Centers for Medicare & Medicaid Services (CMS), a… The post Wise Practitioner – Predictive Analytics Interview Series: Dr. Shantanu Agrawal at Centers for Medicare & Medicaid Services appeared first on Predictive Analytics Times.

Hortonworks DataFlow (HDF) 2.0 offers a combination of Apache NiFi 1.0, Kafka 0.10 and Storm 1.0. HDF 2.0 has significant architecture and enterprise productivity features to make it faster and easier to deploy, manage and analyze streaming data. In the next few weeks, we will go into more details, but for now, here are the three… The post Three Things To Know About HDF 2.0 appeared first on Hortonworks.

Hortonworks, Inc.® (NASDAQ: HDP), a leading innovator of open and connected data platforms, announced the next generation of Hortonworks DataFlow (HDF™) version 2.0 for enterprise productivity and streaming analytics.

Every finance leader of a small or midsize business has a to-do list that's a mile long. Controlling growth and communicating performance results to a community of investors consisting of owners, shareholders, banks, insurers, and, last but not least, employees. Keeping a critical eye on threats and opportunities as the company expands its…

Some of the most important decisions made in Washington D.C. are influenced heavily by sophisticated computer models refined over the course of decades. However, most of the models are proprietary, which hinders the ability to improve them and to weigh underlying assumptions that affect the results. Now, a new initiative dubbed TaxBrain seeks to jumpstart an era of government transparency and openness with a set of new open source models written in Python that anyone can access over the Web.

Box has made no secret of its global ambitions, and on Wednesday it advanced them another step by announcing two new regional "Zones" in Canada and Australia. "Our mission is to build out the most advanced social cloud," said Aaron Levie, cofounder and CEO of the California-based company, in an interview. "We want to make sure we can deliver no matter what your security, compliance or data-residency requirements."

Aaron Levie, the outspoken founder and CEO of enterprise file sharing and storage powerhouse Box, foresees a time when all enterprise data will head to the cloud, and his company this week is introducing expanded capabilities to speed that transition. Levie talked to Network World this week ahead of the company's news about its Zones and Accelerator projects, and also discussed start-ups, the march of the public cloud, and even his past work as a professional magician.

For the past 30 years, Excel has been integral to businesses everywhere. It's become the foundation of countless business processes, aiding in computing, financial tasks, IT projects, marketing, and so much more. It's no wonder Excel was such a hit from the start.

Because the risk of a data breach is so high, and the potential fallout is so significant, it's time for companies to get serious about big data security.

Wheels turning and forklifts filled–that's one measure of success in any warehouse. The more product you can pick up and put away, the more productive and cost-efficient you are. For Pittsburgh-based retailer Giant Eagle, the key to making that happen is to operate vision-guided, autonomous vehicles–robots–in its distribution centers.

You've submitted your favorite tools, you've voted on what's Hot or Not, and finally the wait is over. Today XebiaLabs launched the second version of The Periodic Table of DevOps Tools. Since the launch of The Periodic Table of DevOps v.1 in July of 2015, the table has taken the IT world by storm. Organizing the DevOps tooling landscape had been a major pain point for anyone in the IT industry. The Periodic Table of DevOps Tools has not only alleviated this problem but has become an industry standard as a reference tool for all IT professionals.

C3 IoT is Tom Siebel's latest venture. The young Redwood City, Calif.-based company is off to a fast start with industrial IoT applications, and now it's bringing a cross-industry platform to market.

In this special technology white paper, EMC IsilonSD Edge: Software-Defined Scale-Out NAS on Industry Standard Hardware for the Enterprise Edge, you'll learn how the EMC IsilonSD product family combines the power of Isilon scale-out NAS with the economy of software defined storage.

Here are nine tips for helping organizations avoid the political landmines that can so easily damage productivity and camaraderie in the workplace.

How Cambridge Analytica, the data firm used by Ted Cruz and other Republican candidates, uses advanced psychographics and more than 5,000 personalized data points to hyper-target advertising messages.

As customer attention spans get ever shorter, and marketplaces ever more crowded and competitive, real-time and near real-time systems are hot button issues for businesses. Unfortunately, the question of what actually constitutes "real-time" is rather more vexing than it first appears. When a merchandiser at a big box retailer talks about "real-time analytics", for example, he may actually want a sales dashboard that is updated several times a day.

Oracle has denied in a California federal court charges leveled by a former manager that she was sacked after she refused to cook accounts in the company's cloud business and threatened to blow the whistle on the accounting practices. The software and cloud computing giant appears to be fleshing out its original stand that the employee had been terminated for poor performance and not as a whistleblower, which would give her a number of protections under securities laws.

Discover the most important thing you can do to improve the odds of capitalizing on cognitive computing.

The promise of an ever-evolving big data landscape and the depths to which analytics can provide companies with new insights, or even new products, means this is an exciting time to be involved in data research. At the same time, however, we're charting new territory that crosses multiple boundaries, including ethical and privacy concerns. This is especially true, of course, with data that contains personally identifiable information (PII) as this risks exposing individuals to everything from tracking through to fraud if the data should get into the wrong hands.

You think you know what's in your data. But do you? Most organizations are now aware of the business intelligence represented by their data. Data science stands to take this to a level you never thought of — literally. The techniques of data science, when used with the capabilities of Big Data technologies, can make connections you had not yet imagined, helping you discover new insights and ask new questions of your data.

A great way to get media attention is to point out the historical success of your founder. In the case of Engagio, the tantalizing factoid is that its founder, Jon Miller, previously co-founded Marketo, the marketing automation player that has been making waves. The fact that Miller left Marketo to embark on a new venture is interesting.

Over five years, what has changed in BYOD is the type and sophistication of the security employed. The constant: security is, by far, the biggest concern.

Smart Cities are here to stay, but for their promise to be delivered, the data they produce must not be put in new silos. In his session at Things Expo, Mathias Herberts, Co-founder and CTO of Cityzen Data, will deep dive into best practices that will ensure a successful smart city journey.

Analytics can transform the business for telecommunications and cable providers of all sizes. As cable operators and wireless service providers integrate new channels into their business, importing real-time data into a single, holistic view across all platforms is likely to become an imperative function for these organizations.

Unhealthy work habits are taking their toll in the form of increased absenteeism, lost productivity, and higher insurance costs.

Recent industry research by both Strategy Meets Action (SMA) and Novarica highlights analytics as the top priority for the insurance industry. Further, the Insurers' 2016 Strategic Initiatives: Advancing Industry Transformation report by SMA identified customer engagement as another top priority for insurers. Success in the insurance industry depends on your company's ability to quickly interact… The post Customer 360: Common Pitfalls to Avoid for a Successful Implementation appeared first on Hortonworks.

Storage is the most popular cloud service, but low cost and flexible scalability are giving way to practical concerns such as reliability and ease of migration.

According to analysts from Gartner and elsewhere, every enterprise with a significant cloud presence needs a cloud access security broker (CASB) to protect its cloud-based data. CASB products can sit either on-premises or live in the cloud, but they all have the same basic function — providing a secure gateway for data traveling to and from the cloud, particularly with respect to SaaS applications and common cloud storage services like Box or Dropbox.

The Internet of Things (IoT) is generating new data streams that make it possible to track formerly immeasurable processes and react more quickly to changing business conditions. Experts estimate close to 30 billion connected IoT devices by 2020 in areas as diverse as smart cities, connected cars, construction, medical devices, logistics and industrial equipment.

This is an unusual blog for me. Usually I talk about how organizations can more effectively leverage data and analytics to power their business. However, as I conduct more Big Data Vision Workshops, I have come to realize that a big part of the success of these engagements is the ability to "listen and comprehend." Here are some observations and tips for "listening and comprehending" more effectively. I've classified this as "facilitation" because I seek to "facilitate" a dialogue with the client where I can learn enough about the client's business to help them build the right Big Data business strategy.

How can businesses continue to respect privacy concerns while still permitting the use of big data to drive business value? 'Companies will now have an even greater obligation to protect the personal information entrusted to them, no matter how it's processed' Big data use is expected to grow exponentially in the next few years now… The post Big Data vs. Privacy: A Balancing Act appeared first on Predictive Analytics Times.

Whether you suffer from a diagnosed anxiety disorder or not, many of us who are responsible for deployments become uneasy when deploying code to production. Did my tests catch everything? What if something happens during a migration and I can't rollback? Will that small code change create an unstoppable reaction destroying everything in my database?

Digital marketing is more than just an easier, cheaper alternative to big, high budget ad campaigns.

Forging a viable business technology strategy for today's global networked economy is a high priority for most forward-thinking CEOs across the globe. Their guidance to CIOs is to create the fusion between existing IT infrastructure and modern cloud services. Moreover, the shift to a Hybrid IT model must support the organization's key commercial expansion objectives. The savvy leaders who have a superior approach can extract greater value from their legacy IT investments, and launch new initiatives based upon public and private cloud computing services.

Hernan Vivani is a Hadoop Systems Engineer for Amazon Web Services. When you launch a cluster, Amazon EMR lets you choose applications that will run on your cluster. But what if you want to deploy your own custom application? This post shows you how to build a custom application for EMR for Apache Bigtop-based releases 4.x and greater. EMR nodes are based on the Amazon Linux AMI, so I will deploy on RPM packages and use Elasticsearch as the example application. What is Apache Bigtop?

This article is the second installment in a three-part series that covers one of the most critical issues facing the financial industry: Investor & Market Integrity Protection via Global Market Surveillance. While the first (and previous) post discussed the global scope of the problem across multiple global jurisdictions, this post will discuss a candidate Big Data & Cloud… The post Ensuring Market Integrity & Investor Protection via Trade Surveillance – Part 2 of 3 appeared first on Hortonworks.

There's plenty of lip service paid today to the importance of customer service, but a new study suggests brands are failing miserably at delivering it. Social media, it turns out, isn't making things any better. In any given year, more than 80 percent of consumers try to reach a brand, and for most of them, it's an exercise in frustration, according to new data from The Northridge Group. Fifty-five percent say they need to use two or more communication channels to contact a company or brand before an issue is resolved.

News this morning from Google, which is announcing some price changes for some of its cloud computing products. For a little background here, Google is widely regarded as being in third place in the public cloud infrastructure wars, trailing a long way back from Amazon Web Services (AWS), the front-runner, and Microsoft Azure in second place. Indeed, the most recent Gartner analysis about the public cloud space had a few surprises (in particular the absolutely abysmal scores for both IBM and Oracle) — one of the surprises was that Google has lost a little ground to both Microsoft and AWS, with the former seemingly moving closer to AWS over the past year.

In previous videos, we've talked about how we harvest and curate web data. In this short, two-minute video, we explain the final step in the process, developing insights. The develop insights stage looks different for each individual customer. Learn why and what options are available in the below video. Develop Insights: Data Visualization Data visualizations…

Looking for Relevance: I am sure that in the heart of most governance, risk, and compliance (GRC) professionals is a quest for relevance. Most senior GRC professionals have at least some reporting requirement to the C-Suite or the board or both. But the struggle is, "What do I say and how do I link it…

By Anusua Trivedi, Microsoft Data Scientist. Background and Approach: This blog series is based on my upcoming talk on re-usability of Deep Learning Models at the Hadoop+Strata World Conference in…

The latest Ponemon study suggests the number and depth of attacks have significantly increased.

Advances in memory technologies and event-driven architectures based on microservices have the potential to radically transform enterprise computing.

According to Forrester, the level of analytic satisfaction within organizations is on the decline. The traditional multiple-step, multi-tool legacy approach is a slow, time-consuming and, in most cases, costly process that prevents organizations from making faster decisions with confidence. Data analysts today need an agile solution that empowers them to take charge of the entire analytics process.

MapR Technologies, a startup in the open source big data space, recently raised $50 million and may be gearing up for an IPO, which points to growth in the market.

Learn how the performance advantages of the Apache Commons Crypto cryptographic library will provide an upgrade for Spark shuffle encryption over the current approach. When running a big data computing job, the data being processed may contain sensitive information that users don't want anyone else to access. Encrypting that sensitive data is becoming more and more important, especially for enterprise users. For Apache Spark, which is the emerging standard for big data processing,… The post Securing Apache Spark Shuffle using Apache Commons Crypto appeared first on Cloudera Engineering Blog.

Bookings increase by 100%

Streaming Analytics is a data processing paradigm which is gaining much traction lately, mainly because more and more data is available as events through web services and real-time sources rather than being collected and packaged in data batches.
