How to build an agile data pipeline

Agility and data are two of the most overused buzzwords of the business community – and for good reason.

Every business wants to be agile: responsive to a changing environment, able to survive and thrive. Likewise, forward-thinking businesses are focusing heavily on data as a route to greater insight, creativity and efficiency. Putting the two concepts together might look like buzzword squared, but an agile data pipeline is not a technology to hype. It is a smarter way of working with what enterprises already have, or with skills that are readily acquired.

An agile data pipeline is what data-centric organisations are putting in place to make the best use of their data investments and to ensure that the business can adopt data-led analytical decision-making in a healthy and sustainable way.

As with any business process, building an agile pipeline involves several stages and should involve a range of appropriate stakeholders within the business. That’s not always the case, as many organisations develop their analytics functions in a higgledy-piggledy manner.

It’s no surprise that the data estate of a business can quickly grow out of control. The four Vs of big data, as defined by IBM, are variety, velocity, volume and veracity, and they show that data is no monolithic thing. It’s a living, changing entity. So fluid, in fact, that in 2017 Experian built on this format and added two more Vs: vulnerability and value.

So how do you corral and harness the bucking bronco of data and put it behind the corporate plough, to turn up the nuggets of true insight?

A data catalogue makes storing, finding and using data a much more seamless experience. It’s an organised solution that allows business users to explore data sources and understand them. It saves users time and can stop them recreating data that already exists but that they failed to find in a non-catalogued state. It’s a great resource to keep the analytical process ticking over at speed, without slowing down the work of data scientists or ‘line of business’ analysts.

A faultless data catalogue doesn’t arrive fully formed, and the history of data governance integrations is littered with solutions that have failed to achieve critical adoption within an organisation. To truly deliver on a data catalogue, the business must also focus on the people and the process, not just the technology. Analytic leaders must build a culture that enables users to succeed with data.

Discover together

Data discovery can be fun, but it’s a hygiene factor that analysts need to get through before they can do the job they want to do: analytics, insights, and adding value to the business. Really, the organisation wants to unite all of its data workers with the data and analytic assets they could possibly (but legitimately!) need, in a controlled and secure way. It’s important to take steps to make data both searchable and trackable. A platform will offer this, and even data lineage, giving more visibility for better governance. When data discovery and data security are breathtakingly easy, there’s no room for data governance missteps. It’s a great first step before an enterprise can create a culture of collaboration, sharing, and innovation by extending formerly tribal knowledge across the organisation.
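
To make "searchable and trackable" concrete, here is a minimal Python sketch of a catalogue that supports keyword search and lineage lookup. It is an illustration only, not how any particular platform works: the `CatalogueEntry` and `DataCatalogue` classes, their fields and the sample asset names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogueEntry:
    """One catalogued asset (hypothetical schema, for illustration only)."""
    name: str
    description: str
    owner: str
    tags: set = field(default_factory=set)
    derived_from: list = field(default_factory=list)  # names of upstream assets


class DataCatalogue:
    """A toy in-memory catalogue: register assets, search them, trace lineage."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        self._entries[entry.name] = entry

    def search(self, term):
        """Return names of assets whose name, description or tags mention the term."""
        needle = term.lower()
        return sorted(
            e.name
            for e in self._entries.values()
            if needle in e.name.lower()
            or needle in e.description.lower()
            or any(needle in t.lower() for t in e.tags)
        )

    def lineage(self, name):
        """Return every upstream asset of `name`, nearest first, without repeats."""
        seen = []
        queue = list(self._entries[name].derived_from)
        while queue:
            current = queue.pop(0)
            if current in seen:
                continue
            seen.append(current)
            if current in self._entries:
                queue.extend(self._entries[current].derived_from)
        return seen


# Sample assets (invented for the example).
catalogue = DataCatalogue()
catalogue.register(CatalogueEntry("crm_export", "Raw customer records from the CRM", "IT", {"customer"}))
catalogue.register(CatalogueEntry("sales_orders", "Raw order lines", "IT", {"sales"}))
catalogue.register(CatalogueEntry(
    "customer_revenue", "Revenue per customer", "Finance analytics",
    {"customer", "sales"}, ["crm_export", "sales_orders"],
))
catalogue.register(CatalogueEntry(
    "churn_dashboard", "Dashboard of churn risk", "Marketing",
    {"churn"}, ["customer_revenue"],
))

print(catalogue.search("customer"))
print(catalogue.lineage("churn_dashboard"))
```

Even at this scale the point carries: an analyst who searches before building can see that a revenue dataset already exists, who owns it, and which raw sources it was derived from.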

Culture the data culture

The data catalogue is the starting point for most analytical activities. Searching and finding content, understanding context and gaining trust in the results through community feedback and interaction – it’s a great resource when it’s used correctly, saving time and energy, and greatly aiding productivity.

The success of the catalogue is tied into the success of the organisation. Track and reward the most active contributors who add value to the analytic process, understand the assets that are creating the most impactful results, and promote those users to ensure that information assets are well curated and maintained.

The right data culture is socially engaging. It empowers users to impart and share knowledge, and is backed by technology that supports the different ways users bring their experience together to solve problems. This includes creating and annotating definitions and discussing quality and purpose in conversation threads. Even simple social gestures like sharing a link or giving a 'thumbs up' reinforce the value of the underlying asset and make it richer and easier to find for future users.
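
As a rough illustration of how a 'thumbs up' can make an asset easier to find, and how active contributors can be tracked, consider the Python sketch below. The `FeedbackLog` class, its one-vote-per-user rule and the sample names are assumptions made for the example, not a description of any specific product.

```python
from collections import Counter


class FeedbackLog:
    """Records simple social gestures on catalogue assets (hypothetical design)."""

    def __init__(self):
        self._asset_votes = Counter()    # asset name -> number of endorsements
        self._user_activity = Counter()  # user name -> number of endorsements given
        self._seen = set()               # (user, asset) pairs already counted

    def thumbs_up(self, user, asset):
        """Count one endorsement per user per asset; return False for a repeat."""
        if (user, asset) in self._seen:
            return False
        self._seen.add((user, asset))
        self._asset_votes[asset] += 1
        self._user_activity[user] += 1
        return True

    def rank_assets(self, candidates):
        """Order search results so the most endorsed assets come first."""
        return sorted(candidates, key=lambda a: (-self._asset_votes[a], a))

    def top_contributors(self, n=3):
        """Return the n most active users as (user, count) pairs."""
        ranked = sorted(self._user_activity.items(), key=lambda kv: (-kv[1], kv[0]))
        return ranked[:n]


# Sample activity (invented for the example).
log = FeedbackLog()
log.thumbs_up("asha", "customer_revenue")
log.thumbs_up("ben", "customer_revenue")
log.thumbs_up("asha", "crm_export")
log.thumbs_up("asha", "customer_revenue")  # repeat vote, ignored

print(log.rank_assets(["crm_export", "customer_revenue", "sales_orders"]))
print(log.top_contributors(2))
```

Ties are broken alphabetically so the ordering stays predictable; a real catalogue would likely weigh other signals too, such as usage or recency.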

Collaborate or die!

It might be that, before the organisation became data-focused, others had already collected the same information or performed a similar analysis, but analysts have no good way of finding it. Data assets and the resulting information proliferate, compounding the problem and creating inefficiencies and delays in answering critical business questions.

Taking a cue from social media and wiki techniques, social interactions can help users share and use organisational tribal knowledge easily. Everything in the analytic process should be shareable: data, analytic apps, workflows, macros, visualisations and dashboards. When everything is seamlessly shareable, and it is quick to identify trusted information assets along with their lineage and insights into how they are used, it becomes much simpler to make more impactful business decisions.

One of the most important parts of this is closing the gap not only around finding the right data but also between the roles within an organisation: IT, business analysts, data scientists, everyday ‘citizen data scientists’, and onwards to all who use data. Sharing across an organisation is the grease on the wheels of innovation.

Define the best working practices

From the moment you embark on an analytics project, you stand at base camp with the peak of expectations staring at you from across the chasm of ignorance. Building a social repository of all the organisation's data sources, reports, workflows, terminology, and more (potentially thousands of lifetimes of accumulated knowledge) is as daunting as climbing Mount Everest. So, don't.

Start small, but think big. Tackle smaller challenges to get some early victories and build momentum from there.

• Pick a single department or project. Perhaps start with a handful of critical datasets.

• Document expertise while reports and data sources are being created, before the skills and the knowledge leave the project (or the company!). Ensure that new people can understand the function of dashboards, reports and other datasets.

• Follow your business strategy: document and socialise the assets associated with key strategic projects, and use the catalogue as a means to change the culture towards greater collaboration.

• To ensure adoption, it’s vital that users always find the information up to date. Without timeliness, the catalogue immediately loses trust and credibility and the pipeline starts to leak.

• A business glossary is a critical component of your data strategy. A glossary can take many forms: definitions, concepts, subject areas, etc. It captures the unique language of your organisation in a central location, and then connects that meaning with the contents of the catalogue.

• A proper analytics pipeline lives or dies on whether users find value in the information within. No one in the organisation, not even the BI and IT teams, has a 100 per cent understanding of all those data sources, data sets, reports and other types of assets. This expertise and 'know-how' is in the heads of staff: business teams, analysts, knowledge workers, analytics groups, and more. It's pervasive and waiting to be harnessed.

Trusting data

It’s one thing to have data; it’s another to trust it and use it properly. Famously, executives relied on their experience, their ‘gut’, when making decisions, and sometimes that’s not necessarily a bad idea. Where data is not cleaned, rated and trusted, it might not be worth the time to review. But where the right steps are in place, the data can tell a very honest and trustworthy story. It is a better resource than the thoughts and opinions of an executive who may not have access to all the facts, the long-term trends, or the analytical power to correlate them appropriately.

So, to stock the data pipeline, put in place some simple best practices, encourage your people with good processes and give them the technology that makes this all easy. We’re no longer in the days of needing to know how to code to operate analytical tools, and end-to-end platforms take the sting out of finding, moving, prepping and using data. In fact, stocking the analytics pipeline should be a breezy, exhilarating process, the opening stage in a virtuoso performance by a data maestro.

Nick Jewell, Director of Product Strategy, Alteryx

Image source: Shutterstock/alexskopje
