Jill Dyche on 2012

In part 3 of the series of predictions for 2012, here is Jill Dyche, Baseline Consulting/DataFlux.

Part 2 was Timo Elliott, SAP, at http://www.decisionstats.com/timo-elliott-on-2012/ and Part 1 was Jim Kobielus, Forrester, at http://www.decisionstats.com/jim-kobielus-on-2012/

Ajay: What are the top trends you saw happening in 2011?

 

Well, I hate to say I saw them coming, but I did. A lot of managers committed some pretty predictable mistakes in 2011. Here are a few we witnessed in 2011 live and up close:

 

1. In the spirit of “size matters,” data warehouse teams continued to trumpet the volumes of stored data on their enterprise data warehouses. But a peek under the covers of these warehouses reveals that the data isn’t integrated. Essentially this means a variety of heterogeneous virtual data marts co-located on a single server. Neat. Big. Maybe even worthy of a magazine article about how many petabytes you’ve got. But it’s not efficient, and hardly the example of data standardization and re-use that everyone expects from analytical platforms these days.

 

2. Development teams still didn’t factor data integration and provisioning into their project plans in 2011. So we saw multiple projects spawn duplicate efforts around data profiling, cleansing, and standardization, not to mention conflicting policies and business rules for the same information. Bummer, since IT managers should know better by now. The problem is that no one owns the problem. Which brings me to the next mistake…

 

3. No one’s accountable for data governance. Yeah, there’s a council. And they meet. And they talk. Sometimes there’s lunch. And then nothing happens because no one’s really rewarded (or penalized, for that matter) on data quality improvements or new policies. And so the reports spewing from the data mart are still fraught and no one trusts the resulting decisions.

 

But all is not lost since we’re seeing some encouraging signs already in 2012. And yes, I’d classify some of them as bona-fide trends.

 

Ajay: What are some of those trends?

 

Job descriptions for data stewards, data architects, Chief Data Officers, and other information-enabling roles are becoming crisper, and the KPIs for these roles are becoming more specific. Data management organizations are being divorced from specific lines of business and from IT, becoming specialty organizations—okay, COEs if you must—in their own rights. The value proposition for master data management now includes not just the reconciliation of heterogeneous data elements but the support of key business strategies. And C-level executives are holding the data people accountable for improving speed to market and driving down costs—not just delivering cleaner data. In short, data is becoming a business enabler. Which, I have to just say editorially, is better late than never!

 

Ajay: Anything surprise you, Jill?

 

I have to say that Obama mentioning data management in his State of the Union speech was an unexpected but pretty powerful endorsement of the importance of information in both the private and public sector.

 

I’m also sort of surprised that data governance isn’t being driven more frequently by the need for internal and external privacy policies. Our clients are constantly asking us how to tightly couple privacy policies into their applications and data sources. The need to protect PCI data and other highly sensitive data elements has made executives twitchy. But they’re still not linking that need to data governance.

 

I should also mention that I’ve been impressed with the people who call me who’ve had their “aha!” moment and realize that data transcends analytic systems. It’s operational, it’s pervasive, and it’s dynamic. I figured this epiphany would happen in a few years once data quality tools became a commodity (they’re far from it). But it’s happening now. And that’s good for all types of businesses.

 

About-

Jill Dyché has written three books and numerous articles on the business value of information technology. She advises clients and executive teams on leveraging technology and information to enable strategic business initiatives. Last year her company Baseline Consulting was acquired by DataFlux Corporation, where she is currently Vice President of Thought Leadership. Find her blog posts on www.dataroundtable.com.

Interview Scott Gidley CTO and Founder, DataFlux

Here is an interview with Scott Gidley, CTO and co-founder of leading data quality company DataFlux. DataFlux is a part of SAS Institute and in 2011 acquired Baseline Consulting, besides launching the latest version of its Master Data Management product.

Short Interview Jill Dyche

Here is a brief one-question interview with Jill Dyche, founder of Baseline Consulting.

 

In 2010, it was more about consciousness-raising in the executive suite: getting C-level managers to understand the ongoing value proposition of BI, why MDM isn’t their father’s database, and how data governance can pay for itself over time. Some companies succeeded with these consciousness-raising efforts. Some didn’t.

 

But three big ones in 2011 would be:

  1. Predictive analytics in the cloud. The technology is now ready, and so is the market—and that includes SMB companies.
  2. Enterprise search being baked into (commoditized) BI software tools. (The proliferation of static reports is SO 2006!)
  3. Data governance will begin paying dividends. Until now it was all about common policies for data. In 2011, it will be about ROI.

I do a “Predictions for the coming year” article every January for TDWI,

Note- Jill’s January TDWI article seems worth waiting for in this case.

About-

Source-http://www.baseline-consulting.com/pages/page.asp?page_id=49125

Partner and Co-Founder

Jill Dyché is a partner and co-founder of Baseline Consulting.  She is responsible for key client strategies and market analysis in the areas of data governance, business intelligence, master data management, and customer relationship management. 

Jill counsels boards of directors on the strategic importance of their information investments.

Author

Jill is the author of three books on the business value of IT. Jill’s first book, e-Data (Addison Wesley, 2000) has been published in eight languages. She is a contributor to Impossible Data Warehouse Situations: Solutions from the Experts (Addison Wesley, 2002), and her book, The CRM Handbook (Addison Wesley, 2002), is the bestseller on the topic. 

Jill’s work has been featured in major publications such as Computerworld, Information Week, CIO Magazine, the Wall Street Journal, the Chicago Tribune and Newsweek.com. Jill’s latest book, Customer Data Integration (John Wiley and Sons, 2006) was co-authored with Baseline partner Evan Levy, and shows the business breakthroughs achieved with integrated customer data.

Industry Expert

Jill is a featured speaker at industry conferences, university programs, and vendor events. She serves as a judge for several IT best practice awards. She is a member of the Society of Information Management and Women in Technology, a faculty member of TDWI, and serves as a co-chair for the MDM Insight conference. Jill is a columnist for DM Review, and a blogger for BeyeNETWORK and Baseline Consulting.

 

Interview Dylan Jones DataQualityPro.com

Here is an interview with Dylan Jones, the founder/editor of Dataqualitypro.com, the site to go to for anything related to Data Quality discussions. Dylan is a great, charming person and in this interview talks candidly about his views.

Ajay: Describe your career in science and in business intelligence. How would you convince young students to take more maths and science courses for scientific careers?

Dylan: My main education for the profession was a degree in Information Technology and Software Development. No surprises what my first job entailed – software development for an IT company!

That role took me straight into the trials and tribulations of business intelligence and data quality. After a couple of years I went freelance and have pretty much worked for myself ever since. There has been a constant thread of data quality, business intelligence and data migration throughout my career which culminated in me setting up the more recent social media initiatives to try and pull professionals together in this space.

In all honesty, I’m probably the worst person to give career advice Ajay as I’m a hopeless dreamer. I’ve never really structured my career. I fell into data quality early on and it has led me to work in some wonderful places and with some great people, largely by accident and fate.

I have a simple philosophy, do what you love doing. I’m incredibly lucky to wake up every day with an absolute passion for what I do. In the past, whenever I have found myself working in a situation that I find soul destroying (and in our profession that can happen regularly) I move on to something new.

So, my advice for people starting out would be to first question what makes them happy in life. Don’t simply follow the herd. The internet has totally transformed the rules of the game in terms of finding an outlet for your skills so follow your heart, not conventional wisdom.

That said, I think there are some core skills that will always provide a springboard. Maths is obviously one of those skills that can open many doors but I would also advise people to learn about marketing, sales and other business fundamentals. From a business intelligence perspective it really adds an attractive dimension to your skills if you can link technical ability with a deeper understanding of how businesses operate.

Ajay- You are a top expert and publisher on BI topics. Tell us something about

a) http://www.datamigrationpro.com/

b) http://www.dataqualitypro.com/

c) Involvement with the DataFlux community of experts

d) Your latest venture http://www.dqvote.com

Dylan- Data Migration Pro was my first foray into the social media space. I realised that very few people were talking about the challenges and techniques of data migration. On average, large organisations implement around 4 migration projects a year and most end in failure. A lot of this is due to a lack of awareness. Having worked for so long in this space I felt it was time to create a social media site to bring the wider community together. So we now have forums, regular articles, tools and techniques on the site with about 1400 members worldwide plus lots of plans in the pipeline for 2010.

Data Quality Pro followed on from the success of Data Migration Pro and our speed of growth really demonstrates how important data quality is right now. Again, awareness of the basic techniques and best-practices is key. I think many organisations are really starting to recognise the importance of better data quality management practices so a lot of our focus is on giving people practical advice and tools to get started. We are a community publishing platform, I do write regularly but we’ve always had a significant community contribution from expert practitioners and authors.

I didn’t just want to take a corporate viewpoint with these communities. As a result they are very much focused on the individual. That is why we post so many features on how to promote your skills, search for work, gain personal skills and generally get ahead in the profession. Data Quality Pro has just under 2,000 members and about 6,000 regular visitors a month so it demonstrates just how many people are really committed to learning about this discipline as it impacts practically every part of the business. I also think it is an excellent career choice as so many projects are dependent on good quality data there will always be demand.

The DataFlux community of experts is a great resource that I’ve actually admired for some time. I am a big fan of Jill Dyche who used to write on the community and of course there is a great line-up on there now with experts like David Loshin, Joyce Norris-Montanari and Mike Ferguson so I was delighted to be invited to participate. DataFlux have sponsored our sites from the very beginning and without their support we wouldn’t have grown to our current size. So although I’m vendor independent, it’s great to be sharing my thoughts and ideas with people who visit their site.

DQVote.com is a relatively new initiative. I noticed that there was some great data quality content being linked through platforms like Twitter but it would essentially become hard to find after several days. Also, there was no way for the community to vote on what content they found especially useful. DQVote.com allows people to promote their own content but also to vote and share other useful data quality articles, blogs, presentations, videos, tutorials – anything that adds value to the data quality community. It is also a great springboard for emerging data quality bloggers and publishers of useful content.

Ajay- Do you think BI projects can be more successful if we reward data entry people, or at least pay more for better quality data rather than ask them to fill in database tables as fast as they can? Especially in offshore call centres.

Dylan- Data entry is a pet frustration of mine. I regularly visit companies who are investing hundreds of thousands of pounds in data quality technology and consultants but nothing in grass-roots education and cultural change. They would rather create cleansing factories than resolve the issues at source.

So, yes I completely agree, the reward system has to change. I personally suffer from this all the time – call centre staff record incorrect or incomplete information about my service or account and it leads to billing errors, service problems, annoyance and eventually lost business. Call centre staff are not to blame, they are simply rewarded on the volume of customer service calls they can make, they are not encouraged to enter good quality data. The fault ultimately lies with the corporations that use these services and I don’t think offshore or onshore makes a difference. I’ve witnessed terrible data quality in-house also. The key is to have service level agreements on what quality of data is acceptable. I also think a reward structure as opposed to a penalty structure can be a much more progressive way of improving the quality of call-centre data.

Ajay- What are the top 5 things that summarize your views on Business Intelligence? Assume you are speaking to a class of freshmen statisticians.

Dylan- Business intelligence is wholly dependent on data quality. Accessibility, timeliness, accuracy, completeness, duplication – data quality dimensions like these can dramatically change the value of business intelligence to the organisation. Take nothing for granted with data, assume nothing. I have never, ever, assessed a dataset in a large business that did not have some serious data defects that were impacting decision making.

As statisticians, they therefore possess the tools to help organisations discover and measure these defects. They can find ways to continuously improve and ensure that future decisions are based on reliable data.
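As a rough illustration of measuring two of the dimensions mentioned above (completeness and duplication), here is a minimal Python sketch. The `profile` function, the field names, and the sample records are all hypothetical, invented for this example; a real assessment would cover more dimensions and use fuzzier duplicate matching than exact row equality.

```python
from collections import Counter


def profile(records, fields):
    """Return per-field completeness and the number of exact duplicate rows.

    Hypothetical helper: a value counts as "filled" when it is not None
    and is not an empty or whitespace-only string.
    """
    total = len(records)
    completeness = {}
    for field in fields:
        filled = 0
        for record in records:
            value = record.get(field)
            if value is not None and str(value).strip() != "":
                filled += 1
        completeness[field] = filled / total if total else 0.0

    # Count rows that repeat an earlier row exactly on the chosen fields.
    row_counts = Counter(tuple(r.get(f) for f in fields) for r in records)
    duplicate_rows = sum(n - 1 for n in row_counts.values() if n > 1)

    return {
        "rows": total,
        "completeness": completeness,
        "duplicate_rows": duplicate_rows,
    }


# Made-up sample data, for illustration only.
customers = [
    {"name": "Ann Lee", "email": "ann@example.com", "phone": "555-0100"},
    {"name": "Ann Lee", "email": "ann@example.com", "phone": "555-0100"},
    {"name": "Bo Chen", "email": "", "phone": None},
    {"name": "Cy Diaz", "email": "cy@example.com", "phone": ""},
]

report = profile(customers, ["name", "email", "phone"])
print(report)
```

On this sample, the email field is 75% complete, the phone field is 50% complete, and one row is an exact duplicate. Tracking numbers like these over time is one simple way to see whether data quality is improving.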

I would also add that business intelligence is not just about technology, it is about interpreting data to determine trends that will enable a company to improve their competitive advantage. Statistics are important but freshmen must also understand how organisations really create value for their customers.

My advice is to therefore step away from the tools and learn how the business operates on the ground. Really listen to workers and customers as they can bring the data to life. You will be able to create far more accurate dashboards and reports of where the issues and opportunities lie within a business if you immerse yourself with the people who create the data and the senior management who depend on the quality of your business intelligence platforms.

Ajay- Which software have you personally coded or implemented. Which one did you like the best and why?

Dylan- I’ve used most of the BI and DQ tools out there, all have strengths and weaknesses so it is very subjective. I have my favourites but I try to remain vendor neutral so I’ll have to gracefully decline on this one Ajay!

However, I did build a data profiling and data quality assessment tool several years ago. To be honest, that is the tool I like best because it had a range of features I still haven’t seen implemented so far in any other tools. If I ever get the chance, and if no other vendor comes up with the same concept, I may yet take it to market. For now though, two young kids, two communities and a 12 hour day mean it is something of a pipe dream.

Ajay- What does Dylan Jones do when not helping the world’s data quality get better?

Dylan- I’ve recently had another baby boy so kids take up most of whatever free time I have left. When we do get a break though I like to head to my home town and just hang out on the beach or go up into the mountains. I love travelling and as I effectively work completely online now, we’re really trying to figure out a way of combining travel and work.

Biography-

Dylan Jones is the founder and editor of Data Quality Pro and Data Migration Pro, the leading online expert community resources. Since the early nineties he has been helping large organisations tackle major information management challenges. He now devotes his time to fostering greater awareness, community and education in the fields of data quality and data migration via the use of social media channels. Dylan can be contacted via his profile page at http://www.dataqualitypro.com/data-quality-dylan-jones/ or at http://www.twitter.com/dataqualitypro

Interview Jim Harris Data Quality Expert OCDQ Blog

Here is an interview with one of the chief evangelists for data quality in the field of Business Intelligence, Jim Harris, who has a renowned blog at http://www.ocdqblog.com/. I asked Jim about his experiences of data quality messing up big-budget BI projects, and for some tips and methodologies to avoid such problems.

No one likes to feel blamed for causing or failing to fix the data quality problems- Jim Harris, Data Quality Expert.


Ajay- Why the name OCDQ? What drives your passion for data quality? Name any anecdotes where bad data quality really messed up a big BI project.

Jim Harris - Ever since I was a child, I have had an obsessive-compulsive personality. If you asked my professional colleagues to describe my work ethic, many would immediately respond: “Jim is obsessive-compulsive about data quality…but in a good way!” Therefore, when evaluating the short list of what to name my blog, it was not surprising to anyone that Obsessive-Compulsive Data Quality (OCDQ) was what I chose.

On a project for a financial services company, a critical data source was applications received by mail or phone for a variety of insurance products. These applications were manually entered by data entry clerks. Social security number was a required field and the data entry application had been designed to only allow valid values. Therefore, no one was concerned about the data quality of this field – it had to be populated and only valid values were accepted.

When a report was generated to estimate how many customers were interested in multiple insurance products by looking at the count of applications per social security number, it appeared as if a small number of customers were interested in not only every insurance product the company offered, but also thousands of policies within the same product type. More confusion was introduced when the report added the customer name field, which showed that this small number of highly interested customers had hundreds of different names. The problem was finally traced back to data entry.

Many insurance applications were received without a social security number. The data entry clerks were compensated, in part, based on the number of applications they entered per hour. In order to process the incomplete applications, the data entry clerks entered their own social security number.
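The symptom in this story (one identifier attached to many different names) is the kind of thing a simple profiling check can surface. The following is a minimal Python sketch, not the method used on the project: the `suspicious_ids` function, the `max_names` threshold, and the sample records are hypothetical, and the identifiers are deliberately fake.

```python
from collections import defaultdict


def suspicious_ids(applications, max_names=2):
    """Map each identifier to its count of distinct names, keeping only
    identifiers that appear with more than `max_names` different names.

    Hypothetical helper: names are compared after trimming whitespace
    and lowercasing, which is a crude form of normalization.
    """
    names_by_id = defaultdict(set)
    for app in applications:
        names_by_id[app["ssn"]].add(app["name"].strip().lower())
    return {
        ssn: len(names)
        for ssn, names in names_by_id.items()
        if len(names) > max_names
    }


# Made-up sample data with deliberately invalid identifiers.
applications = [
    {"ssn": "000-00-0001", "name": "Ann Lee"},
    {"ssn": "000-00-0001", "name": "Bo Chen"},
    {"ssn": "000-00-0001", "name": "Cy Diaz"},
    {"ssn": "000-00-0001", "name": "Di Evans"},
    {"ssn": "000-00-0002", "name": "Ed Fox"},
    {"ssn": "000-00-0002", "name": "ed fox "},
]

flagged = suspicious_ids(applications, max_names=2)
print(flagged)
```

Here the first identifier is flagged because it appears under four different names, while the second is not, since both of its records normalize to the same name. A field can pass every validity rule at entry time and still fail a check like this one.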

On a project for a telecommunications company, multiple data sources were being consolidated into a new billing system. Concerns about postal address quality required the use of validation software to cleanse the billing address. No one was concerned about the telephone number field – after all, how could a telecommunications company have a data quality problem with telephone number?

However, when reports were run against the new billing system, a high percentage of records had a missing telephone number. The problem was that many of the data sources originated from legacy systems that only recently added a telephone number field. Previously, the telephone number was entered into the last line of the billing address.

New records entered into these legacy systems did start using the telephone number field, but the older records already in the system were not updated. During the consolidation process, the telephone number field was mapped directly from source to target and the postal validation software deleted the telephone number from the cleansed billing address.
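One way to avoid this kind of loss is to check the last address line for a telephone number before the address is cleansed. The sketch below is a Python illustration under stated assumptions: it assumes 10-digit North American style numbers, and the `split_phone_from_address` function, the `PHONE_PATTERN` regular expression, and the sample address are hypothetical rather than taken from the project described.

```python
import re

# Assumption: 10-digit North American style numbers such as
# "(555) 010-0199", "555-010-0199" or "5550100199".
PHONE_PATTERN = re.compile(r"^\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}$")


def split_phone_from_address(address_lines, phone=None):
    """Return (address_lines, phone), moving a phone number out of the
    last address line when the dedicated phone field is empty.

    Hypothetical helper: blank lines are dropped, and the last line is
    only treated as a phone number if the whole line matches the pattern.
    """
    lines = [line.strip() for line in address_lines if line and line.strip()]
    if not phone and lines and PHONE_PATTERN.match(lines[-1]):
        phone = lines.pop()
    return lines, phone


# Made-up legacy record: the phone number sits in the last address line.
legacy_address = ["12 Main St", "Springfield, IL 62701", "(555) 010-0199"]
clean_lines, recovered_phone = split_phone_from_address(legacy_address)
print(clean_lines, recovered_phone)
```

Running a step like this ahead of postal validation would carry the number into the phone field instead of letting the validation software discard it. Records that already have a phone value are left unchanged.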

Ajay- Data Quality – Garbage in, Garbage out for a project. What percentage of a BI project do you think gets allocated to input data quality? What percentage of final output is affected by the normalized errors?

Jim Harris- I know that Gartner has reported that 25% of critical data within large businesses is somehow inaccurate or incomplete and that 50% of implementations fail due to a lack of attention to data quality issues.

The most common reason is that people doubt that data quality problems could be prevalent in their systems. This “data denial” is not necessarily a matter of blissful ignorance, but is often a natural self-defense mechanism from the data owners on the business side and/or the application owners on the technical side.

No one likes to feel blamed for causing or failing to fix the data quality problems.

All projects should allocate time and resources for performing a data quality assessment, which provides a much needed reality check for the perceptions and assumptions about the quality of the data. A data quality assessment can help with many tasks including verifying metadata, preparing meaningful questions for subject matter experts, understanding how data is being used, and most importantly – evaluating the ROI of data quality improvements. Building data quality monitoring functionality into the applications that support business processes provides the ability to measure the effect that poor data quality can have on decision-critical information.

Ajay- Companies talk of paradigms like Kaizen, Six Sigma and LEAN for eliminating waste and defects. What technique would you recommend for a company just about to start a major BI project for a standard ETL and reporting project to keep data aligned and clean?

Jim Harris- I am a big advocate for methodology and best practices and the paradigms you mentioned do provide excellent frameworks that can be helpful. However, I freely admit that I have never been formally trained or certified in any of them. I have worked on projects where they have been attempted and have seen varying degrees of success in their implementation. Six Sigma is the one that I am most familiar with, especially the DMAIC framework.

However, a general problem that I have with most frameworks is their tendency to adopt a one-size-fits-all strategy, which I believe is an approach that is doomed to fail. Any implemented framework must be customized to adapt to an organization’s unique culture. In part, this is necessary because implementing changes of any kind will be met with initial resistance, but an attempt at forcing a one-size-fits-all approach almost sends a message to the organization that everything they are currently doing is wrong, which will of course only increase the resistance to change.

Starting with a framework as a reference provides best practices and recommended options of what has worked for other organizations. The framework should be reviewed to determine what can best be learned from it and to select what will work in the current environment and what simply won’t. This doesn’t mean that the selected components of the framework will be implemented simultaneously. All change comes gradually and the selected components will most likely be implemented in phases.

Fundamentally, all change starts with changing people’s minds. And to do that effectively, the starting point has to be improving communication and encouraging open dialogue. This means more of listening to what people throughout the organization have to say and less of just telling them what to do. Keeping data aligned and clean requires getting people aligned and communicating.

Ajay- What methods and habits would you recommend to young analysts starting in the BI field for a quality checklist?

Jim Harris- I always make two recommendations.

First, never make assumptions about the data. I don’t care how well the business requirements document is written or how pretty the data model looks or how narrowly your particular role on the project has been defined. There is simply no substitute for looking at the data.

Second, don’t be afraid to ask questions or admit when you don’t know the answers. The only difference between a young analyst just starting out and an expert is that the expert has already made and learned from all the mistakes caused by being afraid to ask questions or admitting when you don’t know the answers.

Ajay- What does Jim Harris do to have quality time when not at work?

Jim- Since I enjoy what I do for a living so much, it sometimes seems impossible to disengage from work and make quality time for myself. I have also become hopelessly addicted to social media and spend far too much time on Twitter and Facebook. I have also always spent too much of my free time watching television and movies. I do try to read as much as I can, but I have so many stacks of unread books in my house that I could probably open my own book store. True quality time typically requires the elimination of all technology by going walking, hiking or mountain biking. I do bring my mobile phone in case of emergencies, but I turn it off before I leave.

Biography-

Jim Harris is the Blogger-in-Chief here at Obsessive-Compulsive Data Quality (OCDQ), which is an independent blog offering a vendor-neutral perspective on data quality.

He is an independent consultant, speaker, writer and blogger with over 15 years of professional services and application development experience in data quality (DQ) and business intelligence (BI).

Jim has worked with Global 500 companies in finance, brokerage, banking, insurance, healthcare, pharmaceuticals, manufacturing, retail, telecommunications, and utilities. Jim also has a long history with the product that is now known as IBM InfoSphere QualityStage. Additionally, he has some experience with Informatica Data Quality and DataFlux dfPower Studio.

Jim can be followed at twitter.com/ocdqblog and contacted at http://www.ocdqblog.com/contact/