Tuesday, October 22, 2013

Quality is a side-effect not the goal in MDM

I put out a tweet the other day with this title and I think it's worth elaborating on what I mean.  Lots of MDM efforts I see have the goal of 'improve data quality' and this is a mistake.  I'm not saying that data quality isn't a good thing, but that in itself it's not actually a goal.

What do I mean?  Well, let's take an analogy or three.  If you are looking to buy a diamond, do you buy the very best and the very biggest?  If you are looking for diamonds to use in cutting or industrial grinding then the answer is of course 'no': that level of quality wouldn't be appropriate for those uses and would be a waste of money.  What if you are looking to put a 1.6 litre engine in a car aimed at the local commuter market?  Do you look around for the most powerful, most expensive engine, built to the highest quality standards?  Well, that would probably be one of the engines slated to go into a Formula 1 car next season.  Sure, it generates huge power and is a quality engine, but it's not fit for purpose.

Now for the final analogy.  You are looking to provide translation services for the Iranian nuclear discussions.  Should you go and get the cheapest price from someone who promises they can speak 'Iranian', or do you invest in someone who is actually proven as a translator for Persian, Gilaki and Mazandarani and describes their Kurdish as 'passable'?

The point here is that in each case the goal defines the level of quality required.  Quality in itself is about having an acceptable level of quality to meet your goal, which on some occasions might be very little indeed.

So what is the real goal of MDM?  It's about enabling business collaboration and communication.  The power of MDM is really in the cross-reference, the bit that means you know the customer in one division is the same as the customer in another and that the product they are buying is the same in two different countries.  If the quality is awful but the cross-reference works, then on many occasions you don't need to invest more in quality unless there is a business reason to do so.  Most of the time that business reason is that you cannot achieve the collaboration without having a decent level of quality.  To match customers across business areas requires you to have a standard definition, so your customer on-boarding needs a certain level of rigour and your product definition needs to work to standards that are agreed across the business.
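To make the cross-reference idea concrete, here is a minimal sketch in Python.  Everything in it is an assumption for illustration: the `CrossReference` class, the system names (`retail_crm`, `wholesale_erp`) and the master ids are hypothetical and not taken from any particular MDM product.  It only shows the core idea, that local records in different systems point at one shared master identity.

```python
from collections import defaultdict


class CrossReference:
    """Hypothetical cross-reference: maps (source system, local id) to a master id."""

    def __init__(self):
        self._to_master = {}               # (system, local_id) -> master_id
        self._members = defaultdict(set)   # master_id -> {(system, local_id), ...}

    def link(self, system, local_id, master_id):
        """Record that a local record represents the given master entity."""
        key = (system, local_id)
        self._to_master[key] = master_id
        self._members[master_id].add(key)

    def same_entity(self, a, b):
        """True only if both local records are linked to the same master id."""
        master_a = self._to_master.get(a)
        return master_a is not None and master_a == self._to_master.get(b)

    def members(self, master_id):
        """All local records known for one master entity, in sorted order."""
        return sorted(self._members.get(master_id, ()))


# Illustrative data: two divisions hold the same customer under different local ids.
xref = CrossReference()
xref.link("retail_crm", "C-1001", "M-1")
xref.link("wholesale_erp", "98765", "M-1")
xref.link("retail_crm", "C-2002", "M-2")
```

Note that nothing here says anything about how clean the underlying records are.  The cross-reference can be useful even when the attribute data is poor, which is the point being made above.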

So in focusing on the collaboration, in focusing on where the business wants to collaborate, you focus MDM and you focus where quality needs to be achieved.  Focusing on quality as a goal is a very IT-centric thing; focusing on collaboration, and through that enabling quality, is a business thing.

And MDM is certainly a business thing. 

Friday, October 11, 2013

Single Canonical Fail

There are few things out there in IT more delusional than the Single Canonical Form, the idea that IT can define a super schema, a schema so complete, so pure that all will bow down before it.  Sheer idolatry.  Whether it is for integration or for Data Warehousing, the reality is that a single schema is never going to be ‘canonical’.  Different people have different perspectives, and it's this very contention between business areas that actually drives the business forwards.  Sales obeys the rules from Finance in certain areas and in others is in open rebellion, as the KPIs for Sales compete against the need for regulation in Finance.  To forecast correctly means rigor and repeatability, but anyone who phones up Sales with an open checkbook is going to find their order fulfilled despite the claims of a sales process.

At the heart of a Single Canonical Form is a simple premise: ‘as long as everyone can agree’.  It’s the sort of premise that is wonderfully naïve in its inception.  The reality is sadly that such a simplistic view ignores local perceptions and attempts to force a straitjacket upon the business by imposing a single, almost Stalinistic, view upon it.  By starting with that beguiling premise IT sets out on a journey that can only end in failure.  The Sales, Finance and Operations teams all have local KPIs, different divisions and regions have different strategies, and all may have a different view on how they sell and whom they sell to.  This does not mean the business is dysfunctional or wrong; it simply means that the business is complex and not constrained within a single view of what should be done.

The Single Canonical Form aims to achieve the unobtainable and by doing so creates its own downfall.  Because it doesn’t meet the objectives of everybody, individuals are forced to create their own local solutions, as the agility of the single canonical form is relatively, or indeed astronomically, low.  The goal of a single canonical form is to create a single view on one of the most variable things in a business: the view on information.  One part of the business may require only 10 pieces of information about a customer, another 200.  Neither is right or wrong; it is simply their own local information.  The critical element is that an individual customer be recognized across the two, not that 210 attributes be agreed.  The same goes for invoices, orders, contacts and everything else: agree when it matters, don’t bother when it doesn’t.
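As a rough illustration of ‘agree when it matters’, the sketch below keeps two deliberately different local customer records and agrees on exactly one thing: a shared identifier.  All names and values (`master_id`, `views_of`, the Sales and Finance fields) are hypothetical and chosen only for this example.

```python
# Hypothetical local stores: each business area keeps whatever attributes it needs.
sales_customers = {
    "S-17": {"master_id": "M-1", "name": "Acme Ltd", "account_manager": "J. Smith"},
}
finance_customers = {
    "F-903": {
        "master_id": "M-1",
        "legal_name": "Acme Limited",
        "credit_limit": 50000,
        "payment_terms_days": 30,
    },
}


def views_of(master_id, **local_stores):
    """Collect each area's local record for one customer, leaving attributes untouched."""
    found = {}
    for area, store in local_stores.items():
        for local_id, record in store.items():
            if record["master_id"] == master_id:
                found[area] = (local_id, record)
    return found


# The only agreed element is master_id; no shared schema is imposed on either area.
views = views_of("M-1", sales=sales_customers, finance=finance_customers)
```

Sales never has to carry a credit limit and Finance never has to carry an account manager, yet both can be shown to be talking about the same customer.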


The Single Canonical Form is a straitjacket on the enterprise; it’s a dumb idea based on an unachievable premise.  It's time for IT to grow up and work differently.

Thursday, October 10, 2013

Speaking in Public isn't private - and the internet is a public space

With all the scandal around Edward Snowden I have to say I'm mostly in the camp of 'surely everyone knew that spying agencies spied on people?', but the most surprising part is the 'scandal' that they might be listening to our communications over the internet.

The internet?  The one created and originally funded by DARPA?  The open internet with the IP protocol that means packets are by default openly routed and unencrypted?

Or do you mean SMTP with its unencrypted openly routed emails?

Or HTTP with its unencrypted data?  Even HTTPS only encrypts the data being exchanged; anyone watching can still openly find out which site someone was visiting from the IP address, just not the content.

Seriously, did anyone actually think that people are not watching this stuff?  When did this become a surprise?  I knew that 30 years ago people were spying on this.  Didn't we all have a .sig back then that said 'Hello to my friends in Cheltenham and Langley'?

Agencies have spied on people, even allies, in fact maybe ESPECIALLY allies, for hundreds of years.  It's the whole point of funding spying agencies, and the internet just makes that spying easier.  It's a public conversation that you are having on Twitter, Facebook or over email.  You might think it's private but it's really just shouting out of a window, and you shouldn't be surprised if public speech is being tracked.

This isn't a question of 'only the innocent have nothing to fear'.  The question is that by bringing this more into the open we risk it being used beyond its current scope of spies and extended into regular police forces, and it's that which would scare me more.  The whole risk is that spying becomes a mainstream police activity rather than something done by specialist organisations whose primary focus is on genuine national threats, not on someone forgetting to put the garbage bin out on a Tuesday night.

Spying is real, it has been for hundreds of years, and it has sadly got a place in a world of cyber and other terrorist threats.  That people have got upset about the tracking of information over public networks says much more about the lack of understanding of how the Internet works than about a big brother state that is completely new.