Tools and techniques - The information portfolio

Just as with applications, information comes in different forms and carries different importance to the organisation. Research has revealed that the two discriminating factors that allow us to construct an alternative portfolio for information are:

The information portfolio lays out these two discriminators as a 2x2 matrix, with examples in each quadrant (Figure 26).

The information portfolio helps us to deal with these challenges and organises the way that we might work around them. The notes that follow relate to the numbered elements of the model - 1, 2, 3, 4 - as presented in Figure 26.

Stage 1: Taking advantage of public information

The simplest first step in working with the portfolio is to recognise and adopt well-structured external schemes of reference data, such as post codes, weather data, GPS positioning data and even travel timetables. It is now routine in the United Kingdom to undertake a web-based transaction using first your house number and then the postcode - the combination of these seven or eight digits and characters identifies your full address immediately, without any ambiguity, and the remainder of the order screen is auto-filled from that raw data.
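The house-number-plus-postcode pattern can be sketched in a few lines. This is a minimal illustration, not a real address service: the address table and its entries are invented, the postcode pattern is a simplification of the official format, and production systems would query a licensed copy of the Royal Mail Postcode Address File instead.

```python
import re

# Hypothetical lookup table keyed on (postcode, house number); real systems
# query a licensed address file rather than a hard-coded dictionary.
ADDRESSES = {
    ("SW1A 1AA", "1"): "1 Example Street, London",  # invented entry for illustration
}

# Simplified pattern for the general shape of a UK postcode (outward part,
# optional space, inward part). The official format has more special cases.
POSTCODE = re.compile(r"^[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}$")

def lookup_address(house_number: str, postcode: str):
    """Return the full address for a house number + postcode, if known."""
    postcode = postcode.strip().upper()
    if not POSTCODE.match(postcode):
        return None  # malformed postcode: no unambiguous match is possible
    # Normalise to the conventional single-space form before keying.
    postcode = postcode.replace(" ", "")
    postcode = postcode[:-3] + " " + postcode[-3:]
    return ADDRESSES.get((postcode, house_number.strip()))
```

The point of the sketch is the shape of the transaction: two short user inputs resolve, without ambiguity, to one full address record.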

Administrations elsewhere in the world are not always so advanced, and their postcoding schemes are not always rigorous enough for the data to be reliable. In South Africa a new road tolling scheme (at the time of writing) relies on the existing national database of registered road vehicles in order to track down drivers or vehicle owners who have used the roads and not paid. The quality of the vehicle registration data is so poor that some experts think the road tolling scheme will collapse, as revenues are falling far below what is needed to pay for the cost of building and operating the tolling system.

Elsewhere, in education, in health and in general administration there are many existing schemes that provide perfectly adequate structuring for data, and not choosing them risks serious problems of compatibility in the future; where there are duplicate or overlapping schemes then we have a problem, of course. The inefficiencies and risks of having to translate one set of external codes (for example, a supplier part number code) into an internal code (for example an internal stock number) can undermine business performance, and lead to a constant struggle to keep things properly lined up and to sort out the avoidable problems that arise. The international effort to organise well-defined structures and codes for illnesses, drugs, educational material, fast-moving consumer goods and even crimes (to mention just a few) are worthwhile and helpful, and should be supported and encouraged.
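The cross-referencing burden described above - translating an external supplier part number into an internal stock number - can be made concrete with a short sketch. All of the codes here are invented for illustration; the point is that any external code without a mapping becomes an exception that someone must resolve by hand.

```python
# Hypothetical cross-reference table from supplier part numbers to internal
# stock numbers. Keeping this table "properly lined up" is the constant
# struggle described in the text.
SUPPLIER_TO_STOCK = {
    "ACME-4471": "STK-00812",
    "ACME-4472": "STK-00813",
}

def translate_order_lines(lines):
    """Split incoming (supplier_code, qty) lines into translated and unmapped lists."""
    translated, unmapped = [], []
    for supplier_code, qty in lines:
        stock_code = SUPPLIER_TO_STOCK.get(supplier_code)
        if stock_code is None:
            unmapped.append((supplier_code, qty))  # manual intervention needed
        else:
            translated.append((stock_code, qty))
    return translated, unmapped

ok, pending = translate_order_lines([("ACME-4471", 10), ("ACME-9999", 5)])
```

A shared external coding scheme removes the translation step, and with it the whole class of "unmapped" exceptions.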

Stage 2: Tagging the noise on the web

Outside the boundaries of any single organisation there is a constant flow of potentially relevant information that needs to be monitored in case something there proves valuable. The problem is that there is so much of it, mostly on the social pages of the World Wide Web of course.

There are different ways to harvest and organise such data, possibly by using existing schemes such as post codes and GPS data (already mentioned above) but more typically by adding "tags" or by analysing it and fitting it to a prepared scheme of ideas that we might call an ontological model. This effort to organise the vast content of the web is happening now, for example in the project that is known as the "semantic web"; experts are meeting, papers are being written, conferences are happening - we are (at the time of writing) at a tipping point that will affect life for many years to come in ways that we cannot yet anticipate.

Hence, there are two ways to make sense of the web. First, it is possible to construct formal ontologies, and second it is possible to have open-ended tagging schemes - these two approaches lead to quite different results. The ontological approach attempts to be highly structured and rigorous; the tagging approach tends to be completely open, open to manipulation, and potentially highly redundant.

Ontologies are a formalised structuring of the "things" that comprise our "real" world. In a research project an ontology will identify the entities (and the relationships between them) that are relevant to the research and about which the project wishes to gather and analyse data (we will return to entity modelling shortly). An ontological model can be developed using the same kinds of rules that are used in the development of entity-relationship models because (as your author sees them) they achieve a very similar thing.
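The parallel with entity-relationship modelling can be sketched directly: an ontology, at its simplest, is a set of named entity types plus named relationships between pairs of them. The entity and relationship names below are invented for illustration, loosely echoing the health examples mentioned earlier.

```python
from dataclasses import dataclass

# A minimal ontology expressed with entity-relationship-style rules:
# named entity types, and named relationships between pairs of them.
@dataclass(frozen=True)
class Entity:
    name: str

@dataclass(frozen=True)
class Relationship:
    name: str
    source: Entity
    target: Entity

patient = Entity("Patient")
illness = Entity("Illness")
drug = Entity("Drug")

# The ontology itself: the relationships that the project cares about.
ontology = [
    Relationship("diagnosed_with", patient, illness),
    Relationship("treated_by", illness, drug),
]

def relations_from(entity, relationships):
    """Names of all relationships whose source is the given entity."""
    return [r.name for r in relationships if r.source == entity]
```

Contrast this with a tag: a tag is just a label on an item, with no declared type and no declared relationship to anything else, which is exactly why tagging is so much easier to adopt and so much harder to rely upon.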

Stage 3: Sifting and analysing

A research project can usefully develop its own ontological view of what it needs to work with, but in the wider world of business the generalised ontologies that are under development extend to hundreds of entities and hundreds of relations between them. They are quite alien to the majority of business people and web users, who - understandably - prefer to take the line of least difficulty and use the tagging option.

It is possible to devise and apply tags to web content and to other stored data such as photographs and emails, within personal systems or within shared systems; those tags can then be used to quickly select the content relevant to a business issue in a web site, a photo archive, or a discussion board with tens of thousands of contributions. Exactly how this tagging might be done is wide open, and some management will be helpful in avoiding complete chaos - "managed chaos", perhaps. Individuals work with search engines every day and develop their own lists of tags (or "key words") that work well for them. Information consolidators such as the global news providers (many of which simply scour the content of the web for stories written by others) do this automatically using tags and keywords, with just some human intervention to make sure that the best stories are featured more prominently (but that will always be a human judgement, surely?). The sort of face recognition that is now happening (with image management software such as Picasa) is entirely automatic; it uses highly complex "tagging" schemes, embedded in image processing, that we do not understand and that at this stage seem to be entirely proprietary.
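The selection step described above - pulling out the content relevant to a business issue from a large tagged collection - amounts to a simple set operation. The item names and tags below are invented for illustration.

```python
# Hypothetical tagged collection: each item carries a free, open-ended set
# of tags, with no schema constraining which tags may appear.
ITEMS = {
    "photo_0142.jpg": {"conference", "2013", "keynote"},
    "photo_0143.jpg": {"conference", "2013", "dinner"},
    "minutes_march.doc": {"board", "2013"},
}

def select(required_tags):
    """Names of the items tagged with every tag in required_tags."""
    required = set(required_tags)
    # An item qualifies if the required tags are a subset of its tags.
    return sorted(name for name, tags in ITEMS.items() if required <= tags)
```

The openness that makes this so easy is also its weakness: nothing stops one contributor tagging "conference" and another "conf", so retrieval is only as good as the discipline of the taggers - the "managed chaos" referred to above.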

One might surmise that at this point in time the average business, government department or community will be happy to wait and see what happens with ontologies and the semantic web, going no further than playing with tagging schemes. More progressive businesses and organisations with a special interest in recovering information from raw data will be working closely with these ideas, which are yet to show their true potential in the world of business, in fighting crime, or in manipulating and nurturing groups of people on the social web.

Stage 4: Structuring and archiving

When it finally comes to organising and structuring information within an organisation, something comparable to the value chain would be useful. A high-level view of the organisation, on just one sheet of paper, with a generic arrangement of ideas that lets us compare our business to the way that other businesses work? Sounds good.

Given the importance of information in business today we might expect that some form of universal or generic information model would be evident in the literature. In fact, few existing authors provide an explicit treatment of information modelling and analysis. Other management literature is equally sparse in this matter, and the treatment of information modelling in the specialist information systems literature is unhelpful to general business managers who have no interest in the deeper theoretical issues. Hence, the principal question addressed here is: how can relevant and effective information modelling be rendered meaningful to the non-specialist, especially to managers? This is important to do, because the digital information that is available to managers for decision-making will depend on the design of the databases that contain it, and yet the way that databases are designed is not only difficult for managers to understand, it is also difficult for specialists to understand.

The book presents such a model, intended to assist non-specialists, and there is an extended discussion starting on page 136. The discussion goes on to present a generic information model for an organisation that can be used to assess the extent and detail of the information that is needed, and to identify the critical transactional information that will tell you whether you are making money or not.