Information processing is a business process that resembles a normal production process, with the familiar demands of managing both the quantity processed and the quality of the output. In many business processes there is continuous pressure to increase output. There is also a constant demand for quality, which acts as a brake on the main process. In information processing this tension is handled by using two different types of process: batch and online. The quality indicator is the mechanism that defines how much output is given up in order to increase the quality of the information.

As an example of how this is done in practice, consider the Yellow Pages. The books contain a variety of information about companies, and each month a book is published with a selection of companies in a certain region. This cycle continues until the last region of the country has been handled, after which the publishing process starts all over again with a series for the next year. Publishing these books requires quite some organizing; most important is that the information is correct. Yet companies (and company information) change a lot. To maintain this information, the company information in the database needs to be checked against information from third parties (for example, the Chamber of Commerce).

Efficiency is important when organizing these activities, and information systems can help by separating batch from online activities. Batch is a completely automated process; online is where human interaction is required.

For example, in the batch process the company information in the database can be compared with the third-party data. The batch produces a selection of companies where the match between the two data sources is less than 100 percent. For these companies, human interaction is required to check visually whether the third-party information differs significantly from the base data. "Microsoft Corp" (the base data) will show a difference from "Microsoft Inc." (from the Chamber of Commerce), but a human eye will notice that both refer to the same company.

As there are many other fields in which company information can differ between the two sources, the batch process can calculate for each company record a value that represents the level of matching, where a perfect match is represented by 100%.

The output of search engines works in the same way. Search results are sorted according to this "match indicator level": the lower the indicator, the lower the quality of the match.

This type of indicator can be used as a selection mechanism between batch and online activities, for instance with a rule that all matches with a level below 60% must be checked by an agent.

In this way the quality of the output is managed: the batch process increases the level of output, and the online part, with human intervention, checks and increases the quality.

The same technique could be (and perhaps is) used in managing article websites. When an article is submitted to the site, a series of checks is required, which could also be done by a batch process. These checks also contain a matching mechanism, in which (parts of) the article are checked against existing content on the Web. The result could be a match level indicating the probability that the article is authentic.

For yet another example, think of the IRS: all taxpayers are assigned a credibility indicator that is calculated in a batch. The indicator is derived by matching against various other sources (banks and other financial institutions) and from the way in which the tax form is filled in.

What remains to be done in these environments is to define the quality level: how much do you dedicate to batch processing and how much to online processing? This allocation question defines much of the quality of the overall output. It comes down to the question, "At what level of the indicator are you going to check?" That is all up to you.
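
The split between batch and online work described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say how the match level is computed, so the sketch substitutes the character-based similarity from Python's standard `difflib` module, and the names `match_level`, `route`, and `REVIEW_THRESHOLD` are hypothetical. The 60% threshold is the example rule from the text.

```python
from difflib import SequenceMatcher

# Example rule from the text: matches below 60% go to a human agent.
REVIEW_THRESHOLD = 60.0


def match_level(base: str, third_party: str) -> float:
    """Return a match level between 0 and 100 for two company names.

    Assumption: plain character similarity stands in for whatever
    matching the real batch process would use across many fields.
    """
    a = base.strip().lower()
    b = third_party.strip().lower()
    return 100.0 * SequenceMatcher(None, a, b).ratio()


def route(pairs, threshold=REVIEW_THRESHOLD):
    """Batch step: score every (base, third_party) pair and split them.

    Returns two lists of (base, third_party, level) tuples:
    those accepted automatically and those queued for online review.
    """
    accepted, review = [], []
    for base, third_party in pairs:
        level = match_level(base, third_party)
        if level < threshold:
            review.append((base, third_party, level))
        else:
            accepted.append((base, third_party, level))
    return accepted, review


if __name__ == "__main__":
    pairs = [
        ("Microsoft Corp", "Microsoft Corp"),
        ("Microsoft Corp", "Microsoft Inc."),
        ("abc", "xyz"),
    ]
    accepted, review = route(pairs)
    for base, other, level in accepted:
        print(f"batch  {level:5.1f}%  {base!r} vs {other!r}")
    for base, other, level in review:
        print(f"online {level:5.1f}%  {base!r} vs {other!r}")
```

Raising `threshold` sends more records to the agents (higher quality, lower output); lowering it lets the batch accept more on its own. That single parameter is the allocation question the text closes with.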