Ministry of Defence


MOD
Ministry of Defence (U.K.)

The DIRC is a department within the DCDC, which is based at the MOD training college at Shrivenham. Its function is to store military archives, historic documents and information.

This archive is made available only to members of the armed forces, upon request. Most of the 0.5m documents are books and papers about battles and military knowledge. The task of disseminating this information is in the hands of a small, dedicated team with considerable knowledge of the content of the documents and of the MOD. The original system depended upon two things: a good record of basic information (metadata) about the documents, and the team's general knowledge of the content of the library and of 'what is where'. The downside was that it relied on staff memory and understanding of these document records, and of course no one officer has read the entire library or is an expert in all matters, so a considerable amount of information was not known or had been forgotten. The positive aspects of the system were:

  • Over 500,000 pages of historic document records are held, covering a wide range of military subjects
  • All records had a good Access database classification system, albeit with inconsistent metadata stored on older records
  • Documents were, and are, being scanned and converted to electronic PDF format
  • A large number of images and videos were also stored in non-digital format
  • Documents and images could be supplied on individual CD-ROM discs to order

However, some of the limitations were:

  1. The PDFs were not being OCR'ed, so their content was not searchable
  2. All documents were stored on a Windows drive share, and only the DIRC knew where documents were held and had access to them through the Access database
  3. General search was non-existent
  4. Nobody outside the DIRC could get to the documents
  5. Images and videos were difficult to distribute

The aims of the project were therefore to:

  • Provide a system whereby ALL the information could be accessed without depending upon the experience of any member of staff
  • Provide a system that could be rolled out for use by a wider audience
  • Provide a system that, besides being secure, would need no special software on a user's computer and would be easy and intuitive for non-IT staff to use

Document management software solution

The solution was Document Master. This was installed on an existing Windows Server 2003 machine, and Microsoft SQL Server 2005 was installed to manage the terabyte of RAID database storage.

The first step was to scan, classify and load the data onto the system. It was decided to base the folder structure on the record ID from the classification system, which gave each document a unique ID; even non-scanned documents had a record and ID created and tagged. The PDF documents were then imported and deconstructed into individual pages; these pages were sent to the OCR engine, and the resulting text was automatically indexed by the powerful concept-based search engine. Finally, the hi-res page images were shrunk to a smaller, browser-friendly size and a thumbnail was created by the Document Master asset manager software. This allows users to see all the pages when they open a document, and any search hit points directly to the relevant page, displaying a thumbnail of that page with a link to the full document. To carry out these tasks, Web Labs deployed four powerful servers to import, convert, classify and store these documents and assets.
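
The import steps described above (folder per record ID, page-by-page OCR, then indexing so that a search lands on a specific page) can be sketched in a few lines of Python. This is only an illustration and not Document Master's implementation: the `Archive` and `Page` classes, the `archive/<record ID>` folder layout and the `ocr` callable are all hypothetical, plain keyword matching stands in for the concept-based search engine, and the image resizing and thumbnail step is left out.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Page:
    record_id: str  # ID taken from the classification database
    number: int     # 1-based page number within the document
    text: str       # text returned by the OCR step


@dataclass
class Archive:
    # word -> set of (record_id, page_number) pairs that contain the word
    index: dict = field(default_factory=lambda: defaultdict(set))
    pages: dict = field(default_factory=dict)

    def folder_for(self, record_id):
        # One folder per record ID (this layout is an assumption).
        return f"archive/{record_id}"

    def import_document(self, record_id, page_images, ocr):
        # Treat each scanned image as one page, OCR it, then index its words.
        for number, image in enumerate(page_images, start=1):
            page = Page(record_id, number, ocr(image))
            self.pages[(record_id, number)] = page
            for word in re.findall(r"[a-z0-9]+", page.text.lower()):
                self.index[word].add((record_id, number))

    def search(self, query):
        # Return the pages that contain every word of the query.
        words = re.findall(r"[a-z0-9]+", query.lower())
        if not words:
            return []
        hits = set.intersection(*(self.index.get(w, set()) for w in words))
        return sorted(hits)


# Strings stand in for scanned page images; the fake OCR returns them as-is.
archive = Archive()
archive.import_document(
    "R000123",
    ["Battle of Waterloo: order of battle", "Supply lines and logistics"],
    ocr=lambda image: image,
)
print(archive.search("waterloo battle"))  # [('R000123', 1)]
```

Because each index entry records the page number as well as the record ID, a hit can be shown as a single page with a link back to the full document, which is the behaviour described above.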

The system has been a great success, being:

  • Easy to administer - it needs only a web browser and has been completely accepted by existing staff
  • Simple for researchers to use - it provides searching by metadata as well as more powerful conceptual, natural-language searching that enables non-experts to search the actual content of documents, returning the most relevant extracts and related content to assist with further searching
  • Reliable - 99.98% system uptime and availability
  • Cost effective - delivered on budget
  • Timely - installed and fully working, including all of the data transfer and verification, staff training and small amendments, on schedule and within six months of the order date
  • The system has been running for over a year and is being upgraded to enable other MOD departments to reap the benefit. It is hoped that the system will move to the next planned phase and be made available to a wider audience, possibly outside of the MOD
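
To put the 99.98% uptime figure in context, the short calculation below converts an availability percentage into the downtime it allows. The measurement period is not stated above, so a 365-day year is assumed here, and the function name is purely illustrative.

```python
def downtime_minutes(uptime_percent, days=365):
    """Minutes of downtime implied by an uptime percentage over `days` days."""
    total_minutes = days * 24 * 60
    return (100 - uptime_percent) / 100 * total_minutes


# Assumes the figure was measured over one 365-day year.
print(round(downtime_minutes(99.98), 1))
```

On that assumption, 99.98% availability corresponds to roughly 105 minutes of downtime in a year.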
