A document (PDoc) has one or more pages, which are stored as PDocPage objects and used by DocVacEnterprise.  A simplified version of each page is represented by a PDocPageNc object, which stores the underlying text content of the page after extraction.  Cleanup logic is applied to the text to prepare it for processing.  DocVacBasic does not use this text, but DocVacEnterprise uses it in several ways: classifying which pages in the document are more important, detecting the presence of keywords in the document (much more efficient than reviewing individual PDocRowXml cells), and identifying a document as a specific document type or as having been produced by a specific company.  The text is exposed by the GetPDocPageNc web service so that the user can perform additional analysis on the raw text of one or more pages.


public class PDocPageNc
{
        public int PDocPageId { get; set; }

        public int PDocId { get; set; }

        // Extracted text content of the page, after cleanup
        public string DocText { get; set; }

        public int PDocExtractionModeId { get; set; }
}
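
As a rough illustration of the kind of additional analysis a user might run on the raw page text, the sketch below finds the pages whose DocText contains a given keyword.  It assumes the PDocPageNc objects have already been retrieved (for example via GetPDocPageNc).  PageTextSearch and PagesContaining are hypothetical names introduced here, not part of DocVac, and the case-insensitive substring match is an assumption rather than the matching logic DocVacEnterprise uses internally.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Mirrors the PDocPageNc shape shown above so the sketch compiles on its own.
public class PDocPageNc
{
    public int PDocPageId { get; set; }
    public int PDocId { get; set; }
    public string DocText { get; set; }
    public int PDocExtractionModeId { get; set; }
}

// Hypothetical helper class; not part of the DocVac API.
public static class PageTextSearch
{
    // Returns the PDocPageId of every page whose DocText contains the keyword,
    // ignoring case. Pages with null text are skipped.
    public static List<int> PagesContaining(IEnumerable<PDocPageNc> pages, string keyword)
    {
        return pages
            .Where(p => p.DocText != null &&
                        p.DocText.IndexOf(keyword, StringComparison.OrdinalIgnoreCase) >= 0)
            .Select(p => p.PDocPageId)
            .ToList();
    }
}
```

Calling PageTextSearch.PagesContaining(pages, "invoice") returns the ids of the matching pages in their original order.  Scanning one text string per page in this way is the reason keyword checks on DocText are cheaper than walking individual PDocRowXml cells.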

