Key Term Search with Wildcards



You can add wildcards to a key term to refine your searches of XML data. We'll describe the character set used by DocVacBasic here; DocVacEnterprise has two additional character sets to increase the flexibility of searches.

[*] is a wildcard that stands for any run of characters, e.g. keyTerm = [*]COMPANY finds any XML cell that ends with COMPANY, and keyTerm = [*]COMPANY[*] finds any XML cell that contains COMPANY. Because this searching is computationally intensive, DocVacBasic users are limited to 6 wildcards in total.

[#] represents a numeric character, so keyTerm = HELL[#]O would find a match on XML cells containing HELL2O and HELL9O but not HELLLO.

[l] (a lowercase L) represents a letter, so keyTerm = HELL[l]O would find a match on HELLLO but not on HELL2O or HELL9O, which have a number in that character position.

[?] finds a match on any single character, so all three cells from the previous examples (HELL2O, HELL9O and HELLLO) would be matched with keyTerm = HELL[?]O.

[1L] finds a match if the single character at that position is in the list, in this case either 1 or L. So keyTerm = HELL[1L]O would find a match on HELL1O and HELLLO but not HELL2O.

! at the start of a list negates the match, so keyTerm = HELL[!1L]O would find a match on HELL2O but not HELL1O or HELLLO.
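
The rules above can be approximated with ordinary regular expressions. The following Python sketch is only an illustration of the syntax, not DocVac's own implementation: the function names key_term_to_regex and matches are hypothetical, and it assumes that a key term must match the whole XML cell (which is why [*] is needed on both sides to express "contains").

```python
import re

# Assumed mapping from the bracket tokens described above to regex fragments.
_FIXED_TOKENS = {
    "*": ".*",        # [*] any run of characters, possibly empty
    "#": r"\d",       # [#] one numeric character
    "l": "[A-Za-z]",  # [l] one letter
    "?": ".",         # [?] any single character
}

# A token is anything between a pair of square brackets.
_TOKEN = re.compile(r"\[([^\]]+)\]")


def key_term_to_regex(key_term: str) -> re.Pattern:
    """Translate a key term with bracket wildcards into a compiled regex."""
    parts = []
    pos = 0
    for m in _TOKEN.finditer(key_term):
        # Literal text between tokens is escaped so it matches verbatim.
        parts.append(re.escape(key_term[pos:m.start()]))
        body = m.group(1)
        if body in _FIXED_TOKENS:
            parts.append(_FIXED_TOKENS[body])
        elif body.startswith("!") and len(body) > 1:
            # [!1L] -> any single character NOT in the list
            parts.append("[^" + re.escape(body[1:]) + "]")
        else:
            # [1L] -> any single character in the list
            parts.append("[" + re.escape(body) + "]")
        pos = m.end()
    parts.append(re.escape(key_term[pos:]))
    return re.compile("".join(parts), re.DOTALL)


def matches(key_term: str, cell: str) -> bool:
    """Assumption: the key term has to match the entire cell."""
    return key_term_to_regex(key_term).fullmatch(cell) is not None


for term in ("HELL[#]O", "HELL[l]O", "HELL[?]O", "HELL[1L]O", "HELL[!1L]O"):
    hits = [c for c in ("HELL1O", "HELL2O", "HELLLO") if matches(term, c)]
    print(term, "->", hits)
```

Running the loop at the end lists, for each key term, which of the three sample cells it matches, which makes it easy to compare the behaviour of the different wildcards side by side.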

Wildcards are particularly useful with documents extracted using OCR, where individual characters tend to get confused. The capital letter I and the number 1 are one such confusion pair, so a search like [I1]NSURANCE is a little more forgiving than simply searching on INSURANCE. DocVacEnterprise allows an override to be set in the Key Term Group so that if 1NSURANCE is found, it is returned as INSURANCE.
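
To illustrate the idea of a confusion pair with an override, here is a small Python sketch. It does not show how DocVacEnterprise is configured; the names PATTERN, CANONICAL and normalise_cell are hypothetical, and the regex character class [I1] is used as a stand-in for the DocVac list wildcard of the same form.

```python
import re

# Tolerant pattern: accepts either a capital I or the digit 1 in first position.
PATTERN = re.compile(r"[I1]NSURANCE")

# Assumed override value: what a matching cell should be returned as.
CANONICAL = "INSURANCE"


def normalise_cell(cell: str) -> str:
    """Return the canonical term if the whole cell matches, else the cell unchanged."""
    return CANONICAL if PATTERN.fullmatch(cell) else cell


print(normalise_cell("1NSURANCE"))
print(normalise_cell("ASSURANCE"))
```

A cell read by OCR as 1NSURANCE comes back as INSURANCE, while a cell that does not match the tolerant pattern is left as it was.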

When searching without wildcards, i.e. simply trying to find a specific word, it is much more efficient to use raw text searches as described here.


Last modified: 6/18/2018