Time to streamline your data capture architecture

Intelligent document processing (IDP) in Switzerland for customer onboarding, invoice recognition or age verification


Last update: 15 February 2025


Topics


Artificial intelligence automation

Automatic document redaction

Automatic invoice recognition

Automatic age verification

Business Analytics (BA)

Business Intelligence (BI)

Business Process Management (BPM)

Computer Vision (CV)

Dark data discovery

Data capture & machine indexation

Machine Learning (ML)

Optical Character Recognition (OCR)

"Intelligent document processing converts raw image data into information, and information into business insights"

Definition of intelligent document processing (IDP)


Intelligent document processing (IDP) is essentially the use of artificial intelligence capabilities such as optical character recognition (OCR), machine learning (ML) and computer vision (CV) to extract and capture business-critical information from everyday routine documents (passports, IDs, invoices, forms) in order to enrich and optimize subsequent enterprise workflows.


Generation of business intelligence during upload or scanning


360core converts raw image data into information, and information into business insights. During external scanning services or the equivalent OCR capture process, our system architecture performs numerous high-impact data enrichment operations:


  • Automatic text recognition so that document text is automatically indexed, making it easy to subsequently search and retrieve individual words or entire phrases via a full-text search
  • Automatic document type recognition (invoice, contract, email, passport) through computer-aided image recognition, pretrained on Swiss identification documents and business correspondence
  • Automatic indexing of document content as per customer folder structure (so-called keywording) to speed up subsequent full-text search (customer, patient, pupil, student, client, policyholder, tax subject, property, asset, unit, department, cost center, case file, project, service case etc.)
  • Automatic indexing of business-relevant real-world document content such as customer and supplier names, brand names, dates, country and city names, events, currencies, interest rates or company-specific identifiers defined by the customer (supported languages: English, French, German, Italian, Portuguese, Spanish)
  • Automatic recognition of the dominant document language for onward transmission to employees with the required language skills. As an archive grows, customers can filter the entire repository by language (100+) to simplify future use cases.
  • Automatic recognition of personal data under the Swiss Federal Act on Data Protection (FADP): proper names, addresses, phone numbers, emails, ID numbers, dates of birth, ages, insurance numbers, TINs, license plates, credit card numbers, IBANs, URLs, IP addresses, passwords
  • Masking of extracted personal data in a single keystroke to anonymize sensitive and classified information when enforcing compliance requirements (CID, PII) in education (student records), the financial sector (account numbers), public administration (social security numbers), healthcare (biometric data), law enforcement (witness protection) and legal & judicial proceedings
  • Conversion to the PDF/A standard (the A stands for archiving) for long-term retention
  • Audit trail: virtual printing of a document number and an OCR timestamp
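
As an illustration of the masking step listed above, the following minimal Python sketch replaces two kinds of personal data with placeholders. The function name `mask_personal_data`, the placeholder labels and both regular expressions are assumptions made for this example. They do not describe 360core's implementation, and reliable detection of personal data needs considerably more than pattern matching.

```python
import re

# Illustrative patterns only (assumptions): e-mail addresses and
# Swiss IBANs (CH + 2 check digits + 17 characters, optionally grouped).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
SWISS_IBAN = re.compile(r"\bCH\d{2}(?: ?[0-9A-Z]{4}){4} ?[0-9A-Z]\b")


def mask_personal_data(text: str) -> str:
    """Replace every match with a placeholder naming the data type."""
    text = EMAIL.sub("[EMAIL]", text)
    return SWISS_IBAN.sub("[IBAN]", text)


if __name__ == "__main__":
    sample = "Contact anna@example.ch, IBAN CH93 0076 2011 6238 5295 7."
    print(mask_personal_data(sample))
```

Labelled placeholders (rather than black bars or blanks) keep the redacted text readable and still show a reviewer which type of data was removed.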


What are the benefits of accurate OCR?


Optical Character Recognition (OCR) is an AI technology that converts an image of text (handwritten or printed, whether a scanned PDF document or a JPG or PNG image file) into a machine-readable format so that it can be searched by word processing software.


OCR addresses a core need in the digital transformation of Swiss enterprises because most business processes such as financial accounting, client onboarding, and public administration workflows still involve large volumes of printed paperwork that must be archived and classified for evidentiary purposes.


Situations in which such documents need to be unarchived at relatively short notice include:


  • Swiss tax audits: VAT audits, payroll audits, unemployment audits, withholding tax (WHT) audits
  • Government enquiries: Regulatory investigations, enforcements, on-site visits
  • Civil proceedings
  • Criminal proceedings
  • Enquiries from data subjects under privacy laws
  • Briefing of colleagues during case file handovers


Accurate OCR technology is of paramount importance for good data and subsequent look-up quality. 360core uses leading OCR technology for all use cases (screenshots, handwriting, scanned business artefacts).


Specifically for business and accounting documents, 360core utilises the most accurate OCR technology currently on the market, as independent OCR accuracy tests have shown.
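

OCR accuracy is commonly reported as a character error rate (CER): the edit distance between the OCR output and a manually verified reference text, divided by the length of the reference. The sketch below shows one way to compute it. The function names are chosen for this example, and the code does not describe the methodology of the accuracy tests mentioned above.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, ocr_output: str) -> float:
    """Edit distance normalised by the length of the reference text."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, ocr_output) / len(reference)


if __name__ == "__main__":
    # One wrong character ("1" instead of "i") in a 12-character reference.
    print(character_error_rate("Invoice 2025", "Invo1ce 2025"))
```

A lower CER means fewer corrections are needed before the recognised text can be trusted for search and retrieval.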


What is 360 Autoindexing?


A French proverb says: "In a library, a misclassified book is a lost book". The same applies to digital data storage. Poorly indexed PDFs are as good as lost and make the initial effort of scanning and storing practically useless.


Our proprietary automatic indexing solution ("360 Autoindexing") minimizes data entry errors during document classification. It is especially effective for standard forms ("field-based indexing") that have a consistent data structure and page layout, such as emails, invoices, IDs and passports where specific units of information ("data points") are consistently located at the same place.
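

The idea behind field-based indexing can be sketched in a few lines of Python: if a form always shows a data point at the same place, the words that OCR finds inside that region can be assigned to the corresponding field. The `Word` and `Zone` types, the coordinates and the field names below are hypothetical and serve only to illustrate the principle. They are not the 360 Autoindexing implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One recognised word with the centre of its bounding box."""
    text: str
    x: float
    y: float


@dataclass(frozen=True)
class Zone:
    """A named rectangular region of the page (hypothetical layout)."""
    name: str
    left: float
    top: float
    right: float
    bottom: float


def extract_fields(words: list[Word], zones: list[Zone]) -> dict[str, str]:
    """Collect, per zone, the words whose centre lies inside it."""
    fields = {}
    for zone in zones:
        inside = [w for w in words
                  if zone.left <= w.x <= zone.right
                  and zone.top <= w.y <= zone.bottom]
        # Reading order: top to bottom, then left to right.
        inside.sort(key=lambda w: (w.y, w.x))
        fields[zone.name] = " ".join(w.text for w in inside)
    return fields


if __name__ == "__main__":
    words = [Word("Muster", 60, 25), Word("2025-001", 450, 25),
             Word("108.10", 480, 720), Word("CHF", 420, 720)]
    zones = [Zone("invoice_number", 400, 10, 600, 40),
             Zone("total", 400, 700, 600, 740)]
    print(extract_fields(words, zones))
```

This approach only works when the layout really is consistent, which is why it suits standard forms better than free-form correspondence.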


Indexing key document identifiers then enables near instantaneous document retrieval through text-based searches. When setting up new cloud archive instances for our enterprise customers, we first investigate which document fields or identifiers the company processes - and which are hence useful for tagging and indexing. In fact, a well-indexed system enables quick and reliable document retrieval which can be crucial during time-sensitive compliance audits or legal disputes.
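

Why indexing makes retrieval fast can be shown with an inverted index, a standard data structure for full-text search that maps each term to the documents containing it. The following Python sketch is a simplified illustration with made-up document identifiers. It is not a description of how our archive is implemented.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_index(documents: dict[str, str]) -> dict[str, set[str]]:
    """Map every term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return dict(index)


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the ids of documents that contain all query terms."""
    terms = tokenize(query)
    if not terms:
        return set()
    return set.intersection(*(index.get(term, set()) for term in terms))


if __name__ == "__main__":
    docs = {
        "inv-001": "Invoice Muster AG Zurich",
        "pass-001": "Passport Anna Muster",
    }
    index = build_index(docs)
    print(search(index, "invoice muster"))
```

Because the index is built once at ingestion, a later search only looks up the query terms instead of re-reading every stored document.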


In highly regulated industries such as financial services and healthcare, indexing quality can become a crucial metric for risk management and information compliance.


1. Automatic detection of invoice data


The current state of the art makes it possible to import data points from receipts and invoices directly into the accounting software or payment gateway for human verification. During this process, our systems abstract away from a given invoice's form, layout, language or country-specific characteristics.


In practical terms, this means that invoice amounts, reference numbers, currencies and addresses of beneficiaries no longer have to be laboriously typed in by hand. With up to 10 data points required to set up and process a payment transfer, it also means fewer failed payments due to data entry issues and human mistakes ("fat-finger errors").


Our invoice data capture pipeline extracts the following attributes with very high confidence from scanned or uploaded accounts payable or accounts receivable invoices and passes them to you as metadata:


  • Supplier and customer details: name, street, postcode, town, state, phone, website, VAT number
  • Invoice details: date, number, date of order, due date, delivery date, purchase order (PO) number
  • Goods delivered or services provided: product code, product or service description, quantity, unit price, total price
  • Breakdown of payment: payment terms, amount due, amount paid, subtotal, total, amount of VAT, service charge, gratuity, prior balance, discount, shipping and handling


2. Automatic detection of passport data


The automated capture of personal data for identity verification (passports, IDs, driving licenses) is of vital importance in customer enrolment workflows, for example in order to establish the beneficial owner during the digital onboarding of customers to open a personal bank account in Switzerland.


The emphasis here is on the quality of the master data identified during ingestion in view of its subsequent dissemination in real time to peripheral systems to inform various downstream processes, from fraud prevention to sanctions screening and risk scoring in general.


When scanning or uploading onto our archive layer, our architecture extracts the following data points from passports and ID documents, and transmits them to our customers as metadata:


  • First name
  • Middle name
  • Last name
  • Date of birth
  • Country of issuance
  • Date of issuance
  • Date of expiry
  • Number of passport or ID
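
A downstream system could use these data points for age verification along the following lines. This is a minimal sketch: the class `IdentityDocument`, its field names and the rule that an expired document fails the check are assumptions for illustration, not a specification of our metadata or of any legal requirement.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class IdentityDocument:
    """Hypothetical subset of the extracted passport or ID data."""
    first_name: str
    last_name: str
    date_of_birth: date
    date_of_expiry: date
    document_number: str


def age_on(date_of_birth: date, today: date) -> int:
    """Completed years of age on the given day."""
    years = today.year - date_of_birth.year
    # Subtract one year if this year's birthday has not yet occurred.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


def passes_age_check(doc: IdentityDocument, minimum_age: int,
                     today: date) -> bool:
    """True if the document is still valid and the holder is old enough."""
    if doc.date_of_expiry < today:
        return False
    return age_on(doc.date_of_birth, today) >= minimum_age


if __name__ == "__main__":
    doc = IdentityDocument("Anna", "Muster", date(2007, 2, 16),
                           date(2030, 1, 1), "X0000000")
    print(passes_age_check(doc, minimum_age=18, today=date(2025, 2, 15)))
```

Passing `today` explicitly, rather than reading the system clock inside the function, keeps the check reproducible in tests and audits.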

Use Cases

Automatic recognition of passport and IDs (onboarding, age verification)

Extraction of document content from application forms and CVs

Use case: customer onboarding and registration

Automated processing of contracts, invoices and financial statements

Use case: due diligence and accounting workflows

Mass indexing of archives, postal correspondence, emails and attachments

Use case: bulk archiving and e-discovery scenarios

