INNOVATION

Human Language Technology

Probity’s Content Analytics Division (CAD) delivers operational solutions for customers who must quickly find mission-critical information in large volumes of foreign-language, mixed-media sources (i.e., Media Triage).

To satisfy the operational need for effective Media Triage, CAD engineers specialize in building innovative systems that employ best-of-breed Human Language Technology (HLT) from U.S. industry and academia.

It is well known that off-the-shelf HLT performs poorly in real-world operations that differ from the training assumptions and the specific application conditions for which it was developed (for example, digital assistants).

CAD provides the experience and skills needed to optimize world-class HLT components for your specific media types, languages, channel conditions, and Concepts of Operation.

Without the assistance of HLT-powered systems, language-skilled analysts are constantly overwhelmed by the sheer volume, variety, and velocity of the incoming information.

HLT is a vast domain that encompasses all means of communicating through language (Figure 1).

Figure 1. The domain of Human Language Technologies

CAD specializes in the part of the HLT domain to the left of the Digital Text “Great Divide” shown in Figure 1, in which non-textual signals are converted into text that describes the media content in a searchable form. CAD also has deep and broad experience with the technologies on the right side of the Great Divide, in which text is converted to meaning, especially when the text has been automatically generated by technologies on the left side and therefore cannot be assumed to be well-formed, grammatical, or relatively error-free.

HLT has made great progress over the last 15 years, powered by advances in Machine Learning (ML) techniques, commercial business opportunities, and the availability of massive training data sets. Our customers know that these ML advances do not translate to operational capability out of the box. That’s why they call on Probity’s Content Analytics Division.

Figure 2. The Human Language Technology stack.

In our end-to-end solutions, CAD optimizes these technologies to your operational conditions, your users’ needs, and your mission objectives.


End-to-End Solutions

Probity’s Content Analytics Division (CAD) is creating an ecosystem of solutions that support automatic triage and faceted search of temporal media (audio, video, SMS, blogs, etc.) based on their language content.

Transcriber Studio

A triage application for language-skilled analysts who summarize the content of audio sources

  • Navigate audio files visually

  • Sort audio files by amount of speech

  • Select audio files by language

  • Skip over audio regions with no speech

  • Detect files that contain speakers of interest

  • Create summary reports with timestamps that are synchronized to the audio