Sign language recognition

Sign language recognition (commonly abbreviated SLR) is a computational task that involves recognizing signs from sign languages.[1] Solving it is essential, especially in the digital world, for bridging the communication gap faced by people with hearing impairments.

Solving the problem usually relies not only on annotated color (RGB) video but also on other modalities, such as depth maps and wearable-sensor data.

Isolated sign language recognition

ISLR (also known as word-level SLR) is the task of recognizing an individual sign, or token called a gloss, from a given segment of signing video. This is commonly treated as a classification problem when recognizing from pre-segmented clips, but real-time applications additionally require steps such as video segmentation.
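The classification view of ISLR can be illustrated with a minimal sketch. Everything here is a hedged toy example, not a method from the literature: the gloss vocabulary, the "prototype" feature vectors, and the nearest-prototype rule are all assumptions; real systems learn a classifier over features extracted from annotated video.

```python
import numpy as np

# Hypothetical gloss vocabulary (assumed for illustration).
GLOSSES = ["HELLO", "THANKS", "YES"]

def pool_clip(frames):
    """Average-pool per-frame feature vectors into one clip descriptor."""
    return np.mean(frames, axis=0)

def classify_clip(frames, prototypes):
    """Isolated SLR as classification: pick the nearest gloss prototype."""
    clip = pool_clip(frames)
    dists = [np.linalg.norm(clip - p) for p in prototypes]
    return GLOSSES[int(np.argmin(dists))]
```

In practice the per-frame features would come from a learned video encoder rather than raw pixels, and the classifier would be trained end to end, but the input/output shape of the task is the same: one clip in, one gloss label out.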

Continuous sign language recognition

In CSLR (also known as sign language transcription), given a continuous signing video, the task is to predict all the signs (or glosses) that appear in it. This makes it more suitable for real-world transcription of sign languages. Depending on how it is solved, it can also be seen as an extension of the ISLR task.
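One common way the continuous setting differs from the isolated one is that per-frame predictions must be collapsed into a gloss sequence of unknown length. As a hedged sketch (the blank symbol and the greedy merge rule follow the general CTC decoding idea, which is one approach among several, not a claim about any specific SLR system):

```python
BLANK = "_"  # assumed "no sign" symbol emitted between glosses

def ctc_collapse(frame_labels, blank=BLANK):
    """Greedy CTC-style decoding: merge consecutive repeats, drop blanks."""
    out = []
    prev = None
    for lab in frame_labels:
        if lab != prev:       # a change in label starts a new token
            if lab != blank:  # blanks separate tokens but are not emitted
                out.append(lab)
        prev = lab
    return out
```

For example, frame-level output `["_", "HELLO", "HELLO", "_", "YOU"]` collapses to the gloss sequence `["HELLO", "YOU"]`, while the blank lets a genuinely repeated gloss survive the merge.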

Continuous sign language translation

Sign language translation refers to the problem of translating a sequence of signs (glosses) into a spoken language. It is generally modeled as an extension of the CSLR problem.
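The gloss-to-text step can be caricatured as a lookup, shown below purely to fix the input/output contract. The lexicon is invented for illustration, and a word-by-word rendering is a deliberate oversimplification: gloss order and grammar differ from the spoken language, which is why real systems use neural sequence-to-sequence models rather than a dictionary.

```python
# Toy gloss-to-word lexicon (assumed data, for illustration only).
LEXICON = {"HELLO": "hello", "HOW": "how", "YOU": "you"}

def translate_glosses(glosses, lexicon=LEXICON):
    """Naive word-by-word rendering of a gloss sequence."""
    return " ".join(lexicon.get(g, f"<{g}>") for g in glosses)
```

Glosses missing from the lexicon are passed through in angle brackets, mirroring how out-of-vocabulary tokens are often surfaced rather than silently dropped.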

References

  1. ^ Cooper, Helen; Holt, Brian; Bowden, Richard (2011). "Sign Language Recognition". Visual Analysis of Humans. Springer. pp. 539–562. doi:10.1007/978-0-85729-997-0_27. ISBN 978-0-85729-996-3. S2CID 1297591.