abstract |
A video caption detection scheme capable of detecting captions directly from coded video data that has been coded using a combination of predictive coding and motion compensation, without decoding the coded video data into frame images. In this scheme, each pixel/block in the video data is judged as to whether or not it is coded by using inter-frame correlation without motion compensation. A region of the video data in which the pixels/blocks so judged are concentrated in both time and space is then detected as a caption region. The detection can be realized by counting, at each pixel/block position of a frame over a prescribed counting period, the frequency of appearance of pixels/blocks judged as being coded by using inter-frame correlation without motion compensation, and then comparing the counted frequency with a prescribed threshold value. |
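
The counting-and-threshold step described in the abstract can be illustrated with a minimal sketch. It assumes the per-block coding modes have already been read from the coded stream and are supplied as small integer labels. The label names (`INTRA`, `INTER_NO_MC`, `INTER_MC`), the function name `detect_caption_blocks`, and the use of `>=` for the threshold comparison are hypothetical choices made for illustration and are not taken from the source.

```python
from typing import List, Sequence

# Hypothetical labels for how a block was coded (assumed, not from the source).
INTRA = 0        # coded without using inter-frame correlation
INTER_NO_MC = 1  # coded using inter-frame correlation, no motion compensation
INTER_MC = 2     # coded using inter-frame correlation with motion compensation


def detect_caption_blocks(
    frames: Sequence[Sequence[Sequence[int]]], threshold: int
) -> List[List[bool]]:
    """Flag block positions whose INTER_NO_MC count reaches the threshold.

    `frames` holds one 2-D grid of coding-mode labels per coded frame in the
    counting period; all grids are assumed to have the same dimensions.
    """
    if not frames:
        raise ValueError("the counting period must contain at least one frame")

    rows, cols = len(frames[0]), len(frames[0][0])
    counts = [[0] * cols for _ in range(rows)]

    # Count, per block position, how often the block was judged as coded by
    # inter-frame correlation without motion compensation.
    for modes in frames:
        for r in range(rows):
            for c in range(cols):
                if modes[r][c] == INTER_NO_MC:
                    counts[r][c] += 1

    # Compare each counted frequency with the prescribed threshold
    # (the `>=` comparison is an assumption).
    return [[counts[r][c] >= threshold for c in range(cols)] for r in range(rows)]
```

This sketch covers only the time-wise concentration at each position. Grouping the flagged positions into a spatially connected caption region would be a separate step and is not shown here.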