I want to wish you all a happy and prosperous 2013! May it be a success professionally, personally and academically. And in 2013 I hope this blog has better content and a bigger reach, and of course is more regular!
Let's get straight to where we ended the previous post, where I mentioned the two kinds of redundancy in a video sequence which aid in its compression, namely Temporal and Spatial. The analogy for Temporal Redundancy was the redundancy among the items which go into our everyday carry bags to schools and offices. I am now moving on to Spatial Redundancy, and this post will be a discussion of the concept and, of course, an analogy to help us decipher it better.
The Concept
Spatial Redundancy is the redundancy of information which exists within the same frame, i.e. it is an intra-frame redundancy. The numerical similarity of pixel values between frames, which we discussed in the last post, also exists within the space of a given frame. So, spatially, a frame contains pixels whose values are similar or nearly similar to those of their adjacent neighbors.
Conceptually, if this spatial correlation (the redundancy among pixels) is exploited, then PREDICTIONS can be made about a pixel from its adjacent neighbors with reasonable accuracy. Because of this correlation, the predictions are good, so we'll have less information to encode and transmit, which leads to video frame compression. Unless it is a complex/complicated section of the picture frame, this redundancy can be used efficiently in compressing a video sequence.
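To make the prediction idea concrete, here is a minimal sketch in Python. It assumes the simplest possible predictor (each pixel is predicted to equal its left neighbor), and the pixel values and function names are made up for illustration. Real codecs use more elaborate block-based prediction, so treat this only as a toy version of the concept.

```python
def predict_from_left(row):
    """Keep the first pixel as-is, then store each pixel minus its left neighbor."""
    if not row:
        return []
    residuals = [row[0]]
    for i in range(1, len(row)):
        residuals.append(row[i] - row[i - 1])
    return residuals


def reconstruct(residuals):
    """Undo predict_from_left by adding each residual to the previous pixel."""
    if not residuals:
        return []
    row = [residuals[0]]
    for r in residuals[1:]:
        row.append(row[-1] + r)
    return row


# Hypothetical row of pixel values from a smooth region of a frame.
row = [100, 101, 101, 102, 104, 104, 103, 103]
residuals = predict_from_left(row)
print(residuals)  # [100, 1, 0, 1, 2, 0, -1, 0]

# The decoder can recover the original row exactly from the residuals.
assert reconstruct(residuals) == row
```

After the first pixel, the numbers we need to store are all close to zero, and small numbers like these generally take fewer bits to encode than the original pixel values.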
I'll consider the same two examples as in the last post, but now talk about their spatial redundancy:
Example
- In video telephony, the adjacent pixels in the face/eye region have similar values, with minimal (numerical) difference among them. So there is a lot of redundant spatial information here.
- In high-motion sequences like a soccer match, a given frame still has plenty of areas which are uniform and smooth, like the seated crowd. Predicting pixels from their neighbors is easier and more efficient in these regions, since the information is almost the same across them.
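The difference between smooth and complex regions can be sketched in a few lines of Python. The two rows below are hypothetical pixel values I made up for illustration, and the "average absolute difference from the left neighbor" is just one simple, assumed way of measuring how well a neighbor predicts a pixel.

```python
def mean_abs_residual(row):
    """Average absolute difference between each pixel and its left neighbor."""
    diffs = [abs(row[i] - row[i - 1]) for i in range(1, len(row))]
    return sum(diffs) / len(diffs)


# Hypothetical values: a smooth region versus a busy, detailed region.
smooth_row = [120, 121, 121, 122, 122, 123, 123, 124]
busy_row = [20, 200, 35, 180, 90, 10, 240, 60]

smooth_error = mean_abs_residual(smooth_row)
busy_error = mean_abs_residual(busy_row)

# The smooth region is far easier to predict from its neighbors.
print(smooth_error < busy_error)  # True
```

The smaller the average error, the less there is left to encode after prediction, which is why smooth regions compress so well.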
The Analogy
We can usually infer/predict a lot of information about a given population based on its geographic location. This inference is possible because of an inherent (and usually correct) assumption that people of a geographic location have similar or nearly similar body features, speak the same languages, have very similar food habits, follow the same shopping trends, etc. Let us call this Geographic Redundancy.
If the location being considered does not have a diverse or complex population (unlike NYC, Shanghai or Bangalore), I only need to consider a small part of the population and can rely on prediction for the rest! This implies that for areas like Wyoming, Nebraska, Eastern India etc., I can afford to make predictions based on the neighboring population, thus exploiting geographic redundancy. There might be differences/anomalies, but they are minimal when you look at the population as a whole.
Example
- A classic example is the famous Starbucks and the way it expands its business. It relies on the sameness/redundancy of opinions of people in a given geographic location and makes predictions for the remaining population, which helps it make a quick and inexpensive decision, thus compressing the time and resources it invests. (The same coffee chain studied the Indian market and consumer habits for more than a year, given its complexity!)
- If the Indian government decides to open a cricket stadium (cricket is less of a sport and more of a religion in India), it can do so by surveying a limited population. So what do you think resulted in this reduction/simplification? Yup, you have guessed it right. They can talk to a few people and easily predict the opinion of the neighboring population!
"The values of the pixels can be predicted from their neighbors, assuming spatial redundancy in a video frame, very similar to the way we all make predictions about people living in a certain area, assuming geographic redundancy!"
This is an attempt to attach the entire process/knowledge of Video Compression to something from the real world that all of us do and know of. I am hoping these couple of posts, followed by studying Wikipedia, white/technical papers and all the other material available online, might help you appreciate and digest the knowledge. I apologize if there are any mistakes or factual errors. I would be happy to make the necessary changes.
Programming is next! Let's revisit the very fundamental building block of any programming language: DATATYPES!
I have started a dedicated Facebook page for this blog to ease the whole process of commenting and sharing stuff.
Keep Simplifying,
Sunny