Smoothing Out Square Pixels


Chris Durham


Interesting article in Wired about how the inventor of the pixel (or one of them at least) is addressing the problem of square pixels 50 years later.

 

http://www.wired.com/wiredscience/2010/06/smoothing-square-pixels

 

The low-resolution examples given are compelling, and I'd be interested to see how this might translate into more natural-looking results at higher resolutions.


Yes, I mentioned to Jim Jannard a few months ago that because his Red camera uses square pixels it will never be capable of capturing a natural-looking image, and needless to say Jim got very defensive, as if I was attacking him personally. Although my criticism was frank, it was nevertheless constructive. However, the scientific research needed to alter the pixel structure in favor of a more exotic tile mosaic would probably be very costly, so it would be more cost-effective to suppress a major advancement in technology than to embrace it. In the meantime, film, with its random grain pattern, captures a more natural-looking image than digital.


er...

 

I'm sorry, but where in that article was a solution to square pixels addressed? (Other than the inferred 'more resolution' = the solution.)

 

:huh:

 

They're talking about enhancement - a discussion that sounds eerily similar to the one around compression, which started pretty much straight after that photo of the baby was taken.

 

Bottom right image - square pixels

 

pixels.jpg

 

Most compression and enhancement algorithms work very well under certain conditions but then fail terribly once 'off axis'. Busy images like that of a person's face will hide these failures, and also take advantage of the fact that our own sensors (eyes and brain) are programmed to add more emphasis to things that are (or at least were) important to our survival - edges are a big factor, as is facial recognition...

 

piff :rolleyes:


But it can be very inefficient to use more square pixels. With square pixels you need a 6 x 6 block of 36 subpixels when 2 triangle-shaped pixels (a 45°-45°-90° and a 30°-60°-90°) can do the job. But this may not be a problem, because if you shot an image using a sensor of 36 million subpixels, or 8K x 4K, the compression engine would only have to handle a 2K image. So Jim Jannard could easily introduce his 8K Epic camera with a compression algorithm that's less computationally intensive than his present 4K Red camera.
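To make the idea concrete, here is a minimal Python sketch of emulating triangular "pixels" on a block of square subpixels. It's purely illustrative and simplified - it splits the block along its diagonal into two right triangles rather than the 45°/30° pair described above, and the function name is my own:

```python
# Emulate two triangular "pixels" on a 6x6 block of square subpixels:
# subpixels on or above the diagonal take one grey value, the rest
# take the other. Illustrative sketch only, not any camera's scheme.
def triangle_block(value_a, value_b, n=6):
    """Return an n x n block whose subpixels carry only two grey values."""
    return [[value_a if col >= row else value_b for col in range(n)]
            for row in range(n)]

block = triangle_block(200, 50)
flat = [v for row in block for v in row]
# 36 square subpixels, but only two independent brightness values -
# which is the claimed inefficiency of squares versus triangles.
```

The point the sketch makes is that the 36 subpixels describe only the boundary between the two triangles; they carry just two independent values.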


If you look at the image carefully, the enhanced image does not have more pixels but rather more subpixels.

 

It's a nomenclature issue, that's all - by 'pixels' you mean original-information 'pixels' (?), and 'subpixels' are the newer generation required to represent the triangles/whatever (?), still presented on a square grid mind :rolleyes: and pulled out of thin air. The apparent increase in information is nothing but a trick of visual perception; if the same trick were used on, say, text, it would likely be a complete garble... Notice how the straight line of contrast under his left ear, between the shadow of his collar and his neck, has been rendered into a zigzag? FAIL

 

Anyway, back to triangles - OK, so now we have triangles (right triangles? and at what orientation and tiling with respect to the aspect and neighbouring, um, 'trixels'?). Imagine they are just splitting the normal pixels in two - what about 22.5° angles in the image then? And so on...

 

Not sure if there is a geometry that doesn't fail in some situation - maybe use a randomly regenerating/repositioning 'pixel'?

 

like... film grain ? :lol:

 

My original gripe was targeting the Wired article - they used the before and after pictures a little cheekily... I think we're mostly on the same page here in understanding, though.

 

I thought you were a temporal res junkie Thomas - what gives ! heh heh


To be classified as a pixel, each pixel would have to have its own independent brightness or grey value. Since these groups of subpixels all share the same grey value, they are classified as subpixels, and it takes 18 subpixels to make a single pixel. Subpixels enhance picture quality by forming different shapes, such as triangles, that match the contours and curves of real life better than squares can. The result is a more natural-looking image.

 

When discussing temporal resolution we introduce the concept of a third dimension - actually the fourth dimension, but since our picture is flat, time can be considered the third. With this third dimension our square pixels become cubical boxes, and each frame sequence is actually a pallet of boxes stacked on top of each other. Each box is called a voxel rather than a pixel. With long-GOP MPEG-2 compression (a 12-picture group of pictures) we have 2 I-frames each second, which means the background, which does not move, is effectively shot at 2 frames per second while the action is shot at 24 frames per second. Thus for the background a voxel is composed of 12 cubical subvoxels stacked on top of each other, forming a column. If triangular pixels are used, this translates into triangular prisms when temporal resolution is considered. However, the matrix of the space-time continuum can be divided in any way desired, and pyramids or cones can be used as voxels.
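The background-column idea above can be sketched in a few lines of Python. This is a toy model (not actual MPEG-2): a clip is a T x H x W grid, and a pixel whose value never changes over the 12-frame GOP can be described by a single voxel with a temporal extent, while a moving pixel needs one voxel per frame:

```python
# Toy model of the "voxel column" idea: a static background pixel's
# 12-frame temporal column collapses to one voxel; a changing pixel's
# column cannot. Names and structure are illustrative only.
T = 12
clip = [[[128, 128], [128, 7 + t]] for t in range(T)]  # 2x2 frames; one pixel changes

def column(clip, y, x):
    """Return the temporal column of values at pixel (y, x)."""
    return [frame[y][x] for frame in clip]

static_col = column(clip, 0, 0)   # background: constant over the GOP
moving_col = column(clip, 1, 1)   # action: a new value every frame
# One voxel of depth 12 suffices for the static column:
depth = 1 if len(set(static_col)) == 1 else len(static_col)
```

The same bookkeeping generalises to any voxel shape; only the way the column is partitioned changes.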



I know about voxels - I made a voxel reslice program earlier this year where you can animate and deform the cutting plane, using a plane in Maya as the GUI - reslicing/voxel sorting done in Matlab:

 

 

Works just as well with video as it does with, say, CT scan/DICOM data - as you know, apart from the data container, compression, etc., there is zero real distinction between video pixels and voxels (3D, medical, whatever).
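A minimal Python sketch of that reslicing idea (this is only an illustration of the concept, not the Maya/Matlab program described above): treat the video as a T x H x W volume and cut it along a different plane - here a fixed column x - to get a height-vs-time image, the classic slit-scan:

```python
# Reslice a voxel volume: cutting a T x H x W video along a fixed
# column yields an H x T image (rows = y, cols = time). Hypothetical
# helper for illustration.
def reslice_column(volume, x):
    """Return the H x T slice of the volume at column x."""
    T, H = len(volume), len(volume[0])
    return [[volume[t][y][x] for t in range(T)] for y in range(H)]

# 4 frames of 2x3 "pixels" whose value encodes frame and row:
volume = [[[t * 10 + y for _ in range(3)] for y in range(2)]
          for t in range(4)]
slice_yt = reslice_column(volume, 0)
```

Animating or deforming the cutting plane, as in the program described, is just a generalisation of picking which (t, y, x) samples land in each output pixel.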

 

I have made an audio version of it, which conceptually is a combination of vocoding and wavetable synthesis - and have also figured out a way to add another dimension/axis to the voxel/video slit-scan effect, even cooler than using it on 3D footage (which is interesting - you need temporal parallax as much as spatial, weird huh). More work required however ;)

 

Help me understand - what use is a subpixel if it doesn't have its own independent brightness or grey value (which would define it as a pixel again)? If a triangular subpixel had the same value as the one next to it, you wouldn't see it :huh: ?? Perhaps you mean they are only used by the algorithms?

Edited by Chris Millar

However, the matrix of the space-time continuum can be divided in any way desired, and pyramids or cones can be used as voxels.

 

Hmmm, how do you create a symmetric system in 3D other than with cubes?

 

Equilateral pyramids are half squares - but the symmetry is off axis ...

 

Tetrahedrons array nicely - but are offset from each other, so won't fit into a square repeating system.

 

Cones - interesting. How about double-ended cones? So the image intensity would relate to the area of the last pixel's info as it truncated less and less, all the while the new frame's cone is truncating more and more, until it reaches its maximum pitch, and so on...

 

I'm not thinking too hard - feel free to correct me !


Subpixels have value even if they do not have independent brightness or grey-level values, because by grouping they can be arranged in shapes that more naturally conform to the contours of reality. Square pixels, on the other hand, would be great if people looked like blockheads or SpongeBob SquarePants, but we know this is far from the case. That's why, if you are an artist assembling tile mosaics, you have an advantage if you have access to other shapes besides squares.

 

On the other hand, if you are working in three dimensions you are not assembling tiles but rather Lego blocks. However, other shapes are available besides cubes. Triangle-based pyramids complemented with square-based pyramids can fill all space, but tetrahedrons alone cannot. A cube can also be assembled from 6 square-based pyramids.

 

A balloon that is inflated can be represented by a single voxel in the shape of a cone or a hexagon-based pyramid. However, the same image would take 108 times the resolution if you used cubical voxels.


Subpixels have value even if they do not have independent brightness or grey-level values, because by grouping they can be arranged in shapes that more naturally conform to the contours of reality.

 

Yes, but if you cannot see the boundary between them, as they have no distinguishing value, then they are redundant!?

 

To 'arrange them in shapes that conform to the contours of reality' you have to assign them values - or how do you see them? In which case they are now defined as pixels (by your definition)...

 

Please think about my point and address it rather than repeat yourself...

 

Now - back to voxels - yes, you're right about tetrahedrons:

 

Oct-tet%201.gif

 

took a bit of mental visualisation to see it

 

Using 8 equilateral pyramids per square results in 4 pyramids that occupy more 'bulk' temporal subspace - so if the data was interpolated you've now got double the number of frames to deal with as well - heh heh, not an issue for you 50fps lovers... However, if it were recorded that way in the first instance you'd be recording images with a 1-pixel offset between each frame. Imagine the resulting aliasing buzz on high-frequency spatial information - hmmm, maybe it'd actually work in favour of minimising it (?)

 

The balloon - got a picture of that?

 

The issue with pixel shapes/compression algorithms and this kind of carry-on is that there will always be an ideal case that can be trundled out and used to prove the superiority of whatever system you're trying to sell - observe:

 

A cube can be represented perfectly by one cubic voxel - therefore cubic voxels are the best option



I think you guys have gotten hold of the wrong end of the stick, and so has the author of that article.

 

In old-fashioned non-compressed digital imaging systems, you had to somehow convey, transmit and/or store two sets of information for each pixel: its location on the screen, and the brightness (and colour) that it appears in that location.

Compression systems such as JPEG use various mathematical tricks to reproduce an approximation of the original set of pixels, but in the end, at some point, the original raster of pixels has to be reproduced.

 

Two things remain immutable: if you want, say, a 1920 x 1080 pixel source raster to be accurately reproduced on a 1920 x 1080 display, you have to start off with at least a 1920 x 1080 pixel imaging device and at least a 1920 x 1080 pixel display device.

 

What Kirsch is talking about has nothing to do with the physical shape of the display pixels; he's really talking about new methods of compression. In his concept, instead of individually defining each of the 2,000,000-odd pixels in terms of location and emission or reflectance values, the compressed data consists of formulas defining irregular areas with the same brightness/colour values, and then the values they contain.
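A toy Python sketch of that concept - store a short list of (region, value) pairs instead of one value per pixel, then paint them back onto the raster. This is a hypothetical format of my own for illustration, not Kirsch's actual scheme; each "region" is just a set of coordinates:

```python
# Region-based coding in miniature: the compressed data is a list of
# (coordinates, value) pairs; decoding paints each irregular region
# onto a fixed raster. Illustrative only.
def decode(regions, h, w, background=0):
    """Rebuild an h x w raster from (coordinate-set, value) pairs."""
    raster = [[background] * w for _ in range(h)]
    for coords, value in regions:
        for r, c in coords:
            raster[r][c] = value
    return raster

# One irregular bright area covering three of four pixels:
regions = [({(0, 0), (0, 1), (1, 0)}, 9)]
img = decode(regions, 2, 2)
```

The saving comes when regions are large and few; in the worst case (every pixel different) the region list degenerates to one entry per pixel, and at the end a fixed raster is still produced.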

 

This is not exactly new, since it forms a large part of the GIF image compression process, and is one of the algorithms used in MPEG-4. You can sometimes see the effect of this in heavily compressed MPEG-4 (DivX in particular), in the so-called "wet paint" error, where a moving object or person accidentally picks up a blob of spurious colour and carries it around.


I think you guys have gotten hold of the wrong end of the stick, and so has the author of that article.

 

I understand it, Keith - or at least I'd like to think I do (in my own way). It's just that we've gone off topic, so it's understandable that you'd think that...

 

That being said the original article was having a hard time keeping on topic itself

 

Oh yeah, and on occasion I enjoy (and provoke) Thomas. I was really just pushing to get some comment on my little voxel reslice program at that Vimeo link - ha ha :P

 

Edited by Chris Millar

What Kirsch is talking about has everything to do with the shape of the pixel. However, cameras and display devices with exotic pixel shapes only exist in the laboratory and are not commercially produced. Therefore triangular pixels can only be emulated using 6 x 6 blocks of square subpixels. What one must remember is that each of these groups of subpixels forming a triangular pixel does not produce 2 dimensions of information, but rather only describes the shape of the 1-dimensional perimeter of a 2-dimensional object and ignores what is inside it.


the compressed data consists of formulas defining irregular areas with the same brightness/colour values, and then the values they contain

Doesn't that generalise to more or less the same meaning as "image compression"?


Yes, but the crude forms of MPEG compression use macroblocks, which can be up to 16 x 16 and are always squarish in shape, and they look horrible when the picture goes to hell due to over-compression. What we have instead is the concept of macrotriangles, which, like macroblocks, can take advantage of the fact that in many cases adjacent pixels have the same values, yet conform much better to the natural contours of reality than any square pixel can.
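The macroblock failure mode being described can be shown with a few lines of Python. This toy flattens each square block to its mean value - a crude stand-in for what over-compression does, not an actual MPEG codec - and an edge that crosses a block turns into a visible square step (block size is a parameter; 2 x 2 here to keep the example small):

```python
# Replace each b x b block of an image with its mean value, mimicking
# the blocky look of over-compressed square macroblocks. Toy example.
def flatten_blocks(img, b):
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for by in range(0, h, b):
        for bx in range(0, w, b):
            vals = [img[y][x] for y in range(by, by + b)
                              for x in range(bx, bx + b)]
            mean = sum(vals) // len(vals)
            for y in range(by, by + b):
                for x in range(bx, bx + b):
                    out[y][x] = mean
    return out

img = [[0, 0, 100, 100],
       [0, 0, 100, 100],
       [0, 100, 100, 100],
       [0, 100, 100, 100]]          # a diagonal-ish edge
blocky = flatten_blocks(img, 2)     # edge smeared to a grey square
```

The bottom-left block straddles the edge and collapses to a single grey value - exactly the squarish artifact complained about above; a partition whose shapes followed the edge would avoid it.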




Doesn't that generalise to more or less the same meaning as "image compression"?

It's one of many algorithms/techniques used.

 

If you want to be pedantic about it, interlace is a form of image compression.

So is the run-length coding used by early space probes and fax machines.
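For anyone unfamiliar with it, fax-style run-length coding fits in a few lines of Python - a 1-bit scanline is stored as alternating run lengths (white first, by the usual convention) instead of one value per pixel. A minimal sketch, not the actual ITU fax codes:

```python
# Run-length coding in miniature: encode a 1-bit scanline as
# alternating white/black run lengths, white run first.
def rle_encode(scanline):
    runs, current, count = [], scanline[0], 0
    if current == 1:
        runs.append(0)       # leading zero-length white run if line starts black
    for bit in scanline:
        if bit == current:
            count += 1
        else:
            runs.append(count)
            current, count = bit, 1
    runs.append(count)
    return runs

line = [0, 0, 0, 1, 1, 0, 0, 0, 0]   # 3 white, 2 black, 4 white
runs = rle_encode(line)              # -> [3, 2, 4]
```

The runs always sum back to the line length, so decoding is exact; the scheme wins whenever runs are long, which is why it suited mostly-white fax pages.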

 

The point that seemed to have gotten lost somewhere is that no matter what sort or shape of pixel winds up on the viewing device, it's still going to have to be made up of smaller, fixed-position pixels.

 

It's rather like wavelet compression. No matter how much highfalutin mathematical jiggery-pokery is invoked by certain people (who I strongly suspect wouldn't have a hope in hell of writing a two-page overview of how it actually works - that is to say, an actual explanation, not yer average clod high-school student's homework assignment/Wikipedia entry consisting almost entirely of endless buzzwords and unexplained and almost completely inaccessible references), at the end of the chain your damned wavelets still have to be defined as a raster of fixed, immobile pixels!

 

Horrible, sharp-edged, finger-slicing, artistic-verisimilitude-destroying square (or rectangular) PIXELS!

Edited by Keith Walters

...

 

The point that seemed to have gotten lost somewhere is that no matter what sort or shape of pixel winds up on the viewing device, it's still going to have to be made up of smaller, fixed-position pixels.

 

...

 

Horrible, sharp-edged, finger-slicing, artistic-verisimilitude-destroying square (or rectangular) PIXELS!

 

Post number three - I trumped you both in time and by using bold :rolleyes: :lol:

