
Let's talk about linear to log, A-to-D in digital cameras


Charles Zuzak


Digital cameras can do some amazing things nowadays, considering where they were just five years ago. One thing I sometimes struggle to understand is how these newer cameras with 13+ stops of dynamic range are actually quantizing that information in the camera body.

 

One thing we know from linear A-to-D quantization is that your dynamic range is a function of the number of bits of the converter chip. A 14-bit ADC can store, at best (and ignoring noise for the moment), 14 stops of dynamic range. However, when we do introduce noise into the mix (sensor, charge transfer, ADC, etc.) and linearity errors, there really aren't 14 meaningful stops of dynamic range. I did a lot of research on pipeline ADCs (which I believe are the correct type used) and the best one I could find, as defined by the measured ENOB (effective number of bits), was the 16-bit ADS5560 ADC from Texas Instruments; it measured an impressive 13.5 bits.
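
For reference, the usual textbook relation between a converter's measured SINAD and its ENOB is ENOB = (SINAD - 1.76 dB) / 6.02, and with straight linear quantization each effective bit is roughly one usable stop. A quick Python sketch with illustrative numbers only:

# Standard ideal-ADC relation: SINAD(dB) = 6.02 * N + 1.76, so
# ENOB = (SINAD - 1.76) / 6.02. The input figure below is illustrative only.
def enob(sinad_db):
    return (sinad_db - 1.76) / 6.02

# e.g. a 16-bit converter measuring about 83 dB SINAD:
print(enob(83.0))   # ~13.5 effective bits, i.e. roughly 13.5 usable stops linearly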

 

If most modern cameras, the Alexa especially, are using 14-bit ADCs, how are they deriving 14 stops of dynamic range? I read that the Alexa has some dual-gain architecture, but how do you simultaneously apply different gain settings to an incoming voltage without distorting the signal? A pretty good read-through of this technology can be found in this Andor Technology Learning Academy article. Call me a little skeptical if you will.

 

Not to pick on RED, but for the longest time, they advertised the Mysterium-X sensor as having 13.5 stops (by their own testing). Of course, many of the first sensors were used in RED One bodies, which only have 12-bit ADCs. Given that fact, how were they measuring 13.5 in the real world?

 

Now, with respect to linear to log coding, some cameras are opting for this type of conversion before storing the data on memory cards; the Alexa and cameras that use Cineform RAW come to mind. If logarithmic coding is understood to mean that each stop gets an equal number of values, aren't the camera processors (FPGA/ASIC) merely interpolating data like crazy in the low end?

 

Let's compare one 14-stop camera that stores data linearly and one that stores data logarithmically:

 

In a 14-bit ADC camera, the brightest stop is represented by 8192 code values (8192 through 16383), the next brightest is represented by 4096 code values (4096 through 8191), and so on and so forth. The darkest stop (-13 below) is only represented by 2 values (1 and 0). That's not a lot of information to work with.

 

Meanwhile, on our other camera, each of the 14 stops would get ~73 code values (2^10 = 1024, divided equally among 14 stops) if we assume there is a 14-bit to 10-bit linear-to-log transform. As you can see here, the brighter stops are more efficiently coded, because we don't need ~8000 values to see a difference, but the low end gets an excess of code values when there weren't very many to begin with.
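
A rough Python sketch of that comparison (my own arithmetic, not any camera's actual pipeline):

# Code values available per stop: straight 14-bit linear storage vs. an
# idealized 14-bit -> 10-bit log encoding that gives every stop an equal share.
LINEAR_BITS = 14
LOG_BITS = 10
STOPS = 14

linear_per_stop = [2 ** (LINEAR_BITS - 1 - s) for s in range(STOPS)]   # 8192, 4096, ... 1
log_per_stop = (2 ** LOG_BITS) / STOPS                                 # ~73 for every stop

for s, lin in enumerate(linear_per_stop):
    print(f"stop -{s:2d}: linear {lin:5d} values, log ~{log_per_stop:.0f} values")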

 

So I guess my question is, is it better to do straight linear A-to-D coding off the sensor and do logarithmic operations at a later time or is it better to do logarithmic conversion in camera to save bandwidth when recording to memory cards?

 

Panavision's solution, Panalog, shows the relationship between linear light values and logarithmic code values after conversion in this graph:

 

[Image: Panalog curve relating linear light values to logarithmic code values]

 

On a slightly related note, why do digital camera ADCs have a linear response in the first place? Why can't someone engineer one with a logarithmic response to light, like film? The closest thing I've read about is the hybrid LINLOG Technology at Photon Focus, which seems like a rather hacky approach.

 

If any engineers want to hop in here, I'd be much obliged, especially if your name is Alan Lasky, Phil Rhodes, or John Sprung; that is, anyone with a history of technical knowledge on display here.

 

Thanks.

Edited by Charles Zuzak


  • Premium Member

Technically, the numbers aspect is above my knowledge range; however, I have had an editing studio with a waveform monitor and vectorscope set up for a long time, and I have always been challenged by optimizing the lower-end IRE/black range. It's amazing to see detail emerge by simply raising the black level just enough without actually fogging the image.

 

I don't have numbers to back me up, but I have always felt that not enough data was being allocated to the zero-to-10 IRE range. One can take a signal that has 100 IRE and dial it down to 50 IRE and still have a nice image, but if the lower-end blacks are off by just one or two IRE, tonal qualities within the black spectrum just disappear.


  • Premium Member

Then there are issues of what lens magnification we're talking about, and key light versus backlight. Contrast values for telephoto are different from those for wide angle, and then add in whether the shot is key-lit or backlit.

 

It seems to me that digital cameras don't do quite as good a job on backlit, wide-angle scenes as film does, especially if there is high contrast in the lighting and in the actual colors on the set. But if the shot is zoomed in, the difference in quality between film and video becomes much less noticeable.


  • Premium Member

 

 

One thing we know from linear A-to-D quantization is that your dynamic range is a function of the number of bits of the converter chip. A 14-bit ADC can store, at best (and ignoring noise for the moment), 14 stops of dynamic range.

 

I'm not quite sure where people get this idea.

 

The dynamic range of the camera is completely unconnected to the bit depth of the ADC. Yes, certainly, the ADC will affect the precision with which the information is stored, and an extremely inadequate ADC might give you less than one code value per stop, but there is nothing stopping anybody quantising 20 stops of dynamic range into an 8-bit image. It might be difficult to use, depending on the gamma function applied, but there is nothing that makes it intrinsically impossible.
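
For instance, a crude sketch (purely illustrative, not any real camera's transfer function): a pure log curve will happily put 20 stops into 8 bits; you just get very few code values per stop.

import math

STOPS = 20       # scene dynamic range to encode
CODES = 2 ** 8   # 8-bit output

def encode_log(linear, black=2 ** -STOPS):
    # Map a linear scene value in (0, 1] to an 8-bit code, equal values per stop.
    linear = max(linear, black)
    stops_below_clip = -math.log2(linear)               # 0 at clip, 20 at black
    return round((1 - stops_below_clip / STOPS) * (CODES - 1))

print(encode_log(1.0))       # 255 (clipping)
print(encode_log(0.5))       # 242 (one stop down)
print(encode_log(2 ** -20))  # 0   (20 stops down)
# Each stop gets 255 / 20, i.e. about 13 code values: coarse, but not impossible.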

 

P


 

I'm not quite sure where people get this idea.

 

The dynamic range of the camera is completely unconnected to the bit depth of the ADC. Yes, certainly, the ADC will affect the precision with which the information is stored, and an extremely inadequate ADC might give you less than one code value per stop, but there is nothing stopping anybody quantising 20 stops of dynamic range into an 8-bit image. It might be difficult to use, depending on the gamma function applied, but there is nothing that makes it intrinsically impossible.

 

P

If we're doing straight linear-light capture, the dynamic range of the camera is limited by the number of bits in the ADC. If you applied a gamma curve, then you could stretch more stops into the same bit depth.


The camera's sensor captures photons, which are mapped to digital values in a 1:1 (linear) fashion. Hypothetically speaking, if 1024 photons hit the sensor, the camera records a code value of 1024. If half that amount of light hits the sensor, it is recorded as 512.

 

http://provideocoalition.com/aadams/story/log-vs.-raw-the-simple-version

 

Is this a rhetorical game you're playing?


 

 

 

No.

 

The issue is that no sensor works that way, let alone any actual camera. Curves and manipulation are always part of the issue.

 

Could you be more specific? What does RED do? I've read time after time that they are doing linear light capture and prefer to do all image manipulation in post. After all, that's what lets them get away with "color science" updates years after the footage has been shot.


  • Premium Member

No sensor has an output that's completely linear to the number of photons that hit it, unless you're willing to consider the additional hardware that's often included on CMOS sensors as "part of the sensor", which you probably shouldn't for the sake of this sort of discussion.

 

But that's not the point. The recording bit depth of a camera does not, in any practical reality, have any effect on the dynamic range of the sensor.


No sensor has an output that's completely linear to the number of photons that hit it, unless you're willing to consider the additional hardware that's often included on CMOS sensors as "part of the sensor", which you probably shouldn't for the sake of this sort of discussion.

 

But that's not the point. The recording bit depth of a camera does not, in any practical reality, have any effect on the dynamic range of the sensor.

 

What about coding the linear data to log? Aren't you interpolating data in the low IRE end?


  • Premium Member

Possibly, but this is exactly the sort of thing I mean when I talk about processing happening between sensor data and the recording medium.

 

Ultimately, anybody who's got a 14 bit recording medium and a 15 stop camera, and claims that it isn't enough bits, might just not be using them correctly. Which is where gamma encoding (of which log encoding is a subset) comes in.

 

P


Possibly, but this is exactly the sort of thing I mean when I talk about processing happening between sensor data and the recording medium.

 

Ultimately, anybody who's got a 14 bit recording medium and a 15 stop camera, and claims that it isn't enough bits, might just not be using them correctly. Which is where gamma encoding (of which log encoding is a subset) comes in.

 

P

 

I already understood that. But RED claims to record linear light data to their SSD mags. All of the looks you can give the camera are just LUT metadata. If the RED One did 12-bit linear recording, wasn't the Mysterium-X chip hampered? (I'm not counting applying a LUT and then recording out the HD-SDI port.)

 

How could you design a high-pass and low-pass filter for the incoming voltage to send off to two different ADCs? Could you have the top seven stops converted by one ADC and the bottom seven stops by another ADC? By my math, each of the "lowest" stops down would be coded with 128 values of precision, which is enough meaningful data if you did a log transform.
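
Something like this toy model is how I imagine a dual-gain readout working (purely hypothetical numbers, not Arri's actual circuit): the same charge is read through a high-gain path, which quantizes the shadows finely but clips early, and a low-gain path, which covers the highlights, and the two samples are stitched together.

# Toy model of a dual-gain readout. Hypothetical numbers throughout;
# this is just the general idea, not any manufacturer's implementation.
ADC_BITS = 14
FULL_SCALE = 2 ** ADC_BITS - 1
GAIN_RATIO = 4        # hypothetical gain difference between the two paths

def read_dual_gain(signal_electrons, electrons_at_clip=60000):
    # Both paths digitize the same charge; the high-gain path clips 4x sooner
    # but quantizes the shadows 4x more finely.
    low = round(min(signal_electrons / electrons_at_clip, 1.0) * FULL_SCALE)
    high = round(min(signal_electrons * GAIN_RATIO / electrons_at_clip, 1.0) * FULL_SCALE)
    if high < FULL_SCALE:
        return high / GAIN_RATIO   # shadows and midtones, quarter-code precision
    return float(low)              # highlights from the low-gain path only

# Two 14-bit reads combined this way act like one linear read with roughly
# log2(GAIN_RATIO) = 2 extra bits, i.e. about 16 bits.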

Edited by Charles Zuzak

  • Premium Member

I think you are getting a bit wrapped up in marketing language. What RED is trying to say is that NO particular curve is "baked in" to the image, and that they are recording just the sensor data* which you can manipulate in post.

However, on the REDs, this sensor data is of course compressed, which has its own set of issues, but in essence they are just saying, "we're giving you what the sensor sees."

 

 

*Even this sensor data will have its own "look," however, based on how the sensor is actually designed, down to things such as what type of dyes they're using on their picture elements, the circuitry, how deep their "well" is, etc.


Linear-light recording means that a doubling of the number of photons doubles the recorded code value.

 

An 8-bit ADC can record 256 code values (2^8 = 256). Eight stops of light means the light intensity has doubled eight times. Unless we apply a power/gamma or log curve prior to quantization, isn't the dynamic range stuck at the bit depth?


  • Premium Member

That doesn't really answer my question. Isn't the light hitting the sensor converted into a voltage or some other form of analog electrical signal? Then why, when it is digitized, can't you choose how much voltage is assigned to each bit? The image itself isn't made up of whole discrete stops; the luminance varies in intensity continuously.


That means manipulating the voltage in the analog realm. I've done some reading up on that, and it sounds like it's a noisy operation. Most engineers (I know this is a generalization) recommend doing all image manipulation in the digital realm (read: after A-to-D quantization).

 

Of course, if we did do analog operations, then I believe you would be correct and this would be moot.

 

However, Sony has gone down the path of linear-light capture, as they've touted with the F65 and F5X series. Why do you think they've gone that way?


  • 2 years later...

No sensor has an output that's completely linear to the number of photons that hit it, unless you're willing to consider the additional hardware that's often included on CMOS sensors as "part of the sensor", which you probably shouldn't for the sake of this sort of discussion.

 

But that's not the point. The recording bit depth of a camera does not, in any practical reality, have any effect on the dynamic range of the sensor.

 

(Let me dig up some old topics.)

I disagree, and would go as far as to say that all camera sensors respond in a linear way to photons. And so, yes, a camera can't have more stops of dynamic range than the bit depth of its ADC.

Example for 10-bit:

C (clipping stop): 1023 (recorded code value)

C-1: 512

C-2: 256

C-3: 128

C-4: 64

C-5: 32

C-6: 16

C-7: 8

C-8: 4

C-9: 2

C-10: 1

 

The last stops being quite unusable (only a couple of code values for each stop), you can say the usable DR is always less than the bit depth of the camera.

The Alexa has 14 stops on 12-bit? No, the Alexa reads out twice at 14-bit, creating a 16-bit file before it is downscaled to 12-bit.

 

These numbers also explain why the log format exists.

On a log (stops of light) graph, the numbers above create a 2^x curve (each number being double the one before it).

And log(2^x) = a*x, so you're back to a straight line, which makes the file easier to work with (that's why ARRIRAW files are always displayed with a LogC curve).
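
A quick sketch of what I mean (nothing camera-specific): take log2 of the doubling series above and you get back evenly spaced steps, one per stop.

import math

# Top-of-stop code values from the 10-bit example above: each is double the last.
linear_tops = [1023, 512, 256, 128, 64, 32, 16, 8, 4, 2, 1]

for stops_below_clip, value in enumerate(linear_tops):
    # log2 of a doubling series is a straight line: roughly one unit per stop.
    print(f"C-{stops_below_clip}: linear {value:4d} -> log2 {math.log2(value):5.2f}")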

Edited by Tom Yanowitz

We can use any number of bits to encode any range of light. For example, we could have a camera that is capable of measuring a light range of 100 stops, and yet encode that range as a 1-bit image. The image below encodes a greyscale image (of unknown range) with only 1 bit per pixel (each pixel is either black or white).

 

[Image: greyscale photograph encoded as a 1-bit (black or white) image]

 

 

When I measure the light coming off my computer screen with a light meter I measure a total range of 9.5 stops, but there are only 8 bits (0 to 255) being used to drive that range.

 

Perhaps more important is the curve defining the encoding between the brightness of the light and the stored pixel value. We note that, for the same reason we use a log scale ("stops") to describe light, we'd ideally encode a camera image in the same way, using a log scale. It accords with the way we see. The difference we see between 32 photons and 64 photons is the same as the difference we see between 128 photons and 256 photons. So for efficiency (optimum file size) we want to encode the image in the same way: as a log scale.
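
A trivial check of that equal-ratio point in Python:

import math

# Equal ratios of light are equal distances in stops.
print(math.log2(64) - math.log2(32))     # 1.0 stop
print(math.log2(256) - math.log2(128))   # 1.0 stop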

 

Floating-point numbers on a computer are internally encoded in much the same way (an exponent plus a mantissa).

 

C

Edited by Carl Looper

  • Premium Member

At the level of the front-end ADC, assuming a linear sensor, yes. But that's a pretty theoretical situation.

 

CCD sensors are actually quite linear, but few modern cameras use them. CMOS sensors tend to be less so, but the issue is generally masked by the associated electronics, and in either case, the absolute linearity of the sensing element is not the point at issue. Effectively all cameras use a higher bit depth internally than they record, so what matters is that at some point we have x stops of dynamic range, generally represented in y bits of data. And yes, invariably, at the initial A/D stage, y > x (y is often 16 or 18, and x is often 12 or 13, arguably). The details of how this is handled are only really interesting to the camera designer.

 

But at the recording stage, y < x in many modern cinematography cameras. At this point, however, the image has been significantly gamma-processed for exactly this reason. This is not simply downscaling to 12-bit. In a mathematically ideal curve, to use your notation, for a notional 13-stop camera, we would end up with this:

 

C: 1023

C-1: 945

C-2: 866

C-3: 787

C-4: 708

C-5: 629

C-6: 550

C-7: 471

C-8: 393

C-9: 313

C-10: 234

C-11: 155

C-12: 76

C-13: 0
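
That list is just an equal allocation of the 10-bit range across the 13 stops; a quick sketch of the arithmetic (a pure log allocation, ignoring the toe and shoulder a real curve would have):

STOPS = 13
MAX_CODE = 1023   # 10-bit output

for n in range(STOPS + 1):
    # Equal code values per stop: each stop gets 1023 / 13, about 79 codes.
    code = round(MAX_CODE * (STOPS - n) / STOPS)
    print(f"C-{n}: {code}")
# Prints 1023, 944, 866, 787, ..., 79, 0 (within a few code values of the list above).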

 

I think we both understand each other, but it's as well to be clear. Very rarely is anything handled in linear light.

 

Edit: Dithering of course is another completely different issue! Temporal dithering, call it error diffusion, is also a factor.

 

 

 

When I measure the light coming off my computer screen with a light meter I measure a total range of 9.5 stops, but there are only 8 bits (0 to 255) being used to drive that range.

 

Yes, because sRGB is nowhere near linear.

 

 

 

P

