Video without Pixels - the debate

Written by David Shapton

Celia Hill/RedShark: Video without pixels

At the beginning of December, we published this highly speculative article about how pixels might eventually be replaced by something better. When? Quite possibly before 8K becomes the norm. It made quite a stir on the internet, bringing record traffic to RedShark. Here's our follow-up to some of the questions raised.

The article attracted a lot of readers, but, after a few hours, the numbers shot up rapidly. Someone had posted a link to the article on Reddit. This prompted a massive flow of new visitors to RedShark. We had our biggest day ever.

If you haven't heard of Reddit (and the world is divided between those who have and those who haven't) it's a place where people post links to their favourite articles, and readers upvote or downvote them in real time. It's a very good way to see what's creating a buzz on the internet.

There's a busy comments system as well. In fact, Reddit very largely is the comments, with long and meandering threads quite often containing absolute gems of wisdom and a fair dose of abuse as well.

If you want to have a look at the original thread, it's here. But meanwhile, I've taken some of the more interesting posts and replied to them here. They weren't all positive! 

(Reddit questions in Blue, our answers in Black)

He's only talking about abolishing pixels in codecs. It's a bad title, I agree. I was expecting something about vector-based displays, honestly.

Yes, I was only talking about codecs here, but it follows that sensor and display technology should be considered as well. It's not obvious how this could be done but as computer power continues to increase exponentially (through parallelism mainly - i.e. multi-processors) other techniques which might have seemed impossible will suddenly come into view.

For example, one Reddit commenter suggested Neural Nets. After all, this is essentially how we make sense of the world, so why shouldn't a camera?

The sheer number of vectors you'd have to have to recreate a reasonably detailed image would have to be immense. And it seems unlikely that they'd be substantially smaller than video we have now.

Also, would it not require all-new sensors in digital video cameras? Because current hardware is all made with raster video in mind. That would be ridiculously expensive to replace the equipment, if it was even possible to make "vector sensors" for cameras.

Actually, the number of vectors you'd need doesn't really matter - although there will be real-world restrictions on the ultimate number. This was never exclusively an exercise to reduce bandwidth - but that would sometimes occur. Instead, it's an attempt to break away from the restrictions that pixels place on us when we try to represent the world. On the other hand, pixels give us great freedom too, because, without having to think too hard about it, we can digitise and reproduce virtually anything with relative ease (except for certain types of pullovers).

It's just the classic problem of spending resources on computation or storage/transmission. I personally doubt vectorizing recorded video will ever really be worth it at a frame-level, where you are looking at the 2D image and computing the vectors to represent it - but the trade-off there is the computational cost of building the vector models at the source, and interpreting and displaying them at the destination, vs the storage and transmission cost of the raster video data at high resolutions.

BUT if you don't consider the vectors as just modelling the 2D data though, things can be much more efficient - every 3D game today is effectively vectorized video, but with the vectors modelling at the scene level instead of at the frame level - instead of looking at a frame of a complex battle-scene and computing all the vectors needed to show all the shading and the detail, you more intelligently model each object in the scene, texture them (which is still a mostly raster process), model the lighting, and then just ship that whole vector model of the scene and its animations. That can be done pretty efficiently, and rendered pretty efficiently by GPUs that are only getting better at it - however the drawback is that you need a human to do the modelling, it's not something computers are at all good at doing.

There are a lot of good points in this answer. One thing I didn't make clear in the original article - but I did imply it - is that I was never talking about vectorising individual frames. The whole idea is that - whether in 2D or 3D, you essentially create a "model" of the scene. Since it's a model - it's a thing with a known structure and properties - it's much easier to carry information from one frame to another. In fact, frames would be "imposed" on the model at the very last stage - when it needs to be output to a monitor.
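To make the idea of "imposing" frames at the last stage a little more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the MovingDot class, the sample_frames function and the numbers are hypothetical, and a real scene model would be vastly more complicated. The point is only that the model knows nothing about frames until something asks it for them.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class MovingDot:
    """Hypothetical scene model: a single dot moving at constant velocity."""
    x0: float  # start position, in scene units
    y0: float
    vx: float  # velocity, in scene units per second
    vy: float

    def position(self, t: float) -> Tuple[float, float]:
        """Where the dot is at time t (seconds). No frames are involved."""
        return (self.x0 + self.vx * t, self.y0 + self.vy * t)


def sample_frames(model: MovingDot, fps: int, duration: float) -> List[Tuple[float, float]]:
    """Impose frames on the model only when output is needed."""
    count = int(round(fps * duration))
    return [model.position(i / fps) for i in range(count)]


# The same model, output at two different frame rates.
dot = MovingDot(x0=0.0, y0=0.0, vx=2.0, vy=1.0)
frames_24 = sample_frames(dot, fps=24, duration=1.0)
frames_60 = sample_frames(dot, fps=60, duration=1.0)
```

Both lists describe the same motion. The frame rate is a property of the output, not of the recording.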

And yes, the very big drawback with this method (actually, it would make it impossible) is that someone has to - labour intensively - make the models.

But I think we're seeing the first signs of being able to move beyond that. The clip in this article shows a video process that creates a 3D model. And a moving model is a vector video.

So when we're thinking about the encoding process, yes, absolutely a first stage might be to "auto trace" a conventional raster image (and, in a sense, in a fairly crude way, this is what the Long-GOP codecs do now, when they extract motion vectors) but the logical thing would be to go way beyond that and try to "resynthesize" the outside world. This would give us a 3D model that would be output as a raster video.
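For anyone who hasn't seen how a motion vector can be found, here is a rough Python sketch of block matching, which is one common way of estimating motion between two frames. It is a toy under stated assumptions, not how any particular codec does it: the function names, the tiny 8 x 8 frames and the exhaustive search are all chosen for illustration, and real encoders are far more elaborate.

```python
def make_frame(width, height, left, top, size, value=255):
    """Build a black frame (list of rows) with one bright square on it."""
    frame = [[0] * width for _ in range(height)]
    for y in range(top, top + size):
        for x in range(left, left + size):
            frame[y][x] = value
    return frame


def block_sad(prev, curr, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in curr and a shifted block in prev."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(curr[by + y][bx + x] - prev[by + y + dy][bx + x + dx])
    return total


def find_motion_vector(prev, curr, bx, by, size, search):
    """Exhaustively search prev for the offset that best matches the block in curr."""
    height, width = len(prev), len(prev[0])
    best = (0, 0)
    best_cost = block_sad(prev, curr, bx, by, 0, 0, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidate blocks that fall outside the previous frame.
            if not (0 <= bx + dx and bx + dx + size <= width):
                continue
            if not (0 <= by + dy and by + dy + size <= height):
                continue
            cost = block_sad(prev, curr, bx, by, dx, dy, size)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best


# The square sits at (2, 2) in the previous frame and at (4, 3) in the current one.
prev_frame = make_frame(8, 8, left=2, top=2, size=2)
curr_frame = make_frame(8, 8, left=4, top=3, size=2)
vector = find_motion_vector(prev_frame, curr_frame, bx=4, by=3, size=2, search=3)
```

Here the vector comes out as an offset of two pixels left and one pixel up, which is where the block's contents were in the previous frame. The encoder still knows nothing about what the square is, which is why I call this a fairly crude kind of tracing.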

Don't forget that technology like Microsoft's Kinect makes a virtual model of the world when it "sees" what's in the room. There's no reason why the resolution of that process shouldn't be increased, and why the models shouldn't be encoded along with the light information to create a complete moving vector representation of the scene.

This would not be trivial. In fact it would be very hard. But I believe it is possible and that we are moving in that direction.  

Every 3D game is a modern vector game. The models in the scenes are all vector representations, they're just a more efficient way of vector modelling a scene than trying to use vectors to represent the displayed image instead of using them to represent the physical structure the image is taken of. When the game is being played, the vector models are put through a rendering engine to produce a raster image for display on screen.

There is some cheating though, since the textures applied to the models are generally raster, but that's still much more efficient than recording 8k video at high FPS, provided realtime 3D graphics technology keeps improving at the rate it is.

Yes, exactly, but as I said above, this is hard to do with real stuff.

Not only would you need all new sensors in cameras, but it still wouldn't solve the resolution problem. Resolution gives us higher detail. If you shoot in vector, your hardware is still drawing a line for you as to how much detail to capture. As hardware capabilities increase, you will still upgrade the "resolution" of the sensor.

Well, if all you're doing is autotracing raster images, you don't need new sensors. But, as I said above, it would be worth looking into the possibility of a "vector sensor".

The resolution issue is a separate one, but important. No, vector video would not increase real resolution. It might give the impression of increased detail because edges would be sharp (or smooth) without any hint of aliasing, but there would be no additional information. But it is possible to imagine that at times you might get extra resolution with vectors. Let's say the "vectoriser" has correctly found the outline of an airliner. This could be a very close match indeed. In fact, if the guess about the shape of the aircraft is correct, it could be said to be more correct than a pixel-based version, because there would be no aliasing or pixellation whatsoever.
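The airliner example can be tried in miniature with a much simpler shape. The Python sketch below assumes the "vectoriser" has already found a circle and stored it as a centre and a radius. The names rasterise_circle and covered_area, and the unit-square coordinates, are my own choices for illustration. The same three numbers are then rendered onto a coarse grid and a fine one.

```python
import math


def rasterise_circle(cx, cy, r, n):
    """Render a circle given in unit-square coordinates onto an n x n grid of 0/1 pixels."""
    grid = []
    for row in range(n):
        y = (row + 0.5) / n  # sample at the pixel centre
        line = []
        for col in range(n):
            x = (col + 0.5) / n
            inside = (x - cx) ** 2 + (y - cy) ** 2 <= r ** 2
            line.append(1 if inside else 0)
        grid.append(line)
    return grid


def covered_area(grid):
    """Fraction of the picture covered by the shape, as the pixels see it."""
    n = len(grid)
    return sum(map(sum, grid)) / (n * n)


# One description of the shape, two output resolutions.
shape = (0.5, 0.5, 0.3)  # centre x, centre y, radius
true_area = math.pi * 0.3 ** 2
coarse = rasterise_circle(*shape, n=8)
fine = rasterise_circle(*shape, n=256)
coarse_error = abs(covered_area(coarse) - true_area)
fine_error = abs(covered_area(fine) - true_area)
```

The description of the circle never changes. Only the final render is coarse or fine, so the blockiness belongs to the display step rather than to the recording. It contains no more information than the vectoriser's guess, though, and if that guess is wrong no amount of output resolution will rescue it.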

Where it gets tricky is with small, effectively random details. This is hard to deal with and I imagine the problem would mushroom exponentially. I can think of at least two possible answers to this. The first is easy.

Where there is too much detail to encode effectively with vectors, do what the video games do: substitute a bitmap. With games, these bitmaps normally come from a library of textures, and their success rate at looking natural varies. But they can look very good indeed.

With video, you wouldn't use pre-stored textures, you'd switch to live video, but only for those areas that need it.

Most people would think that texture-based in-filling would be unacceptable. It may be; but it may not be, either, and the reason is that this is exactly what our brains do, either when recalling something from memory, or when watching images that are not clear enough.

Think about when you were at school, on a sports field, if you were lucky enough to have one. You can probably recall it pretty vividly. The chances are that you can remember it when it was sunny, and perhaps on a cold winter day when it was muddy. Either way, in your memories, I bet you can "zoom in" to the level of the individual blades of grass.

Now, what is absolutely not happening here is that you have a highly detailed recording of that sports field. No, what you're actually doing is invoking incredibly vague memories, and filling in the details with textures and objects from your memory.

The big question is: do you see pixels in your memories?

Of course you don't.

Better than you might expect. I think there's plenty of room for improvement beyond this.

It performs worst on fine texturing. A frequency-based encoder handles texturing much better because if the texture is regular it is easy to encode and if it is irregular it's hard to notice a poor encoding.

Exactly right. At some point, there's no point in encoding detail that's random and irrelevant. You might as well replace it with something that looks like it, but which you already have the code for (forensic video analysts might object to this!).

Vector video is such a bad idea that my head hurts just thinking about it.

Just because your head hurts doesn't make it a bad idea. It's actually a very good idea with manifest advantages over pixels, but it will be very hard to implement.

Not only that. Claude E Shannon proved mathematically in the 1950s why this is completely unnecessary: the Nyquist–Shannon sampling theorem.

This is complex stuff but all of our current digital media is based on it. What it says is that if your sampling frequency is high enough, you won't notice the individual samples, and your digital media will be as good as the original (assuming a "perfect" analogue to digital converter).

There's more to it than that. Your sampling frequency needs to be at least twice that of the highest frequency you want to reproduce. What's more, you have to filter out all frequencies that are more than half of the sampling frequency because, if you don't, you'll get aliasing (think wagon wheels going backwards).
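The wagon wheel effect is easy to reproduce with a few lines of Python. In this sketch (the function names and the 10 Hz sample rate are just illustrative assumptions) a 9 Hz sine wave is sampled far too slowly, and the samples turn out to be indistinguishable from those of a 1 Hz wave running the other way.

```python
import math


def sample_sine(freq_hz, sample_rate_hz, count):
    """Take count samples of a sine wave of the given frequency."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz) for n in range(count)]


def alias_frequency(freq_hz, sample_rate_hz):
    """Apparent frequency after sampling: fold the input into 0..sample_rate/2."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


# 10 samples per second can only represent frequencies up to 5 Hz.
rate = 10
nine_hz = sample_sine(9, rate, 20)
one_hz = sample_sine(1, rate, 20)

# Each 9 Hz sample is the matching 1 Hz sample with its sign flipped.
worst_mismatch = max(abs(a + b) for a, b in zip(nine_hz, one_hz))
apparent = alias_frequency(9, rate)
```

Once the samples are taken there is no way to tell the two signals apart, which is why the filtering has to happen before sampling rather than after it.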

But - importantly here - this only applies for a given resolution. Just to re-iterate, except for certain limited cases, mentioned above, vector video doesn't necessarily give you higher resolution. It was put nicely by one Reddit commenter who said something along the lines of "What do you get when you zoom in?"

But, whatever you do see when you zoom in, you don't see pixels.

Vector video seems like a brilliant idea. Tag the edges, and you have outlines for shaders and lighting effects. It's easy enough to map regions between splines that move over time, which can enable even more cool effects. I'm guessing that there are all kinds of shortcuts available for describing spline paths efficiently, and new kinds of NURB-ish things to be found.

Yes, I think this is right.

Recording equipment would still only be able to record a certain level of detail. Therefore vector-based video would still suffer from upscaling to some extent. Further it is unlikely that vector-based video offers much in the way of benefits for displaying complex pictures, especially since, as the author admits, we'd be sticking with pixel displays.

True. But it would look better than pixellation.

I got to page three and decided it was a waste of time. I'm glad I didn't finish it.

You missed the best bit.

There actually already is an 8k

Yes, I know. I have probably stood in front of more 8K video screens than most people reading this. I wasn't denying the existence of 8K. In the title of the article, I'm saying that I don't think 8K will become a broadcast standard. But I could be wrong.

The author does not seem to understand the present technology properly.

That's a little too vague for me to be able to respond.

I agree. The article is complete garbage.

OK. I'm guessing I haven't convinced everyone.

So basically the future is flash animations, but with tons and tons more detail. Interesting concept. I can see it happening for CG video, because that's what CG video is before it's rendered. He may be right, there's probably a point where storing/reading all those bits is harder than simply re-generating them from the CG instructions.

For non-cg, the problem boils down to vectorising (ex: all the frames. And, mind you, having it not look like an animated gif. Maybe it will be possible but it's not an easy problem.

Your last sentence is exactly right.

Nice little fluff piece that has little to no real world validity.

Time, and a lot of research, will tell.

Of course this does not describe a camera or monitor "without pixels". And it confuses basic terms (of course CRTs have pixels. Analog colour instead of digital doesn't change that).

CRTs don't have pixels. A pixel is a digital entity. Even though CRTs have shadow masks, they are still capable of displaying multiple resolutions without one being better than another. With CRTs, there is no way to have a one to one relationship between a pixel in memory and an exact spot on the screen.

We'll be coming back to this subject with another full article, soon. Meanwhile, let us know what you think in the comments.

