Wednesday, October 30, 2013

Youtube Sucks for Retrogaming Videos

When it comes to videos showing footage of real retro PC and console games, Youtube and the other major video sharing sites suck.  Here is why :

1.  Standard Definition Frame Rate and Aspect Ratio

Normal TVs output a standard resolution : a 525/480 (output/visible) line resolution interlaced at 29.97Hz for NTSC and a 625/576 line resolution at 25Hz for PAL.  The video is interlaced, with odd lines of a frame being displayed first, then the even lines.  Although the TV may be drawing lines 50-60 times per second, the actual number of frames in video is 25-30 per second.  The horizontal resolution with analog broadcast or recording equipment (tape) could be anywhere from 300-600 color dots per line.  Eventually, by the time of DVD, NTSC and PAL video would be able to display 720/704 horizontal dots, and the 4:3 aspect ratio would constrain the visible pixels to fit within the frame.

When Youtube and other video sharing sites were first being implemented, this was the maximum resolution they supported.  In the beginning, with many amateur filmmakers using whatever kind of video equipment they could find, this was not really a big deal.  Youtube also supported 320x240 and 480x360 resolution modes in addition to a 640x480 resolution mode.

Eventually Youtube added support for 720 and 1080 line modes and 16:9 widescreen modes.  In fact, all videos are shown in the 16:9 video window, and 4:3 content is pillarboxed.  All videos can be seen in any of the following formats :

192x144*
256x144
320x240*
426x240
480x360*
640x360
640x480*
854x480
960x720*
1280x720
1440x1080*
1920x1080

* - Pillarboxed resolutions
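
As a rough illustration of what pillarboxing costs, the sketch below works out how wide each black bar is when a 4:3 picture is centred in the 16:9 window of the same height.  The pairs are a few taken from the list above; the function name is made up for this example and is not anything Youtube exposes.

```python
# A few (4:3 size, 16:9 size) pairs taken from the resolution list above.
PAIRS = [
    ((480, 360), (640, 360)),
    ((640, 480), (854, 480)),
    ((960, 720), (1280, 720)),
    ((1440, 1080), (1920, 1080)),
]


def pillarbox_bar_width(content_width, window_width):
    """Width in pixels of each black bar when the content is centred."""
    return (window_width - content_width) // 2


for (cw, ch), (ww, wh) in PAIRS:
    bar = pillarbox_bar_width(cw, ww)
    print(f"{cw}x{ch} inside {ww}x{wh}: {bar} pixel bar on each side")
```

In every case a quarter of the window width is spent on the bars.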

While Youtube describes its resolutions as 720p and 1080p, the maximum frame rate is 30 non-interlaced frames per second.

2.  Home Computer and Console Frame Rate

Almost every computer and classic console prior to the Sega Genesis and Super Nintendo output a non-interlaced video signal at either 60 frames per second for NTSC systems or 50 frames per second for PAL systems.  Thus the Atari 2600 usually supported a 160x192 resolution, the Apple II, 280x192, the Atari 8-bit, 320x192, the Commodore 64, 320x200, the IBM PC with CGA, 640x200, the Colecovision and Sega Master System, 256x192 and the NES, 256x240.  The Sega Genesis and Super Nintendo did support high resolution interlaced graphics, but games stuck with the low resolution modes 99.9% of the time.  Some home computers like the Amiga and add-on cards like the IBM 8514/A did support an interlaced signal, but this was rarely used because flicker is very noticeable at the short distance between the user's eyes and the screen.  These early consoles and computers traded graphic resolution for frame rate, primarily to reduce distracting flicker.  Handheld consoles like the Gameboy, Sega Game Gear, Nomad and Gameboy Advance all support 60 frames per second.

The IBM PC and compatibles did not vary refresh rate by country but instead by display adapter.  The monochrome and Hercules cards used 50Hz, the CGA, PCjr., Tandy 1000 and EGA adapters 60Hz and the VGA, SVGA and later adapters 60Hz, 70Hz, 72Hz, 75Hz, 85Hz.  When LCDs overtook CRTs in computer display technology, 60Hz to 75Hz was generally deemed sufficient for fluid gameplay, as flicker was no longer an issue.  3D screens in 3D mode use 120Hz because twice as many images are being displayed.

For gaming consoles, only with the Playstation, Nintendo 64 and Sega Saturn did interlaced modes show some significant use, but they did not truly become popular until the Playstation 2.  Most Dreamcast, Gamecube and Xbox games could be played in 480p progressive scan (60 frames per second) through the use of special VGA or component video cables.  A much smaller proportion of games for the Playstation 2 support progressive scan.  The Nintendo Wii supports 480p over component video cables and the Playstation 3 and Xbox 360 generally use 720p for HD games.

3.  The Problems

If you post game footage to Youtube, whether taken with a video capture device or an emulator, Youtube is going to convert your video.  Since it will not display native 60 frames per second video, it will convert it to 30 frames per second.  This can be done by dropping every other frame, or by blending two or more adjacent frames into one.  Either way only half as many distinct images survive, and each one stays on screen twice as long to keep the running time the same.  This plays havoc with the motion within the game footage.  It looks jumpy, as though it were being played on a poor emulator.
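
The two ways of halving a frame rate can be sketched in a few lines.  This is only a toy model, not Youtube's actual conversion pipeline (which is not documented here): each "frame" is a made-up one-row image holding a single bright pixel that moves one step per frame, and the function names are invented for the example.

```python
def drop_every_other(frames):
    """Halve the frame rate by keeping frames 0, 2, 4, ... only."""
    return frames[::2]


def blend_pairs(frames):
    """Halve the frame rate by averaging each pair of adjacent frames."""
    blended = []
    for i in range(0, len(frames) - 1, 2):
        first, second = frames[i], frames[i + 1]
        blended.append([(a + b) / 2 for a, b in zip(first, second)])
    return blended


# Hypothetical 60 fps footage: one bright pixel moving right each frame.
frames_60 = [
    [255, 0, 0, 0],
    [0, 255, 0, 0],
    [0, 0, 255, 0],
    [0, 0, 0, 255],
]

print(drop_every_other(frames_60))  # the pixel now jumps two steps at a time
print(blend_pairs(frames_60))       # the pixel is smeared across two positions
```

Dropping frames makes motion coarser, while blending turns one sharp pixel into two half-bright ones.  The same model suggests why deliberate sprite flicker suffers so badly: a sprite drawn only on alternate frames would be either always present or always missing after every other frame is dropped.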

The other issue is the scaling algorithms used.  Say I take some emulator footage of a NES game.  The emulator has no filters, no scalers, just pure 256x240 graphics at 60 frames per second.  The pixels are crisp, sharp, clear.  Each pixel is the same size and each color is clearly distinct.  This is the purest form of capture you can get, provided of course you use an accurate emulator like Nintendulator or Nestopia.  The resulting video may be losslessly compressed, but Youtube does not display losslessly compressed video, no matter how small it may be.  DOSBox 0.74 and Yhkwong's SVN build have a lossless video codec which they use to record, and since DOSBox is highly accurate, it is far easier to capture video from DOS games using it, especially pre-VGA games, than from real hardware.

I posted a test video here : [link removed]  This video was originally recorded using Nestopia - Undead Edition 1.45, a very accurate emulator.  I decided to use the 2C03 RGB PPU palette (without the extra grays Nestopia inserts) to simulate as accurately as possible the ideal capture from the real machine using the highest quality output that would be readily available.  The 2C02 PPU, found in every consumer-based NES or Famicom except the Famicom Titler, can only output composite video.

While most of my run through the stage is not too bad, things really start to look wrong when Mega Man fights the boss.  There is a fair amount of flicker on real hardware or the emulator, as this part of the game really goes beyond the sprite limits of the NES (64 sprites on screen, 8 per scanline).  However, the real flicker is nowhere near as bad as the video makes it out to be, and the result just ends up distracting.

In addition, my video was pixel perfect, no interpolation shown when viewing it at its native resolution.  Youtube does not offer an "exact size" setting for the video window.  To be fair, 256x240 is really too small to watch unless you have a low resolution monitor.  However, there is no x2 or x3 scaling available.  The smallest window size is 384x360, the larger window size is 512x480 (an ideal 2x) and the full-screen mode is whatever the native resolution of your monitor is with pillarboxing.  All scaling uses typical bilinear or bicubic sampling, so what were sharp pixels appear fuzzy.  Nearest neighbor interpolation is easier computationally, but not available.
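
To show the difference between the two kinds of scaling, here is a minimal sketch in plain Python.  It is not the code any browser or video player actually runs; the function names are invented, images are just lists of brightness values, and the linear version handles a single row to keep things short.

```python
def scale_nearest(image, factor):
    """Integer upscale: every source pixel becomes a factor x factor block."""
    out = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def scale_row_linear(row, factor):
    """Upscale one row with linear interpolation, for comparison."""
    out = []
    last = len(row) - 1
    for x in range(len(row) * factor):
        # Map the centre of the output pixel back into source coordinates.
        src = (x + 0.5) / factor - 0.5
        src = min(max(src, 0.0), float(last))
        left = int(src)
        right = min(left + 1, last)
        t = src - left
        out.append(row[left] * (1 - t) + row[right] * t)
    return out


edge = [[0, 255]]  # a hard edge between a black and a white pixel
print(scale_nearest(edge, 2))       # the edge stays hard
print(scale_row_linear(edge[0], 2))  # in-between grays appear at the edge
```

The nearest neighbor version only copies values, which is why it is the cheaper of the two, while the linear version invents intermediate shades that were never in the source.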

If you download an MP4 of the video, and there are many utilities you can use to do this, you can see the video in its native size without interpolation.  It will appear sharp.  However, Youtube's conversion process has done its damage to the frame rate, which as you will now see is 30 fps.  Here you can compare for yourself :

[links removed]

Unfortunately, if you wanted to see my playthrough video as it was meant to be seen, you cannot view it in a browser; you must go to the trouble of downloading it.  And this is with a perfect video capture.  Captures from real hardware frequently look like crap between the capturing device (most are not designed for vintage consoles and computers or for pixel-perfect accuracy, if that can even be done for analog video) and Youtube's conversion.  I used to host videos of my classic computers on Youtube, but since I only had a cell phone camera (which will take video only at 30fps), the video always looked bad and flickery, something Youtube can do nothing to improve.  To be fair, it was not Youtube's fault, so I took them down.

Nestopia and DOSBox use the ZMBV capture codec to capture video with lossless compression and the accompanying audio.  It is an excellent way to capture 8-bit video, and for NES video, the resulting file size is an average of 7MB per minute.  If you did a two-hour playthrough of a game, the resulting file size would only be 840MB at the native resolution and frame rate.  This is a very reasonable file size, and while Youtube does recognize ZMBV encoded AVI files, it will compress them and the first casualty will be the frame rate.
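
The file size arithmetic is easy to check.  The sketch below also compares the figure with raw, uncompressed footage, assuming one byte per pixel (a palettized 8-bit frame) and decimal megabytes; that storage assumption is mine, and audio is ignored.

```python
MB_PER_MINUTE = 7  # average ZMBV figure quoted above for NES footage


def capture_size_mb(minutes, mb_per_minute=MB_PER_MINUTE):
    """Estimated size in MB of a lossless capture of the given length."""
    return minutes * mb_per_minute


def raw_bytes_per_minute(width, height, fps, bytes_per_pixel=1):
    """Uncompressed video size for one minute (assumes 1 byte per pixel)."""
    return width * height * bytes_per_pixel * fps * 60


two_hours = capture_size_mb(120)
raw_mb = raw_bytes_per_minute(256, 240, 60) / 1_000_000

print(two_hours)                          # 840
print(round(raw_mb))                      # about 221 MB per minute uncompressed
print(round(raw_mb / MB_PER_MINUTE, 1))   # rough compression ratio
```

Under those assumptions the lossless capture comes out at roughly a thirtieth of the raw size.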

5 comments:

  1. This comment has been removed by the author.

  2. I may have to revise this post very soon, because Youtube has just announced support for 48fps and 60fps video. Here are some sample videos :
    https://www.youtube.com/playlist?list=PLbsGxdAPhjv9UrLo19pS8teoRKj7funAy

    Note that you will need to set it to an HD resolution (720p or better) and watch it on the Youtube site, not in an embedded player.

  3. One could also host carefully encoded videos somewhere and offer direct links to them, which can then be progressively downloaded in a player that supports streaming, like SMPlayer and probably VLC (not using it myself). Google has somehow been given the right to lord over the internet and all video uploads must go through them...

    Two other issues with YouTube are chroma subsampling, which kills nice sharp color sparkles, and, at least until recently, forced deinterlacing of everything. I uploaded a video with ordered dithering. YouTube had applied a vertical filter, which removed all the dithering in the vertical direction but left most of the horizontal alone. The only reason this could have been done is to make interlaced footage presentable, because YouTube can't trust the author to do that.

    http://dl.dropboxusercontent.com/u/61700377/screens/youtube-shitty-quality-comparison.png

    The video on the right also has subsampling and an "acceptable" bitrate, but looks much better. YouTube has destroyed more color, especially reds, than would have been necessary.


  4. Authenticity

    You could always use a real machine or output your emulator's true-resolution signal to an actual CRT television (with the aid of things like VGA2SCART and Soft15kHz, for example), and then film the television screen.

    There is no perfect solution, but this way, you would also be able to present the magic of 'television beauty'. The way beautifully pixelled lores graphics look on a (bright) CRT television just can't be matched, and should be shown to the world every time a worthwhile game or other "retro" video is shown.

    Kids nowadays don't know what they are missing by using only the 'progressed' technology, the latest, new TFT monitors that completely vomit upon the otherwise beautiful lores graphics.

    Now, this filming solution is usually frowned upon for understandable reasons, but for beautifully pixelled lores graphics, I think it's the best option of all.

    First of all, it retains the proper aspect ratio (SNes and NES, for instance, have a weird resolution that looks awful when just captured from an emulator, pixel by pixel - because the shape of the pixels used to display them is different from what it is supposed to be - everything looks squashed, compared to how it's designed to look).

    So all SNes and NES games look good, because they are fullscreen without being stretched/scaled in any way - just glorious, perfect, authentic resolution filling the television screen.

    Second, it shows how great the TV display really makes the lores graphics look (though watched on a TFT monitor, it's not quite the same). Too bad you can't really show how much brighter the good CRT televisions are than the TFT monitors when it comes to these lores graphics.

    Third, it brings the 'geek in the basement' feel to the video, showing an actual television in a room - it captures the 'atmosphere' of the room, and what's happening in it at the very moment.

    If done right, this is the optimal solution - youtube will destroy everything anyway, but if you film in a high resolution, and your camera probably doesn't support 60 fps in such a high resolution anyway, this is the closest that you can get to having a 'browser-streamed lores retro video' look authentic, or at least what I would call "passable".


  5. Oh, and it would eliminate the 'scaling/stretching' problem altogether, as whatever youtube does to it, it's now a HD-video of what happens on a TV screen, and not just 'actual lores pixels at the mercy of youtube re-encoding process'.

    So no matter how much youtube maims it, the main beauty of the video should remain relatively intact.
