Tuesday, August 25, 2015

The GameWorks Paper Tiger

AMD's GPU market share continues to decrease but recent benchmarks suggest that its cards benefit greatly from Windows 10's DirectX 12.

DirectX 12 is meant to provide low-level access to hardware, essentially bypassing the driver optimization process that has been one of NVIDIA's key strengths over the past few years. AMD's hardware, going by metrics such as shader and transistor counts, has typically been more powerful than NVIDIA's. At launch, however, AMD's drivers are less optimized, which means Radeon cards often lose out to less powerful GeForce cards in the crucial first round of reviews. Over time, the Radeon cards eventually reach performance commensurate with their hardware capabilities.

Case in point: the R9 290 series, which traded blows with the 780 at launch, is, based on early DX12 game benchmarks, competing well with the 980 Ti, the fastest single-GPU card available*. NVIDIA, however, need only adjust the pricing of its SKUs to reflect a smaller premium at similar performance levels. After all, most games available are still DX9/10/11, where the best performance - though not necessarily the best performance for the price - still comes from NVIDIA.

The following was a post from my old site from 2014. 

Yesterday I was chatting with a fellow geek in Survivor City 1 and he mentioned that PC gamers using AMD cards ought to be worried on account of an NVIDIA development called GameWorks.

From what I understand, NVIDIA’s GameWorks is basically a contract that lets game developers get more performance from NVIDIA cards through special software and collaboration with NVIDIA programmers. The alleged downside is that any optimizations developed in the program may not be shared with NVIDIA’s rival, AMD.
There are many ways to interpret this. Forbes blogger Jason Evangelho decided to write about NVIDIA’s program in his clickbait article entitled “Why ‘Watch Dogs’ Is Bad News For AMD Users — And Potentially The Entire PC Gaming Ecosystem” 2
He mainly cites Robert Hallock from AMD marketing 3 and an article by Joel Hruska from the website ExtremeTech. 4 The gist is that NVIDIA’s position as market leader allows it to force developers to optimize for NVIDIA cards while legally precluding those optimizations from AMD cards.
The result? AMD cards end up looking unnecessarily worse in game reviews which means AMD’s market share and profitability deteriorate – perhaps to the point of bankruptcy. The PC gaming landscape will be ruled by NVIDIA and consumers will be living under a regime of worse price/performance products.
It’s a compelling narrative at first glance. In fact, it’s so compelling that just about every company that’s been in a position of market leadership has tried something like it. The long-term result, however, usually backfires because executives forget the key to business success, i.e., delivering the most value for the customer.

Big Blue: A short case study

IBM introduced their personal computer in 1981 and by 1983 it beat out Apple, Commodore, and a whole host of other companies to become the market leader. 5 The IBM PC not only popularized the term “PC” but was responsible for the success of companies like Intel and Microsoft. As the company that literally created the PC ecosystem, you’d think IBM would still be calling the shots today. IBM left the PC market in 2004, but its personal computing division had been dead for much longer.
Some commentators think that IBM lost because the IBM PC used non-proprietary hardware from Intel and non-proprietary software from Microsoft. 6
I’d argue that it was precisely IBM’s use of non-proprietary hardware and software that gave its PC, and eventually PC-clone makers like Dell and HP, widespread market share. Developing your own OS and processor is extraordinarily expensive. The IBM PC was slower than its chief competitor, the Apple II, but it was cheaper. In today’s dollars, it was $800 cheaper. It was also open which meant that anyone could develop hardware and software for it without licensing or patent worries. After a short while, companies like Compaq figured out ways to provide IBM PC-compatible computers for even less. IBM’s market share plummeted. IBM’s response? Go proprietary.
In 1987, IBM introduced their PS/2 which would, if all went according to plan, use IBM’s OS/2 and AIX operating systems along with their proprietary Microchannel (MCA) bus. The mass of companies that crystallized to develop hardware for the IBM PC’s open ISA bus would now be forced to pay license fees for MCA. In IBM’s mind, if people didn’t buy IBM PCs, at least IBM would get a cut through MCA.
In response, IBM’s competitors created their own license-free alternatives. Those buses, EISA and VESA Local Bus, not only enjoyed far greater success; their modern-day descendant, PCI, enjoys unquestioned supremacy today. EISA and VESA Local Bus weren’t necessarily better than MCA, but they were good enough and, most importantly, cheaper.
It’s the same dynamic that led to VHS’s triumph over Sony’s proprietary Beta format and USB’s dominance over Apple’s proprietary Firewire. 7
NVIDIA seems determined to repeat IBM’s misstep here, though not in GameWorks, but in G-Sync. G-Sync is a serious advance in graphics technology that essentially offers lag-free V-Sync. It’s proprietary and will be available through licensing fees. AMD’s response, FreeSync, requires no licensing fees and will be part of upcoming monitor display standards. Is there any question which implementation will prevail? 8

GameWorks: Glide Strikes Back

As for GameWorks 9, the tech industry provides a particularly trenchant cautionary tale in the form of Glide.
3dfx, the company that developed Glide, almost single-handedly created the whole market for gaming cards with the introduction of the Voodoo card. At the time, it was a revelation. 3D games were fairly new and graphics card companies tended to focus on 2D performance with 3D as an afterthought. 3dfx focused almost entirely on 3D performance and the Voodoo humiliated its competitors. 10 Part of the reason the Voodoo was so much faster was Glide: proprietary software routines designed for 3dfx’s graphics cards. If you were a game developer, you could extract the most performance from the Voodoo by using Glide. Sound familiar?
Within a decade, 3dfx and Glide were extinct.
Glide and GameWorks offer(ed) real advantages. Games could work with the video card directly and harness its full capability rather than working through an interpreter. It’s the same reason that consoles are able to outperform PCs with equivalent hardware. So why did this approach live on in consoles yet end up interred with 3dfx’s bones?
The refrain should be familiar now. What happened was 3dfx’s competitors rallied around Microsoft’s DirectX approach. DirectX is a set of standards that allow games to work closely with hardware. These standards are created through collaboration with video card companies and game developers. With DirectX, game developers have an idea what sort of graphics technologies most PC gamers are able to use and video card manufacturers know which technologies to optimize their hardware for and perhaps which technologies to advocate for inclusion in the next DirectX version. As DirectX wasn’t tied to any particular set of graphics hardware, unlike Glide, it would allow for similar performance at lower cost.
However, unlike DirectX, Glide was available immediately. No need to wait for the standard to be finalized and for video card manufacturers to go through the process of designing and producing cards adhering to the standard. Glide pretty much guaranteed 3dfx the gaming crowd’s dollars until DirectX cards started shipping. It was a textbook case of sustained economic advantage.
The arrival of DirectX wasn’t unmitigated good for the consumer though. One downside to the DirectX approach is that video card makers are less able to create competitive advantage through features. 11 For example, there’s no point in creating a powerful tessellation accelerator if DirectX doesn’t support tessellation. Instead, video card companies compete on increased performance in standard DirectX technologies, say Transform and Lighting (T&L) performance12, which directly translates to better gameplay since PC gaming is almost exclusively a Windows affair and Windows means DirectX.
But if DirectX development ends up lagging (or is nonexistent), then companies might see an opportunity to develop new features with corresponding proprietary APIs. That’s what 3dfx did, and it took the introduction of NVIDIA’s Riva TNT in 1998, two years after 3dfx’s Voodoo 13, for DirectX cards to catch up. 14 After that, Glide was finished.
Does AMD really believe GameWorks will enjoy a fraction of the success that Glide did?
Firstly, the DirectX response to 3dfx’s Glide was basically a crash program starting from the ground up. NVIDIA, with a 65% discrete GPU market share, is trying to develop an API that will compete with the very mature DirectX, which still has near-100% market share in PC gaming. In fact, Microsoft has already introduced DirectX 12, which includes the sort of optimizations that GameWorks and AMD’s counterpart Mantle provide. 15
Secondly, even ignoring DirectX 12, when we compare NVIDIA’s relative strength to 3dfx’s, we can see just how unlikely Evangelho and Hruska’s narrative becomes.
  • 3dfx had 80-85% of the market at Voodoo’s peak 16. NVIDIA’s cards represent 65% of the discrete video card market with AMD at 35% 17. 
  • The Voodoo was significantly faster than the ViRGE and Rage card offerings from S3 and ATI: 300% faster in Mechwarrior 2, 400% faster in Shogo, 300% faster in Quake, etc. 18 NVIDIA’s near-top-end card at this time, the $700 GTX 780 Ti, is less than 25% faster in the GameWorks-optimized title Watch Dogs than AMD’s $600 Radeon 290X. 19 
Thirdly, game developers for the Xbox One and PS4, which are essentially AMD computers, won’t use GameWorks at all. The real question is whether they will use AMD’s GameWorks equivalent, Mantle, or Microsoft’s DirectX 12. 20
So why are some in the tech press and AMD marketing in hysterics over GameWorks? I think it’s because AMD has positioned itself as some kind of champion of openness 21 and has adopted the version of open-source religion that believes the marketplace is incapable of defending itself from “monopolies”. To be fair, that’s a widespread belief.
  1. This is a location in the game DayZ: Overwatch that contains all kinds of goodies. No one actually survives there very long 
  2. http://www.forbes.com/sites/jasonevangelho/2014/05/26/why-watch-dogs-is-bad-news-for-amd-users-and-potentially-the-entire-pc-gaming-ecosystem/ 
  3. https://twitter.com/Thracks 
  4. http://www.extremetech.com/extreme/173511-nvidias-gameworks-program-usurps-power-from-developers-end-users-and-amd 
  5. Managing Technological Innovation: Competitive Advantage from Change, 310, http://www.amazon.com/gp/search?index=books&linkCode=qs&keywords=9780471225638 
  6. There’s some truth to that considering IBM enjoys success in the high-end computing business with their POWER microprocessors and AIX operating system. The PC space is radically different. Home users are willing to tolerate slightly worse performance and reliability if prices are low enough and the experience is substantially the same. Few users buy Xeon processors with ECC RAM, let alone ten-thousand-dollar POWER systems with full RAS capability. The only reason businesses buy POWER, or Intel’s Itanium, is because slightly worse reliability can end up costing millions. POWER/Itanium is a bargain in those situations. For home PC users, particularly gamers, it’s not worth it. Personally I think ECC and ZFS provide a great deal of reliability in relation to the marginal cost, which is marginal in both the common and economic sense. 
  7. This isn’t to say that open formats always triumph over closed ones, e.g., MP3 vs OGG, Windows vs Linux 
  8. G-Sync’s cost is not only in licensing fees, e.g. a G-Sync monitor will cost a bit more than an identical monitor without that feature, but in the expensive hardware currently required to implement it. FreeSync uses technology already widely deployed in laptops. Further, the sorts of systems that will benefit most from G-Sync-type technology, i.e., low-powered machines that cannot deliver high frame rates, are Intel and AMD based. I don’t want to be harsh on NVIDIA because without G-Sync, lagless V-Sync would either never have come to market or might have been introduced much later 
  9. Not related to the overpriced arcade chain 
  10. In particular, the S3 ViRGE, which sold because computer makers could put a tick next to “3D accelerator card” for very little cost 
  11. Which approach is better? I’d have to agree with the marketplace and go with DirectX 
  12. T&L became part of DirectX 7.0 (or Direct3D as it might have been called back then) 
  13. http://www.maximumpc.com/article/features/voodoo_geforce_awesome_history_3d_graphics 
  14. A review of the TNT from Tom’s Hardware back in 1998! http://www.tomshardware.com/reviews/nvidia,87.html 
  15. http://www.pcworld.com/article/2109596/directx-12-vs-mantle-comparing-pc-gamings-software-supercharged-future.html 
  16. http://www.techspot.com/article/653-history-of-the-gpu-part-2/ 
  17. As of Q1 2014: http://jonpeddie.com/publications/add-in-board-report/ 
  18. A wonderful walk down memory lane at http://vintage3d.org/ 
  19. http://www.forbes.com/sites/jasonevangelho/2014/05/26/watch-dogs-pc-benchmark-results-multiple-amd-and-nvidia-cards-tested/ 
  20. It’ll be DirectX 
  21. Laughable when you look at how much better NVIDIA was at providing information to Linux developers for its cards 
* given that it overclocks better than the Titan X on account of NVIDIA's decision to allow only reference designs for Titans.

Wednesday, August 19, 2015

Brain in a Vat

The above is an image of an alleged brain created through induced pluripotent stem cells, i.e., the kind that doesn't involve destroying human embryos. My BS meter is kind of high for this story, but I can't help but think of the ramifications.

Ever since I heard of this method of creating stem cells - or at least of the success of a particular Japanese researcher around five years ago - I was mostly happy that scientists could pursue advances in this promising field without the ethical issues involved in embryonic stem cell research. It might be slower and carry opportunity costs, but medical experimentation has had a problematic enough history that I think it best to err on the side of least destruction.

However, recreating the nervous system is a bit unnerving. (Pun intended, but hopefully not as predictable as the partisan comments flooding the news story)

The brain is a special organ, primus inter pares. You can replace any other organ and still retain your basic identity. We can say that John got a kidney or heart transplant from Jim and we understand that John is still John. But, if it were even possible, we would not say that John got a brain transplant; rather, we would describe it as Jim living in John's body.

There is a special relationship between personhood, identity, and the brain. People with brain damage are often described as a completely different person whereas there is no question of identity in the case of, say, heart or lung damage.

So this raises a number of questions. Is this primitive nervous system conscious? If it is, then is it a person? From the press release, this pencil eraser sized brain is roughly equivalent to a fetal brain at 5 weeks and cannot develop further without a suitable cardiovascular support system.

If consciousness is an emergent property, then at what point does it emerge during development? The lead researchers and some commentators suggest that it could not be conscious for the following reasons:

1) It has not received any sensory input
2) There is no electrical activity
3) It does not consume glucose (??)

1) Perhaps, though the existence of an optic stalk suggests that background noise could provide some kind of sensory input. There would be nothing to make sense of, so to speak. Perhaps existence at this point is like those dreamless portions of sleep. A kind of living death. Some comments suggested the fact that people in sensory deprivation tanks do have conscious experiences despite the lack of sensory input but fail to mention that those people have already had plenty of sensory input. There is plenty for the mind to think about.

Nihil est in intellectu quod non prius in sensu. Nothing is in the mind that is not first in the senses. But is sensory input a necessary and/or sufficient condition for consciousness?

2) Similarly, is electrical activity a necessary condition for consciousness in nervous systems? fMRI scans won't show anything, but if there is neurochemical activity going on, then there is some electrical activity as well. Indeed, in any living system there is going to be significant electrical activity, but whether it is meaningful enough to constitute consciousness is tricky.

Some commentators are quite sure nothing is going on, but given that this system grew from stem cells, there had to be quite a bit of growth and an enormous amount of biological activity, of a type not comparable to the physical activity occurring in, say, bricks. That doesn't stop commenters though.

How certain can you be that a brick is not conscious? Everything we call "conscious" has means of being conscious of its environment - through input and processing of sensory data. If there's no input or processing of data - which appears to be the case for a brick or lump of tissue - then how could it be conscious? I think it's much more difficult to prove that computers and smartphones aren't conscious because they DO input and process data.
Yet if it is not processing information, then it might be fair to assume there is only potentially a consciousness. If "Zafod" argues that it's very difficult to prove that computers aren't conscious on the mere basis of accepting and processing an input, then any machine, from a pulley to a punch card loom might be considered conscious. It's absurd, but then again, it's the comment section.

One commentator suggested that this system does not consume glucose or produce ATP.

Brains can't think without ATP. The brain is not consuming glucose. Glucose is necessary for production of ATP. It's done. I've refuted the idea that this brain can think. There is nothing you can say to refute what I have said.
Um, cell growth and differentiation - the very processes described - require energy, energy supplied through glucose metabolism and ATP production. Thankfully "loljahlol" was never one of my biology students.

At some point, even if this particular story ends up being a sham, scientists will likely be able to grow a brain and the infamous brain in a vat philosophical issues might just become real ones.

Thursday, August 13, 2015

About Pureview

It's a little sad that the 2012 Nokia 808 is a better cameraphone for still photography than any of the flagships from Apple, Samsung, HTC, Sony, etc. The 808 is a bit of a specialized beast in that it has a relatively large sensor coupled with an extremely high pixel count. In use it's a lot like using a medium format camera with ISO 25 slide film, i.e., relatively low exposure latitude and very fine detail in certain conditions. The sensor is still small and noisy even at its lowest ISO 50 setting but it isn't uselessly noisy. It can resolve over 18MP (38MP nominal) in good conditions which is higher than most dSLRs. The Galaxy S5 resolves about 6.5MP (16MP nominal).

You aren't using the system at its best if you use the PureView modes. This is something I've argued in the past, but I thought I'd post some evidence to back up my claim. Here are tripod-mounted crops at ISO 50 at various settings. First up is the 38MP original, taken with -5 sharpness, ultrafine, 4:3, +0.7 EV. 

38MP original

I added sharpness afterwards because sharpness filters don't actually add real detail. They do, however, introduce noise, which is information that JPG compression has to allocate bits for - bits taken from actual image detail. All images are displayed in lossless PNG.

38MP Unsharp Filter

The PureView image is simply not as sharp, even after applying an unsharp filter. At first I thought it might be focusing error, but after downsampling the 38MP file to a quarter of the area (9.5MP), we see that the detail levels are comparable. I examined other parts of the frame and the focal plane is essentially the same in both the PureView and 38MP images (as close as can be given the lack of manual focusing controls).

8MP Pureview

38MP downsampled to 9.5MP

Applying unsharp helps, but the file is much less forgiving than the 38MP file: increasing sharpness brings out ugly JPG macroblocks as well as aliasing artifacts. Although it is clearly inferior to the 38MP version, the decrease in quality is mild compared to the huge reduction in file size (14MB vs 2.5MB).

8MP Pureview Unsharp

Automatic mode produced a file of around 1MB, alongside a default exposure 0.7EV lower than the previous images. Detail clearly suffers.

Is there anything Pureview offers in the signal chain that makes it superior to simple resizing? Not to my eyes. Anyone is welcome to apply their own denoising and sharpening methods, but given the 38MP file's greater tolerance to editing, it is likely the quality delta will only increase.
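For anyone who wants to try the "simple resizing" alternative themselves, here is a minimal sketch using Pillow. The filenames are placeholders, and the filter settings (Lanczos resampling, a mild unsharp mask) are my own choices rather than what the 808 does internally.

```python
from PIL import Image, ImageFilter

def downsample_quarter_area(img: Image.Image) -> Image.Image:
    """Downsample to a quarter of the pixel area (half each dimension),
    then apply a mild unsharp mask to restore edge contrast."""
    small = img.resize((img.width // 2, img.height // 2), Image.LANCZOS)
    return small.filter(ImageFilter.UnsharpMask(radius=2, percent=100, threshold=2))

# Usage (filenames are placeholders); save as PNG so JPG artifacts
# aren't compounded by a second lossy pass:
# downsample_quarter_area(Image.open("38mp_original.jpg")).save("9mp.png", "PNG")
```

Lanczos is chosen here because it preserves fine detail better than bilinear or bicubic resampling when shrinking an image this much.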

Given the fact that a 64GB microSD card is only $25 and can hold well over 4000 images at the full quality setting, there's little reason to use the PureView modes for still photography. 

And just to drive the point home: Pureview vs the original - after applying sharpening to both.

Tuesday, August 11, 2015

When 1% Is Too Much

Came across an article today about how continuous small improvements can yield large gains over time. There's much to recommend it, but the former math teacher/business student in me winces at the suggested rate of 1% per day.


Well what does this work out to?
  • A rather unfit person starting with a 20 minute mile would end up breaking the world record in under six months.
  • The typical American earning $137 per day (based on $50,000 annually) would have over $27 billion at the end of four years and be earning $10 billion per day at the end of the fifth.
  • Someone currently benching 100 pounds would be lifting 3,741 pounds after a year. Not quite the fabled fifty times his own weight that the ant manages, but impossible nonetheless.
  • The typical chess beginner (elo 800) would be an International Grandmaster after four months, and the highest rated player in history - human or computer - after five.
  • Even the semiconductor industry, the paragon of exponential growth, never enjoyed a sustained 1% daily improvement. If it had, starting from the 386 in 1989 ... by 1995 each processor would be more powerful than the most powerful supercomputer that exists today.
For any metric, a 1% daily improvement rate implies a roughly 37x gain after a year. This might be sustainable on the steep portion of the learning curve, but it quickly becomes unreasonable.
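The arithmetic behind the examples above is just daily compounding. A quick sketch (the function is mine, not from the article):

```python
def after_days(start: float, days: int, rate: float = 0.01) -> float:
    """Value after compounding a `rate` daily improvement for `days` days."""
    return start * (1 + rate) ** days

# One year of 1%/day is roughly a 37-38x improvement:
print(round(after_days(1.0, 365), 2))   # 37.78

# The bench press example: 100 lb compounds to ~3,778 lb in a year
print(round(after_days(100, 365)))      # 3778

# Daily pay of $137 growing 1%/day for five years: ~1.05e10, i.e. ~$10 billion/day
print(round(after_days(137, 5 * 365)))
```

Reversing the formula also recovers the running example: a 20-minute mile shrinking 1% per day drops below the ~3.7-minute world record in about 167 days, i.e. under six months.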

Thursday, August 6, 2015

How to Stream on a Potato

There are four main factors that determine Twitch stream quality: bitrate, resolution, encoding quality, and frames per second. Each of these has upper limits, and in theory you get the best quality by maximizing each one. In practice, it's a more complicated balancing act. The Twitch recommended settings and the settings suggested by the OBS Estimator are okay but definitely not optimal.

Streamers, Know Your Limits


The first step is to find your system limits. For bitrate, run the upload speed test at Testmy.net. [1] Record that figure. For more accurate results, run the test multiple times and record the lowest figure and/or the average.

Here's my connection. The average is 13.4Mbps up but the important figure is the lowest connection speed which was 7.56Mbps. 

Twitch recommends a maximum upload bitrate of 3500kbps (3.5Mbps) so if your minimum result from Testmy.net is higher than that, congratulations, set your bitrate to 3500kbps and move on to the next section. You can set it even higher but Twitch's ingest servers can only accept a maximum of around 8000kbps and setting it higher than 3500 is allegedly considered abuse. I personally stream higher than 3500 but there's a risk of getting shut down by Twitch.

If your minimum bitrate is below 3.5Mbps, say 2.7Mbps, set your OBS bitrate to slightly under that, e.g. 2.5Mbps (2500kbps).
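The rule above can be condensed into a few lines. This is my own sketch; the 8% headroom factor is an assumption on my part (the post simply says "slightly under"), and only the 3500kbps Twitch ceiling comes from the text.

```python
TWITCH_MAX_KBPS = 3500  # Twitch's recommended ceiling

def pick_bitrate(min_upload_kbps: float, headroom: float = 0.92) -> int:
    """Return a stream bitrate in kbps: the Twitch ceiling, or slightly
    under your worst-case measured upload, whichever is lower."""
    return int(min(TWITCH_MAX_KBPS, min_upload_kbps * headroom))

print(pick_bitrate(7560))  # min upload 7.56 Mbps -> capped at 3500
print(pick_bitrate(2700))  # min upload 2.7 Mbps -> 2484, slightly under the minimum
```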


Resolution

The limit here is the resolution of your monitor. You can usually find this by right-clicking on your desktop and examining the display settings, but I recommend using Speccy by the company Piriform. It's free and provides all your relevant computer specifications in a very easy way. Your resolution can be found in the graphics section. If you game below your monitor resolution for performance reasons, then that is your resolution limit. Most of the time, however, monitor and game resolution are the same.

Here's my Speccy. You can see my resolution is 1920x1080

At this point in time, given that 1080p is the most common resolution, that the video window in Twitch with the chat sidebar visible is only around 720p, and that Twitch only allows a 3500kbps bitrate, the highest streaming resolution I'd recommend - no matter your system - is 720p. For people gaming at 1080p, this means an OBS downscale of 1.5; if you are gaming at some other resolution, adjust the downscale to whichever factor gets closest to 1280x720. [2]
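Picking "the downscale closest to 1280x720" is a one-liner. The list of candidate factors below is an assumption about what OBS offers, not something from the post:

```python
# Assumed set of downscale factors exposed by OBS
OBS_FACTORS = [1.0, 1.25, 1.5, 1.75, 2.0, 2.25, 2.5, 3.0]

def closest_downscale(game_width: int, target_width: int = 1280) -> float:
    """Pick the factor whose downscaled width lands nearest the 720p width."""
    return min(OBS_FACTORS, key=lambda f: abs(game_width / f - target_width))

print(closest_downscale(1920))  # 1080p -> 1.5 (1920 / 1.5 = 1280 exactly)
print(closest_downscale(2560))  # 1440p -> 2.0 (2560 / 2.0 = 1280 exactly)
```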

Encoding quality 

This is the trickiest one because it depends on the game and your CPU.
The first thing to do to get useful measurements is to set your computer to high performance mode. In Windows, press the Start button and search for "power options". Set the plan to "High performance", which might only be visible after clicking "Show additional plans". Remember to set the plan back to Balanced after you find your CPU frequency.
The next step is to launch Speccy and click on the CPU tab at left. Record the model, the number of cores, and the frequency they are running at, shown towards the bottom. [3]
For Intel processors my starting recommendations are:
8 cores+ @ 4GHz or higher, slow
Haswell or later, 6 cores @ 3.7GHz or higher, medium 
Sandy or Ivy 6 cores @ 4.5GHz or higher, medium 
Older 6 cores or 6 cores running at lower frequencies, fast 
Haswell or later 4 cores @ 4.5GHz or higher, faster 
4 cores that do not meet the above, veryfast 
2 cores, Quicksync/VCE/NVENC or veryfast with higher downscale
For AMD processors my starting recommendations are:
8 core FX series, 6 core Phenom, faster 
6 core FX series, 4 core Phenom, veryfast 
4 core FX and under, Quicksync/VCE/NVENC or veryfast with higher downscale
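The tables above can be roughed out as a lookup function. Note this is a simplification of my tables - it keys only on core count and frequency and ignores CPU generation (Haswell vs Sandy/Ivy, FX vs Phenom) - so treat it as a starting point, not a substitute for the lists:

```python
def x264_preset(cores: int, ghz: float) -> str:
    """Rough starting x264 preset from core count and frequency.
    Generation differences are deliberately ignored in this sketch."""
    if cores >= 8 and ghz >= 4.0:
        return "slow"
    if cores >= 6 and ghz >= 3.7:
        return "medium"
    if cores >= 6:
        return "fast"
    if cores >= 4 and ghz >= 4.5:
        return "faster"
    if cores >= 4:
        return "veryfast"
    return "Quicksync/VCE/NVENC or veryfast with higher downscale"

print(x264_preset(8, 4.2))  # slow
print(x264_preset(4, 3.5))  # veryfast
```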

Frames per second 

This is hard-limited by your monitor refresh rate, which can be found next to your monitor's resolution in Speccy. It's not really important since nearly all monitors built within the past decade are 60Hz or greater, and Twitch and Flash player don't work well past 60fps. In-game frame rate can impose additional limits. To check your in-game frame rate, install FRAPS, or for Steam games use the FPS counter found under Settings > In-Game. I prefer using RivaTuner Statistics Server (RTSS), which can be modified to provide additional useful information, but it's more complicated to set up.

Here's a view of FRAPS. While playing the game you would hit F12 to see your in-game fps.

Nearly all monitors in use today are 60Hz which means that setting video fps higher than 60 is pointless. If your in-game fps is higher than 60, set streaming video fps to 60. If it is lower, then set streaming fps accordingly. This also means that you will want to change the video fps on a per-game basis for best quality. 
For instance, I get triple-digit framerates in Killing Floor 2, so I set video fps to 60 in that case. However, in ARK I get around a 45-50fps minimum, so I set my video fps to 50. In ArmA the minimum framerate varies from around the 20s in large cities to over 60 elsewhere. You'll have to decide for yourself, but in general I would err on the side of more frame rate. In that link you can see the difference between 30 and 60fps. Some members of the OBS forums believe there isn't a big difference, especially for streaming, but they are wrong. Even if you can't manage 60, even 40fps is a 33% improvement in smoothness over 30. This is also why some streamers enable motion blur: it makes motion look less jittery. Try that if you are unable to maintain a high in-game fps.
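One way to mechanize the rule above - cap at the monitor/Twitch limit of 60, otherwise track your in-game minimum. The function is my own condensation; in practice I round up a little when the in-game minimum hovers in a range, as with the ARK example:

```python
def stream_fps(ingame_min_fps: float, monitor_hz: int = 60) -> int:
    """Stream fps: capped at 60 (Twitch/Flash limit) and your monitor's
    refresh rate; otherwise follow your in-game minimum framerate."""
    return int(min(60, monitor_hz, ingame_min_fps))

print(stream_fps(144))  # triple-digit fps (Killing Floor 2) -> 60
print(stream_fps(45))   # ARK-style 45fps minimum -> 45
```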

Performance Issues 

Preliminary checklist before adjusting stream settings:

  • Close your own stream in the browser while streaming (watching it costs bandwidth and CPU)
  • Make sure your computer is plugged in and power set to "balanced" or "high performance"
  • Reboot your computer and/or router
  • Make sure you are using Game Capture in OBS.
  • Make sure you have selected a fairly nearby ingest server. You can use JTVPing to help find the optimal one or you can just select them manually.
  • If you are using a 2 core processor, turning off multicore rendering can help (or use affinity - to be covered later)
  • Check for thermal throttling with HWiNFO64 though good cooling is always advisable as by default, Intel CPUs will only go into Turbo mode if temperatures are low enough.

You can find out if your encoding settings are too high by examining the OBS log files.
OBS log files can be found by typing %appdata% in the Windows search box. Hit Enter, then navigate through Roaming > OBS > logs. It might be a good idea to create a shortcut to this folder. Just sayin'
You must close OBS to view the log for the current session. Hit Ctrl+F to open the Find box and search for "late"; you will find a section that reports frames skipped due to encoder lag, total frames duplicated, and the number of late frames. If any of these values is above 1% of total frames, your stream settings are too high. Tabbing out, messing with in-game settings, etc. can inflate these values, so testing multiple runs of at least 5 minutes is a good idea.
If you started and stopped your stream multiple times, there will be multiple sections showing duplicated/late frame information. There are time stamps to help you identify each particular section. 
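Applying the 1% rule to the counters you read out of the log is simple arithmetic. A sketch (the parameter names here are illustrative labels for the three counters, not the exact strings OBS prints):

```python
def settings_too_high(total: int, skipped: int, duplicated: int, late: int,
                      threshold: float = 0.01) -> bool:
    """True if any problem counter exceeds 1% of total encoded frames."""
    return any(n / total > threshold for n in (skipped, duplicated, late))

# 9000 frames (a 5-minute run at 30fps): 120 late frames is 1.3% -> too high
print(settings_too_high(9000, 20, 15, 120))  # True
print(settings_too_high(9000, 20, 15, 60))   # False
```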

  1. Lower your encoding quality one step but do not lower it below veryfast. If you are at veryfast or using Quicksync/VCE/NVENC then
  2. If you are using Quicksync, under Advanced options, change the Preset to lower quality (with 1 being the highest). If you are at the lowest quality Preset then
  3. Lower your streaming fps gradually. Personally I think 30fps is a minimum and if you have it set there then,
  4. Increase your resolution downscale an additional step (so from 1.5 to 2 for example). If you are at the maximum 3x downscale then
  5. Decrease in-game resolution, so if you are playing at 1080p, try playing at 720p. If it is smooth, try increasing quality gradually through reducing resolution downscale 

End Notes

[1] The reason I prefer Testmy.net over Speedtest.net is that Testmy.net is single threaded which is a more accurate representation of actual upload speed for Twitch.

Here are my results, which range from a low of 7.4 up to 23.7Mbps. My service plan is 35Mbps upload, but you can see the average is less than half that. However, if I use Speedtest, my upload is reported as the full 35Mbps. To avoid dropped frames, I should set my maximum upload to 7.4Mbps. However, as Twitch recommends 3.5Mbps maximum, I can feel confident that if I set it to 3.5Mbps, I will not drop frames because of my connection.

[2] There are a handful of streamers with very powerful setups who stream at 1080p60 medium/slower, which seems like the ultimate in quality, but it is not. ipengineer78 and Edgar stream at 6Mbps, and Widgitybear at even less. At these low bitrates, motion suffers horribly: artifacting (macroblocking, smearing, and quality pumping) is very noticeable. It is significantly reduced at 720p.

When Twitch allows 20Mbps streams, 1080p60 medium/slow will offer better quality, even for viewers watching in a 720p window, because of detail and chroma benefits from downscaling. But until then, 720p60 is the gold standard.
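The intuition behind this is bits per pixel: at a fixed bitrate, every extra pixel dilutes the bits available to encode each one. A quick back-of-the-envelope calculation, ignoring codec efficiency differences:

```python
def bits_per_pixel(bitrate_bps, width, height, fps):
    """Average bits available per pixel per frame at a given bitrate."""
    return bitrate_bps / (width * height * fps)

# At a Twitch-style 6Mbps:
bpp_1080p60 = bits_per_pixel(6_000_000, 1920, 1080, 60)  # ~0.048 bits/pixel
bpp_720p60 = bits_per_pixel(6_000_000, 1280, 720, 60)    # ~0.109 bits/pixel
```

720p60 gets more than twice the bits per pixel at the same 6Mbps, which is why the artifacting falls off so sharply at 720p.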

1080p60 @ 6Mbps will still have very good per-pixel sharpness when there is no motion, however, at any compression setting. Static elements like HUD detail are rendered exactly, which can make it seem like you are playing the game yourself, and is great for screenshots. During high-motion scenes the illusion gets shattered, and personally I think it better to have 100% of your frames at good quality than 95% of your frames at mediocre quality and 5% at high quality.

The following is a sample from Edgar's stream, which runs at 1080p60 medium/slower @ 6Mbps or so.

His game name tag is very crisp; certainly crisper than 720p would be. But would you say this is HD quality? No. 720p would be a large improvement, though even on the slow preset it would still be distinguishable from actually playing the game. Achieving the complete illusion of being in game at 1080p60 would require not just more bitrate but HEVC encoding as well, something unachievable even with the ten-thousand-dollar dedicated dual-Xeon setups the previously mentioned streamers use. Don't get me wrong, their streams are high quality, but 1080p60 is not optimal.

[3] Games will use one or two cores which drastically affects processing power available for video transcoding.

Essentially, an i3 (and most laptops) has only one core available for transcoding, which makes hardware encoding (Quicksync, VCE, NVENC) the best option there. A typical desktop i5/i7 has three cores available for encoding, which represents 200% higher performance. This is on top of the frequency advantage desktops enjoy, which can be substantial over power-saving parts.

It is simply not enough to say that an i5 is sufficient for veryfast 720p60. Someone I know streaming an FPS (CSGO) on an i5-4200U can only manage 306p30, and Savage Lands only 256p30, even with Haswell's speedier version of Quicksync. Changing from Preset 4 to Preset 7 worked and allowed 512p30 in CSGO; Savage Lands fails even at 256p30 and Preset 7.

I'll try to revise the frequency recommendations occasionally, since new x264 releases bring better performance and the ability to take advantage of newer instruction sets. In theory, x264 or Handbrake benchmarks should reflect the type of work OBS is doing, but they aren't quite the same: Haswell and later chips are better suited for streaming, ceteris paribus, on account of AVX2 instructions, and that is not reflected in those benchmarks.

Revision History

August 6 2015 - Initial post
August 7 2015 - Added Quicksync/fps specific info for high encoding problems

If you are using Quicksync, under Advanced options, change the Preset to lower quality (with 1 being the highest). If you are at the lowest quality Preset then

Added fps reduction as a step to increase streaming performance

Lower your streaming fps gradually. Personally I think 30fps is a minimum and if you have it set there then,

Recommendation for cooling on account of throttling/Turbo parameters

Sep 1 2015 - Can confirm that at 3.2GHz with a hex core, 720p60 fast works but medium skips around 15% of frames

Intel's Skylake or: How to Boil Frogs Alive

Writing about ephemera is kind of depressing. It's the permanent things, or at least the big changes, that produce a sense of meaning. A hundred years from now, people will still be picking up Marcus Aurelius' Meditations or St. Augustine's Confessions in search of meaning. It is hard to imagine anyone reading through a Linux digest flamewar over which graphics driver options for the Nvidia GeForce GT230 will produce the best frame rates for Starfox or whatever it is Linux gamers play.

Sometimes you just have to write something down even if it is never again given life by another reader because it could be. A thought written down can never be truly forgotten. Stick with your wife.

Intel's Skylake is officially out. It's largely the same story it has been since Sandy Bridge Bloomfield. Single digit IPC improvement over Broadwell which has single digit IPC improvement over Haswell which has single digit IPC improvement over Ivy Bridge which has single digit IPC improvement over Sandy Bridge which has single digit IPC improvement over Clarkdale which has single digit IPC improvement over Wolfdale which has single digit IPC improvement over Conroe which has significant double digit IPC improvement over Prescott. 

This is why every single Intel CPU review published since Sandy Bridge has a comment along the lines of "Disappointing. I guess I'll be sticking with my overclocked i5-2500k". If you read the artlessly struck-out text, you might be wondering why Sandy Bridge is the hero when Conroe saw the largest generational leap in performance. The prior Core 2 series was legendary, in particular the Q6600. But the 2500k could overclock to 4.5GHz easily, whereas the Q6600 topped out around 3.5GHz. A 30% clockspeed advantage, multiplied by several generations of small IPC improvements, a gaming environment with less extreme GPU bottlenecking, and a bargain-basement price created the legend.

Imagining a similar jump, I can guarantee you that an i5/i7 that could easily overclock to 5.8GHz today would put to rest any complaints about the languid pace of CPU improvements.

Instead, we have yet another quadcore 4.x GHz i5/i7.

The IPC gains, even with the optimistic assumption that users spend a good chunk of their time rendering 3D cinematics and transcoding video, are about 25% from Sandy Bridge to Skylake. Most people buying high-performance computers these days are gamers, and in most games they will realize single-digit gains, if any. This isn't Intel's fault, since games are mostly limited by graphics cards these days. Intel did its part, and CPU performance hasn't been a bottleneck for years. Even in unique cases like ArmA that are CPU bound, the increase is probably 15% tops, since a large portion of the improvements have been in circuitry designed for specialized workloads, i.e., AVX, which ArmA and most games do not use.
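The roughly 25% figure follows from compounding: four generational steps of mid-single-digit IPC gains multiply rather than add. A rough illustration, where the per-generation percentages are assumed round numbers, not measured values:

```python
# Sandy Bridge -> Ivy Bridge -> Haswell -> Broadwell -> Skylake:
# four steps, each an assumed ~5-6% IPC gain (round numbers, not measurements).
gains = [1.05, 1.06, 1.05, 1.05]

total = 1.0
for g in gains:
    total *= g
# total comes out to about 1.23, i.e. roughly 25% cumulative over Sandy Bridge
```

Small multipliers compound slowly: it takes four generations of 5% steps to reach what one Conroe-style leap delivered in a single generation.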

My own test with a Sandy Bridge i7 overclocked to 4.6GHz in DayZ, running around Cherno, gives a minimum FPS of about 25. Switching to Skylake would require a new motherboard and RAM in addition to a new CPU/cooler. It's really not that expensive to upgrade, even if I were to upgrade to Skylake E from my current Sandy Bridge E system, but is all that money and effort worth an increase of 4 frames per second?

On the other hand, early indications, just as they were for Ivy and Haswell, are that Skylake will be a good overclocker. The non-E Ivy Bridge and Haswell parts were plagued by Intel cheaping out on thermal interface material and a focus on power savings, which led to rather poor overclocks and essentially no improvement over a comparably overclocked Sandy Bridge system. Only with the Ivy Bridge-E, Haswell-E, and Devil's Canyon processors could one experience actual performance gains in OCed systems.

While the high TDPs for Skylake suggest a renewed focus on desktop performance, it may be that even with a very good watercooling setup, the chip just won't overclock that high. On the graphics card side of things, moving from aftermarket air to watercooling has not helped clocks for AMD's Fiji-based Fury cards, and Nvidia's Titan X/980Ti gain mere single-digit percentage improvements.

The thermal limitation might be there, but a significant realized improvement will likely be relegated to sub-zero cooling. Perhaps cascading heat pipes* into a water loop could manage it in an ATX form factor, but even if such a wonder cooler were developed, other chips could also benefit, bringing us back to looking at IPC.

In the end, I think these tiny IPC improvements with OCs in the mid-to-high 4GHz range are all that's left in the cards, and years from now, when the i5-9500k is reviewed, someone will still bring up the 2500k. By that time it will have become something of a joke, the "can it run Crysis?" of its time, though, barring a quantum leap in GPU performance, it will still ring true. DX12 and VR only exacerbate the current GPU bottleneck, which leaves Intel with very little incentive to abandon its mobile/power-saving/marginal-improvement roadmap.

These minutiae, these infinitesimal increments to improve something in a vanishingly small arena, are, however, how we got to where we are today. For all the geopolitics and academic fighting between Capitalism and Communism, the difference between them from a per-capita GDP perspective is about 3.61%.

Put that way, the entire Cold War and socialist upheavals seem misplaced, but over the course of a lifetime, it matters. If you took two kids who grew up in the ruins of Germany after the Second World War, one in West Germany and the other in East Germany (with its cooler anthem), the one in the West might be contemplating whether he should get a BMW, Mercedes, Audi, or VW, which models he would like, and inspect dozens of cars at several dealerships to decide.

The one in the East would probably still be on a waitlist for a car made of plastic and cotton.

* Just to be clear, I mean cooling the heat pipes with water not using multiple heat pipes to somehow defy the laws of thermodynamics.