Thursday, December 15, 2016

Does Radeon Chill Improve Input Lag?

AnandTech summarizes AMD's rationale in lowering frame rates during periods of inactivity as follows:
During periods of low user interaction or little action on screen the CPU and GPU power limit is reduced, causing the hardware to slow down ... one side-effect of this means that fewer frames are queued in the buffer, which AMD claims it results in a quicker response time from frame generation to frame output.
Reducing CPU power means that any ramp up from the slower state will have to be very fast in order to ensure no added latency from increased CPU time. Absent hardware-based frequency shifting, which is the case for pre-Skylake Intel CPUs, ramping up can take up to 100ms, although the time is typically closer to 20ms. So that's at least one or two frames of input lag depending on refresh rate. And aren't the frame queues around 3 at most? Or is that OpenGL only? Or was that VSync and triple buffering?
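To put those ramp-up times in frame terms, here's a trivial sketch. The 20ms and 100ms figures are from above; the refresh rates are just examples.

```python
# Convert CPU frequency ramp-up delays into frames at a few refresh rates.
# The 20ms (typical) and 100ms (worst case) figures come from the discussion
# above; the refresh rates are arbitrary examples.
ramp_delays_ms = {"typical": 20, "worst case": 100}
refresh_rates_hz = [60, 100, 144]

for label, delay_ms in ramp_delays_ms.items():
    for hz in refresh_rates_hz:
        frames = delay_ms / (1000 / hz)
        print(f"{label:>10}: {delay_ms:3d}ms at {hz}Hz = {frames:.1f} frames")
```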



Time for real world testing.

TechReport did just that and noticed frame times with Chill enabled were occasionally faster than with Chill disabled.
That result does seem to mesh with AMD's claim that Chill can improve responsiveness by keeping more of the GPU available for times when fast rendering in response to user input is needed.
But that might just be part of the story. The real test will require the mice + LED switch setup and high frame rate capture analysis. Paging systema.

Tuesday, December 13, 2016

Ryzen ... and that 6700K 4.5 test

Just watched the AMD Zen stream and of course the most telling information was what was not revealed. There was no word on price, frequencies, or single threaded performance. Not revealing the price is understandable but no information on upper frequencies or single threaded performance is unfortunate.

Intel's foundries are better, so it's hard to imagine Ryzen matching the frequencies Intel is able to achieve. However, the 95W TDP for their 8-core part is a good sign. Then again, if that is 95W of AMD TDP, then I'm not so sure. If the scaling is anything like the FX series, then a 4.7GHz base clock is going to mean 220W, which would be pretty good considering these are 8 full cores with better IPC. Even if it's higher, it's watercooling territory either way.

Single threaded performance is almost certainly worse, or else AMD would have been demonstrating a lot more benchmarks apart from Blender, where IPC appears about the same. Worst case? A Bulldozer scenario where AMD cherry-picked a few benchmarks to show parity or superiority to Intel's CPUs but underperformed in the real world.

I don't think this is the case with Zen. The Blender and x264 tests (Handbrake and Twitch) were a really good showing, not only over AMD's previous micro-architectures but also versus Intel's latest. Maybe the x264 improvements are just the result of better AVX support and Blender is a one-off? Please no.

But the most controversial moment in the stream was the demo of the 6700K @ 4.5GHz versus Ryzen.

Most commenters thought there was some rigging involved since the 6700K stream/gameplay looked like a slideshow. I don't think so. The streaming settings were almost certainly at a higher quality x264 preset or resolution/fps. Each step up in x264 quality/resolution/fps requires a large increase in CPU power, so it is easy to bog down any given setup this way.*

No one with a 4.5GHz 6700K is streaming a CPU intensive game using 1080p60 @ medium, whereas that's probably doable with the 8-core 6900K or Ryzen. So yes, the demo was rigged in the sense that streamers are not going to use settings where gameplay and stream quality are trash, but it isn't rigged in the sense that the 6900K or Ryzen genuinely can offer better quality streams than even the best mainstream Intel CPU.

But that 6700K is still going to offer better gaming framerates when not using x264. And you get the best of both worlds with a dual PC streaming setup anyway. Ryzen would be a great candidate for a streaming PC though.

Anyway, Zen seems to be as good as or slightly better than expected, but not what a lot of us were hoping for. Its success will depend on release prices and process improvements to raise frequencies. AMD needs to offer Ryzen at a much lower price than the 6900K to compensate for its likely worse frequency and likely somewhat lower IPC across most programs. None of that $900 FX-9590 nonsense.

* Stream quality increases can be done granularly with fps but less so with resolution and x264 quality.

Thursday, November 24, 2016

Happy Thanksgiving!

If you are reading this, then you probably have a computer or smartphone, electricity, shelter, food, and all kinds of things that make living in the 21st century a lot nicer than most of the other centuries.*

There'd be a lot more to be thankful for, though, if you had your computer or smartphone just twenty years ago when the whole world would have been completely dumbfounded by it. In fact, the government probably would have confiscated your computer or smartphone in the name of national security, so I guess you'd be less thankful for that.

In 1996, people still had dialup internet and a power user 0.2GHz Windows machine set you back $5,000 (almost $8,000 in 2016 dollars).

At the time, this got my heart racing

Flipping through the August 1996 issue of Boot magazine shows how far we've come but also a bit of what we've lost. It might have been part of an unsustainable tech bubble brought down by the prosaic realities of the Business Cycle, but at least we got these fevered dreams of the future. Cyberpunk, virtual reality, Skynet. There's little to be thankful about our current asset bubble. I suppose we've traded away some future prosperity to prop up the deteriorating grandeur of McMansion America. 

Even then, I think about my great uncle who was a top flight lawyer back in the Philippines. He had a car and a refrigerator back in the 1930s. That was high living even by American standards so you can imagine how impressive it was in the Philippines.** No one brags about owning a refrigerator today.

Even the first generation iPhone obliterates 1996's Dream Machine once you add in speakers, monitor, and input devices. This is why buying high quality peripherals makes more sense than buying faster hardware. Now if only someone would make a good mouse.***

Apart from IT advances, life in 1996 was largely the same, and I'd still be thankful for many of the same reasons, so the exercise gets old. You're never really thankful until you no longer have those things. And in an age of increasing abundance, the prospect of losing it all becomes rarer.

Less than a century ago, around 30% of Americans were farmers. Being thankful for a good harvest was near universal, and that sense of privation and bounty was a natural incentive to the formation of good people. Voluntary privation, whether it's fasting or cutting back or even going camping, is a way to bring that sort of character building back. There's some sense of that in Survivor or disaster-themed shows which, at least personally, is part of their appeal; though I would never wish disaster on civilization just to make people feel thankful later on – let alone for the economically dubious reason of increasing demand to help the economy. 😆😆😆****

I really, really, really, like the face with tears of joy emoji but Microsoft's version is poorly done (😂??) so here's a superior version I found online.


* Or maybe you don't have any of those things and are a homeless guy reading this at a library but if I were homeless I would be loading up on turkey at the local shelter since Thanksgiving is one of the few days when society shifts focus somewhat toward the less fortunate.

** He bought into the fast life and lost it all.

*** Low input lag, firmware configured settings, durable finish, easy to clean, custom ergonomics, no sensor issues like angle snapping or acceleration, repairability, quiet clicks. As a lefty, my choices are sharply curtailed and I'm using the Logitech G900 after having gone through two faulty Logitech mice and the infuriating Razer software. The G900 hits most marks but: isn't easy to clean, its finish is starting to wear, probably difficult to repair, and sounds like a nail clipper with every click.

**** Yeah, couldn't help taking a swipe at Krugman and his Keynesian buddies but I can't help feeling perversely thankful for the Establishment getting taken down several notches this election year. They haven't given up (and why would they? The odds favor the house at the end) but if the corporate tax rate is reduced to 15%, that will provide favorable conditions for entrepreneurship – the key reason we have all these things to be thankful for in the first place.

Tuesday, November 15, 2016

Methodology done RIGHT

"Pick your battles"
Some battles are a lost cause. Decimate is beyond saving. The historically specific meaning about the Roman practice will probably stick around in specialized discussion, but the proper general meaning of "a 10% loss" isn't coming back.

The same is probably true of methodology and problematic. I only complain about these because I see them used incorrectly all the time. I'm being a hypocrite here because I am a bad writer, wont to begin sentences with "and", to use sentence fragments intentionally, and to make all kinds of other writing mistakes.

But a tech article this morning got it half right, and that ain't bad. The writer still conflates method and methodology here:
I've been working behind the scenes on a radically new test methodology.
Yet what follows is a discussion of why he believes some methods are better than others, and why a particular method was chosen for his review. In other words, actual methodology. This made me very happy! The article is a review of the Samsung 960 EVO SSD, which is of limited interest, but within his methodological discussion is this gem:
 Analyzing trace captures of live systems revealed *very* low Queue Depth (QD) under even the most demanding power-user scenarios, which means some of these more realistic values are not going to turn in the same high queue depth ‘max’ figures seen in saturation testing. I’ve looked all over, and nothing outside of benchmarks maxes out the queue. Ever. The vast majority of applications never exceed QD=1, and most are not even capable of multi-threaded disk IO. Games typically allocate a single thread for background level loads. For the vast majority of scenarios, the only way to exceed QD=1 is to have multiple applications hitting the disk at the same time, but even then it is less likely that those multiple processes will be completely saturating a read or write thread simultaneously, meaning the SSD is *still* not exceeding QD=1 most of the time.
I will admit to confirmation bias here since I've long believed QD1 to be the most important SSD metric. And until developers start paying attention to multi-threaded disk IO, QD1 will remain important in much the same way CPU frequency and memory latency are generally more important than CPU cores or memory bandwidth.
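If you want to see what a QD=1 workload actually looks like, here's a minimal sketch: a single thread issuing one 4K read at a time against a large file. It's illustrative only; the file path is a placeholder and, unlike fio or CrystalDiskMark, it doesn't bypass the OS cache, so the test file needs to be much larger than RAM.

```python
import os
import random
import time

# Minimal QD=1 illustration: one thread, one outstanding 4K read at a time.
# TEST_FILE is a placeholder. Unlike real benchmarks, this does not bypass
# the OS cache, so the file should be much larger than installed RAM.
TEST_FILE = "testfile.bin"
BLOCK = 4096
READS = 10_000

size = os.path.getsize(TEST_FILE)
fd = os.open(TEST_FILE, os.O_RDONLY | getattr(os, "O_BINARY", 0))

start = time.perf_counter()
for _ in range(READS):
    offset = random.randrange(0, size // BLOCK) * BLOCK   # 4K-aligned offset
    os.lseek(fd, offset, os.SEEK_SET)
    os.read(fd, BLOCK)    # blocks until the read completes: queue depth stays at 1
elapsed = time.perf_counter() - start
os.close(fd)

print(f"4K QD1 random read: {READS * BLOCK / elapsed / 2**20:.1f} MB/s")
```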

* Ironically, people who think the incorrect use of decimate should be acceptable like to defend their illogical view by accusing their opponents of employing the rather logical sounding etymological fallacy, which, Wikipedia tells me, is a species of genetic fallacy.

But these sorts of fallacies are informal ones, i.e., undeserving of sharing even part of the certitude that "fallacy" conveys in formal logic.



Wednesday, November 9, 2016

Things to look forward to in 2017

Thankfully the election is over! For the amount of time people, myself included, devote to this thing, the actual impact the president has on most people's lives is very low. Ask yourself how Obama's eight years have affected you personally. Me? I've had to pay fines for not having health insurance. Thanks Obama.

But here are some cool things that will probably have more effect on my life (and maybe yours too if you are a filthy computer nerd) next year than the president ever will! That is, unless the Draft is reinstated; one reason to prefer candidates who want isolationist peace.

OLED displays

eBay was running a deal for a 55" 4K OLED TV for $1500 a few days ago. So I think that's a sign of things to come. Back in 2008 I was predicting that OLED prices would quickly drop from the stratospheric $2,500 Sony was asking for an eleven inch OLED TV and dominate the display market by 2010 with cheap, huge, flexible screens.

Boy was I wrong. But these things are finally getting cheap and are the very-best-like-no-one-ever-was of display technologies. The deepest blacks, super low persistence, outstanding viewing angles and colors. TN, IPS, VA, and even Plasma/CRT, have too many compromises that OLED takes care of. OLED might degrade faster than other technologies and use more power in some situations, but it's a small price to pay for picture and motion quality supremacy. As a parenthetical, even though OLED is capable of very high brightness, I'm not sure how well that brightness holds up in low persistence modes, e.g., most strobing monitors have drastically lower brightness in strobe mode although EIZO seems to have figured out a way to mitigate the issue with its Foris monitors.

Overclocking


Intel got a bit of its OC groove back with the Haswell Refresh, but this year's Skylake and next year's Kaby Lake look to be a firm step back toward good overclocks. The mainstream Haswell i7 averaged about 4.47GHz, whereas Refresh and Skylake average 4.65GHz on air.

Kaby Lake is clocking in at 4.2GHz stock turbo boost frequencies, so I'm really hoping 5GHz chips become a regular feature on Silicon Lottery.

Speaking of which, Silicon Lottery now offers delidding for the i7-E CPUs (which already use a higher quality TIM, i.e., Thermal Interface Material, than the normal i7). The thermal advantage isn't as large as, say, delidding a 4770K, but it looks to be nearly 10C. Impressive! I don't know about the price, but removing the solder-based TIM from an enthusiast i7 is a lot more involved. It's still worth it, I think, given how long we are holding on to CPUs these days.

An easier route to get cooler temps? More people are experimenting with cooling the underside of the processor with basic heatpads, heatsinks, and fans. It should be fairly simple and inexpensive while offering several degrees of additional cooling. It makes a lot more sense for motherboard and case manufacturers to accommodate this rather than put features geared towards LN cooling and external radiator setups on power user products.

Alphacool has developed a replacement for the venerable D5 pump which promises to offer the same performance with lower noise and vibration.** Pump noise drives me nuts and is one of the two things I hate about watercooling. Maintenance is the other, though the All-In-One (AIO) units are getting really good now; if Alphacool creates a premade AIO with this new pump, it'll go into my next build. Then again, it might just be on par with the D5, so this sounds like a job for SilentPCReview.

Intel's fancy new Turbo Boost 3 (TB3) seems to be a way to realize gains from per-core overclocking. Right now it's only on their latest i7-E chips, but that's where it makes the most sense. Usually, as you add CPU cores, the top frequency drops because people limit their overclocks to the weakest core. It's not only easier to test all the cores at the same time for stability, it's also a pain to make sure the different required voltages are supplied properly and that core affinities are set optimally. Even with core affinity automation software like Process Lasso, I read somewhere that a given core assignment doesn't always correspond to the same physical core.
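For reference, the manual version that Process Lasso-style tools automate boils down to something like the sketch below, using psutil. The process name and the "best core" index are assumptions you'd have to fill in yourself, and as noted, the OS index may not map to the same physical core every boot.

```python
import psutil

# Pin a demanding process to one logical core - the manual version of what
# Process Lasso automates and what TB3 tries to do in hardware. BEST_CORE and
# PROCESS_NAME are assumptions: you have to find your strongest core yourself,
# and the OS core numbering may not be stable across reboots.
BEST_CORE = 2
PROCESS_NAME = "game.exe"

for proc in psutil.process_iter(["name"]):
    if proc.info["name"] == PROCESS_NAME:
        proc.cpu_affinity([BEST_CORE])    # restrict scheduling to that core only
        print(f"Pinned PID {proc.pid} to logical core {BEST_CORE}")
```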

TB3 fixes that by uniquely identifying each core according to their quality and automatically assigning the most intensive threads appropriately. In theory, that means that buying a processor with more cores should mean a higher top frequency because the chances of getting a golden core improve with more cores.

But my experience with Intel's software has been: just wait for the next version – if there is a next version. It's a step in the right direction though.***

Caching Hardware

The Integrated Memory Controller (IMC) on Intel's chips seems to have been refined bigly with Skylake. It's still the silicon lottery with both processor and RAM, but Silicon Lottery also tests IMC strength – for a price.

Speaking of RAM, 2016 saw the widespread adoption of Samsung's high performance B-die in DDR4 sticks. Perfect for the beefier Skylake IMCs (though a quality motherboard is also required to achieve good overclocks). Synergy.

Optane should be hitting the market in 2017 along with appropriate supporting hardware. It's meant to be much faster than SSDs, though early benchmarks give me the suspicion that this might end up RAMBUS-tier. But if it can deliver the low queue depth 4K goods, then sign me up. If not, Samsung's recently released 960 Pro should be on the short list of any 2017 build. Unlike most SSDs, the 960 Pro (and its OEM predecessor, the SM961) needs some kind of extra cooling to keep from thermal throttling, something I take perverse delight in.

AMD Zen


Zen is probably not going to be as fast as Intel's best. So 2017 will probably see the prices on Intel and nVIDIA's flagships creep up. Most people don't buy the top of the line CPU or GPU, but competition at the top means fiercer price competition at all levels. If AMD can produce a $1000 10-core CPU that outperforms Intel's $1700 10 core CPU, not only does Intel's 10 core CPU price get slashed to around $1000, but it also means Intel's $1000 8 core CPU gets a price reduction and so on.

Hope springs eternal, but maybe it's a return to Athlon form. Zen, like Athlon, is an elegant architecture name. Note to AMD, names like: Hammer, Bulldozer, Piledriver, and Sledgehammer, don't really make sense if the product performs worse than the competition. I don't know what I'm going on about but I do know what I want. And that is ... for Jim Keller to Make. AMD. Great. Again.^

1080Ti

Pretty sure Jim Keller's pixie dust didn't land on the ATI team, so Vega 10 is probably going to be a bust. It might be a good product, but I think nVIDIA has already telegraphed, via locked voltages on its newest cards, an intention to drop a slightly weaker Titan Pascal with whatever clocks are necessary to beat Vega 10. Even if Vega 10 wins in DX12 performance, that (still) won't be relevant in 2017. On the other hand, ATI vs nVIDIA has had a lot more back and forth than AMD vs Intel.

VR

All indications are that Sony's PS4 VR is a great product and the Chinese are going to have their Vive/Rift clones out soon which will be good for everyone. The physical and spatial aspects that VR adds could be a powerful adjunct to meditation and learning. It's the Wild West out here and it's wonderful.

End notes:

** Aquarium filter manufacturers are you listening?

*** Intel has this legendary reputation in hardware that makes its software shortcomings hysterical. On reflection, Intel hasn't been infallible at all but there's this kind of reverence computer people have toward the company. Like pre-2008 Greenspan.

^ I know he left but writing "Jim Keller to have 'Made AMD Great Again'" is awkward

Tuesday, November 8, 2016

Gaming the vote

A few weeks after I sent in my ballot, the election commission sent me a letter asking me to update my signature. It has apparently changed from the time I registered. The only time I'd bother to update that signature is in the Nozickian situation where my vote actually did count, i.e., I am the tie-breaker.

Statistically, it's unlikely.

But is the statistical unlikelihood of being the tie-breaker a mathematically compelling argument against voting? I have my doubts. Everyone else is a potential tie-breaker too, so the "correct" strategy is for no one to vote. But as the number of voters goes down – assuming a homogeneous reduction – the likelihood of being the tie-breaker goes up.

What about a heterogeneous reduction in the numbers of voters? If it wasn't obvious, the decision of whether to vote has some Prisoner's Dilemma type aspects. Deciding to vote carries with it a burden of having to pay at least some attention to political ads, arguments, and all the other things that make the election season wearying.

Imagine two Nozickian voters on opposite sides, Red versus Blue. If Red votes but Blue does not, Red wins, and vice-versa. If both vote, it's a tie, but both will have wasted time and money in the process. If neither votes, neither will have wasted time and money. The stable "intelligent" choice is to not vote. But if the policy differences appear large enough, the payoff for coöperating becomes insignificant.
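To make that concrete, here's a toy payoff table. The numbers are entirely made up, with W the value of your side winning and C the cost of voting and paying attention to an election.

```python
# Toy payoff table for the two Nozickian voters (Red payoff, Blue payoff).
# W = value of your side winning, C = cost of voting/paying attention.
# All numbers are made up for illustration.
def payoffs(W, C):
    return {
        ("vote", "vote"): (-C, -C),       # tie, both paid the cost
        ("vote", "stay"): (W - C, -W),    # Red wins
        ("stay", "vote"): (-W, W - C),    # Blue wins
        ("stay", "stay"): (0, 0),         # tie, nobody paid anything
    }

# Apprentice vs. First Lady: winning is worth less than the hassle, so
# staying home is stable (and efficient).
print(payoffs(W=0.5, C=1))

# Literally Hitler vs. In Your Heart You Know She Might: winning swamps the
# cost, so both sides vote even though (stay, stay) would have left everyone
# better off than (vote, vote).
print(payoffs(W=1000, C=1))
```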

If this election's presidential contest were between The Apprentice Trump and First Lady Clinton, no one would vote. But when it's between Literally Hitler Trump and In Your Heart You Know She Might Clinton, the payoff is tantamount to saving the world. We saw this earlier with Brexit where I could not understand the hysteria over a largely inconsequential vote.

Although Trump is definitely the non-establishment protest figure in the election (and the candidate I would expect to win), neither offers anything radically different*. So the stable intelligent choice is still to not vote. However, if there is a meaningful difference in intelligence between Red and Blue, then the stable intelligent choice becomes unlikely because less intelligent people are less coöperative. Thus, given unequally intelligent players, the correct strategy is to vote even if the outcome is almost guaranteed to be less than optimal.

* Principled to the point of consistency. The only believable differences are a stricter immigration policy and lower corporate taxes on Trump's part.

Monday, October 31, 2016

Poetry

Completely unintentional but when I noticed, I just had to take a screenshot. I figured today would be a good day to share it. R U SPOOKED?



Wednesday, October 26, 2016

Tired are your splendid soldiers – To the future turned, we stand!

Just finished the third season of Black Mirror, a hip, smart, and modern* take on the Twilight Zone/Outer Limits. American adaptations of foreign programs often lose a lot in translation, but Netflix has done a great job of keeping the general feel of the British originals, e.g., minimalist pastel aesthetics, slow-motion-backed-by-ostinato-soundtrack sequences, etc.** while integrating certain themes unique to the American milieu.

It's got technological alienation and dehumanization, The Singularity, emulated minds, big data, social media, law, politics, AI, robotics, augmented/virtual reality! It fills a niche that has been empty since Star Trek: The Next Generation (sorry, Fringe doesn't count). It's all there. Almost.

However, the dystopian outlook toward technology is predictable. This series really could have used input from someone like Robin Hanson to provide depth. Not balance, but depth; though a portrayal of the genuine benefits of future technology would go some way toward providing that depth.

* I.e., ticks all the focus group approved boxes

** Broadchurch also did this. Maybe it's a British thing.

Monday, October 17, 2016

Imperium Mendaciorum

I like NPR, or at least I used to when I was growing up. There was a lot of unique reporting that you couldn't find anywhere else – stuff like flute making in Kazakhstan.

"We're here in a remote village on the outskirts of Astana, Kazakhstan with Kiril Kurobayev. [market chatter, distant muezzin, pots clanking] Kiril lives with his wife and three children and extended family ... [Kazakh dialogue fading to translator]"

Basically a science fiction short story, except it was real. The previous excerpt wasn't real; I just made that up for effect. Now NPR mostly pushes a hard left line. Not that it wasn't always leftist; it was. But it wasn't always so overt.* That used to be the style of right-wing radio. Good ol' right-wing talk radio. Entertaining in its own way. NPR might report about the four trillion dollar National Debt (under Bill Clinton), but Rush Limbaugh could make you get mad about how high it was getting.

Twenty years later, NPR is doing its best to make you mad about not increasing the National Debt past the nineteen trillion or so it is now. And the methods that NPR and the normally left-leaning media use are veering away from their older dispassionate stance toward emotional "win at all costs" propaganda. Witness this bit of insight from a recent Intelligence Squared debate:
Ben Domenech:
... the things that were printed about Mitt Romney in 2012 in the New York Times by Paul Krugman called him a "charlatan," "pathologically dishonest," "untrustworthy." He said he didn't even pretend to care about poor people, that he wants people to die so that rich people get richer. "He's completely amoral, a dangerous fool, ignorant as well as uncaring." 
Male Speaker:
Sounds familiar.
[Laughter] 
Ben Domenech:
If you cry wolf long enough, sometimes the beast actually shows up, okay?
And when that happens, they no longer had a vocabulary that could be used, because everyone tunes them out and says, "Well, you were saying that about this nice Mormon businessman, you know, four years ago."
That's Paul Krugman, Nobel Prize winner in economics writing in the most prestigious newspaper in the US and possibly the world! I don't know if the WSJ said similar things about Obama, though I wouldn't be surprised.

But I'd be doubly unsurprised if Wikileaks showed active collusion between Krugman and the Obama campaign the same way the recent Wikileaks releases of the Podesta e-mails show the New York Times (et al) and even the Justice Department colluding with Democratic Party hierarchs to help Hillary Clinton.

It's not all bad since these releases presumably contain the truth, or at least what people really think is the truth. I'd go for a jab about private versus public convictions, but I don't think it's a big deal. It's even SOP in Japan and probably most European countries too. Except France, judging by the etymology of frank.

Most interestingly, the releases support a hunch I've had regarding our economic health: things are worse than the headline figures suggest, and labor participation rates, contrary to what Matt Phillips over at Quartz believes, support this idea. But I'll spend some time showing why he's wrong in greater depth later.

* About the only NPR program worth listening to these days is Goats and Soda.

Also, why have the TED talks gone from introducing the public to genuine innovation and novel insights to vapid vaguely leftist pop-psychology? Everything else gets dumbed down so you'd expect the niche that the TED talks provide to flourish.

Wednesday, September 21, 2016

Windows 10 (is) for Dummies

So I get on my computer and Windows has decided to update itself unilaterally. It wouldn't be so bad if it restored the state of my desktop instead of resetting it - i.e. the equivalent of a janitor completely clearing your desk so you have no idea where you left off when you come back to work in the morning. On top of that, this "Anniversary" edition update also reset my desktop background.

Why? I know it's simple to restore but if Microsoft couldn't get that detail right, who knows what else they might have broken under the hood. It's the sort of thing that makes me want to reformat. But I'd still reinstall with Windows 10 because Windows 7 is no longer supported and Windows 8 will no longer have support in a few months. Similarly, software developers are now mostly working in a Windows 10 environment which means higher compatibility and lower risk for end users using Windows 10.

It's not all bad. With Windows 10, when I click on a file and press delete, it no longer asks whether I really want to send the file to the Recycle Bin. I can also have variable window sizes and have Windows automatically resize another window to fit the remaining desktop space. I'm sure those were possible in previous versions, but any improvement to default behavior is welcome. There's also DirectX 12, but by the time that becomes relevant, there will probably be a new version of Windows out.

Does that outweigh the increased telemetry and obnoxious update behavior of Windows 10? Privacy and a predictable user experience are important to me so if Microsoft - and Google - continue as they have been, I'm seeing Linux and Blackberry in my future.

The computing environment when I was growing up was basically: programs do not do anything without your permission. Rebooting my computer without my permission and sending info about how I write to the oh-so secure and reliable cloud is something that used to be called malware.

Wednesday, August 3, 2016

Two Studies Walk Into a Journal

The first study finds that being overweight does not increase the odds of a heart attack or death. The second study finds people who get their protein from animal sources are more likely to get a heart attack or die.

The first study was based on observations from identical twins* whereas the second, based on food questionnaires and follow-ups, is worthless. But there's little doubt which will win the mindshare battle. The former I read about in an extremist crank's blog, whereas I was made aware of the study about beans and tofu being superior through a fashionable technology blog; I can already hear the NPR segment and see the fluff piece in a checkout stand magazine and the Reddit post "source".

As a former science teacher, I've come to regret the popularization of science, which has dumbed science down rather than improving the scientific literacy of the interested population. At first glance, I figured that a more approachable and entertaining look at concepts could only be for the better, but I now believe that in the long run, the fun Vsauce-style approach enervates rigor.**

* Identical twin studies ought to be the baseline for any epidemiological study. Studies based on questionnaires are literal trash.

** This despite what The Atlantic suggests regarding the popular channel. But The Atlantic has also been garbage for the past few years so no surprise. I resubscribe every now and then hoping it's stopped, or slowed, the grinding of its ideological axe. But no. America's balkanization is real. Inevitable.

Thursday, July 28, 2016

IPC redux

Anandtech brings us an article today celebrating the 10th anniversary of Intel's Core 2 Duo, a processor line that brought a return to Intel's technological dominance over AMD. I've already looked at processor growth over the years, so it's not really new to me. It's a good article though, and Ian's articles are some of the best (although Anandtech without Anand Lal Shimpi is like Apple without Steve Jobs).

His dataset for the graphs makes the apparent case that there's been a real and steady progression in performance when there really hasn't. He's very much aware of the IPC stagnation over the past decade, so it's mainly just a confusing choice of samples.

"Here's this new processor and here's this 10 year old one and look at all these others that fall in between"

But I figured I'd revisit the topic to clarify my own thinking.

Here's a table that I believe expresses the state of CPU advancement more accurately. The data reflect how processors perform on an unoptimized version of one of the oldest benchmarks around, Dhrystone. The Dhrystone program is tiny, about 16KB, and basically eliminates the effects of large caches and other subsystem trickery; it's meant to measure pure CPU integer performance, which is the basis of most programs.

Maybe I've posted this table before. I can't be bothered to check and halt this stream of consciousness. Dhrystone data were gathered from Roy Longbottom's site.

Year | CPU | IPC D2.1 NoOpt | NoOpt IPC chg | Yearly | MHz | Clock change | HWBOT OC | IPC * Clock change (OC) | Yearly
-----|-----|----------------|---------------|--------|-----|--------------|----------|-------------------------|-------
1989 | AMD 386 | 0.11 | | | 40 | | 40 | |
1991 | i486 | 0.19 | 66% | 33% | 66 | 180% | 112 | 365% | 182%
1994 | Pentium | 0.25 | 34% | 11% | 75 | 108% | 233 | 179% | 60%
1997 | Pentium II | 0.45 | 80% | 27% | 300 | 124% | 523 | 304% | 101%
2000 | Pentium III | 0.47 | 3% | 1% | 1000 | 206% | 1600 | 214% | 71%
2002 | Pentium 4 | 0.14 | -70% | -35% | 3066 | 166% | 4250 | -19% | -10%
2006 | Core 2 Duo | 0.52 | 268% | 67% | 2400 | -9% | 3852 | 234% | 58%
2009 | Core i7 930 | 0.54 | 4% | 1% | 3066 | 9% | 4213 | 14% | 5%
2012 | 3930K | 0.51 | -5% | -2% | 4730 | 11% | 4680 | 5% | 2%
2013 | 4820K | 0.52 | 0% | 0% | 3900 | -1% | 4615 | -1% | -1%

"IPC D2.1 NoOpt" is a relative measure of how many instructions per cycle each processor is able to complete. Although there are optimized versions of the Dhrystone benchmark for specific processors, it's important to avoid those in a comparison looking purely at CPU performance. Optimizing for benchmarks used to be common practice and definitely gives a cheating kind of vibe.

Looking at the IPC, there were enormous gains in the eighties and nineties where each new generation could do significantly more work per cycle. A 286 to a 386 to a 486 were all clear upgrades even if the clock speed was the same (all had 25MHz versions). This pattern was broken with the Pentium III, which was basically Intel taking advantage of the public's expectation that it would outclass the previous generation. I know I was less than thrilled that many people enjoyed most of the performance of my pricey hot-rod Pentium III at a fraction of the cost with the legendary Celeron 300A.

The Pentium 4 went even further, such that someone "upgrading" from a Pentium III 1GHz to a Pentium 4 1.3GHz (a P4 1GHz did not exist) would actually end up with a slower computer. And interestingly, the much vaunted Core i-series did not bring that much to the table over Core 2 in terms of IPC.

The Pentium III and Pentium 4 acquitted themselves with huge increases in clock speed, the marquee feature for CPUs, and it appears that the Core i-series followed the same pattern. But when you look at the clock speeds that the large pool of overclockers average over at HWBOT, we see that the Core 2 Duo was capable of some very good speeds. The HWBOT (air/water) scores are useful in finding the upper limits to clock speed for a given architecture rather than the artificial limits Intel creates for market segmentation.

There are many people still using Core 2 Duos and for good reason. While it is certainly slower than the Core i-series, it's not too far off
... particularly if human perception of computing speed is logarithmic as it seems to be for many phenomena.
There are many reasons to go for newer architectures, e.g., lower power use, higher clock speeds for locked CPUs, larger caches, additional instruction sets, Intel's lock-in philosophy, etc. In retrospect, we can see the characteristic S-curve (or a plateauing if you are looking at the log graph) of technology development, so it all makes sense - the Pentium 4 being a stray data point. An aberration. Something to dismiss in a post-study discussion or hide with a prominent line of best fit. There, so simple. Companies pay millions to frauds like Gartner Group for this kind of analysis! But you, singular, the reader, are getting it for free.

At the time, however, it seemed like the performance gains would go on forever, with exotic technologies neatly taking the baton, whereas the only technologies I see making a difference now - at least for CMOS - are caching related. Someday CPUs will fetch instructions like Kafka's Imperial messenger, forever getting lost in another layer of cache. But if Zen turns to Zeno, it won't matter; Dhrystone will expose it all.

Kurzweil talks about S-curves being subsets of other S-curves like a fractal of snakes and maybe that's where we're at. Or is his mind the product of the 1970-2000 period of Moore's Law, forever wired to think in exponential terms?




Wednesday, July 27, 2016

Improving Game Loading Times

Faster computing is a game of eliminating bottlenecks. Every component in a system is waiting for something, whether it be the results from CPU calculations, a piece of information from memory, storage, or the network, or even just input from the user.

Ideally, the computer would always be waiting on the user rather than the other way around. For the most part, today's computing experience approaches that ideal. This is why it's all the more jarring when you do have to wait, which is often the case with large games.

For game loading, a common piece of advice to improve load times is to get an SSD if you are using a regular hard drive as storage. And it definitely helps.

Regular hard drives are so much more sluggish that replacing them with SSDs improves the general responsiveness of computers more than just about any other upgrade. And for game loading times, it makes sense that faster storage devices lead to faster loading times. But at some point, storage devices will become so fast that they will no longer be the bottleneck.

It turns out that if you have an SSD, you are already there, because even if you increase the speed of your storage device by an order of magnitude, as is the case with RAM drives versus SSDs, game loading times are basically unchanged.**

Why?

For many programs, the bottleneck moves back to the CPU and the rest of the system. Rather than the CPU waiting on storage, it's the user waiting on the CPU to process the instructions that set up the game. To demonstrate this, I clocked my CPU at 1.2, 2.4, 3.6, and 4.8GHz, and then measured initial and subsequent loading times for Killing Floor 2.*


Although it is clear that a faster CPU helps loading times, the benefits become smaller as CPU frequency increases - even from a percentage change perspective. The load time of the 2.4GHz run was 62% of the 1.2GHz run despite its 100% clock speed advantage, and the load time of the 4.8GHz run was 71% of the 2.4GHz run despite its 100% clock speed advantage. In addition, it is exponentially more difficult to increase CPU frequencies. Thankfully, overclocks in the low to mid 4GHz range happen to be the sweet spot for the processors Intel has released over the past few years, so most of the load time benefits can be realized by those with an overclockable system or the latest processors, e.g., the 4790K turbos to 4.4GHz and the 6700K boosts to 4.2GHz.
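A simple way to see where this is headed: model the load time as a fixed (storage and everything else) portion plus a CPU-bound portion that scales with 1/frequency, and fit it to the normalized ratios above. A sketch, using the percentages quoted in the text rather than the raw stopwatch times:

```python
# Toy model for the diminishing returns described above: assume load time is a
# fixed (storage/IO) part plus a CPU-bound part that scales with 1/frequency,
#   t(f) = a + b / f
# Fit a and b to the normalized ratios quoted in the text (the 2.4GHz run took
# 62% as long as the 1.2GHz run); the raw stopwatch times are not shown here.
f1, f2 = 1.2, 2.4
t1, t2 = 1.00, 0.62          # normalized to the 1.2GHz run

b = (t1 - t2) / (1 / f1 - 1 / f2)
a = t1 - b / f1

for f in (1.2, 2.4, 3.6, 4.8):
    t = a + b / f
    print(f"{f:.1f}GHz -> predicted load time {t:.2f}x the 1.2GHz run")

# The model predicts ~0.43x at 4.8GHz versus the measured ~0.44x (0.71 * 0.62),
# and the floor 'a' (~0.24x) is the part no CPU overclock can touch.
```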

The second load runs were performed to see the effect of the Windows cache. These runs were about 15 seconds faster across all CPU frequencies. If having parts of the program preloaded into memory can save that much time, it makes me wonder why RAM drives don't perform better. I have a feeling that overhead from the file system might be to blame. Or maybe game resources sit unpacked in the Windows cache whereas they still need to be decompressed when read from a RAM drive. I'd have to unpack the files, if they even are compressed, to test the theory. That would also have the benefit of shifting some of the burden off the CPU and onto storage. Right now my Killing Floor 2 directory takes up 30GB, so storing the assets unpacked could easily balloon it to a size where the load time savings aren't worth the precious SSD space. It's worth trying someday.

In any case, the Windows cache after running Killing Floor was 3GB. If all of that represents game assets loaded straight from disk, then that accounts for at least six seconds of the 15 - probably more given the maximum read rates for the SSD of around 200MB/s during loading and the important 4K QD1 performance of typical SSD drives, like mine, which is about 29MB/s.

Benching my SSD (Seagate 600 Pro 240GB SSD)

Then again, almost all online RAM drive vs SSD game loading time comparisons suggest this is not the case.

4K QD1 read performance is important because games, and most typical programs, mostly do low queue depth accesses. Resource Monitor showed a queue depth of less than 3 during loading. This type of workload is tough to optimize, and even the fastest multi-thousand dollar NVMe PCIe datacenter SSDs are no better than a decent consumer drive using old fashioned SATA. The 4K QD1 figure in this case may be a red herring, though, given the lack of a RAM drive advantage even on a 4.5GHz 2500K.

As an aside which I'm not really going to separate, the best SSDs, like the Samsung SM961, can do 60MB/s for 4K QD1. That's a very good result and is basically what the ACARD ANS-9010, a SATA based drive that uses much faster DDR2 DRAM, could do (63MB/s or 70MB/s or 55MB/s depending on who you ask). On the other hand, it shows just how much overhead can hamper performance. This user was able to get double or triple the performance (130 to 210MB/s) on DDR2 and a Core 2 Duo with a software RAM drive. I don't know if it's SATA overhead or what, but that's a very significant hit.

Now if only someone would make a PCIe drive using DDR3...

RAM drive bench on my system (3930k 4.6GHz DDR3-2133)

Imagine what a newer computer with DDR4 4000+ could do! (It'd probably fill up the bars) But I'd still rather have a hardware solution over software. That's how I feel about all tasks where there's a choice between hardware and software. REAL TIME. DEDICATED. GUARANTEED PERFORMANCE. - not - "your task will be completed when the Windows scheduler feels like it and as long as the hundred other programs running play nice with each other"

Anyway, performance monitoring software revealed some other interesting facts during the test. Even on the 1.2GHz run, the maximum CPU thread use was 80% even though it is clear that the CPU speed was constraining the load time. (It was 60% for the first run at 4.8GHz, 45% for subsequent runs.) The unused CPU capacity might be the result of a race condition, but I think it's safe to say there is room for optimization on the software or hardware side.**

But it's completely understandable if game loading time is very low on the list of developer priorities. 

* From starting the program to firing first shot with loading screens disabled using the
"-nostartupmovies" launch option. This actually saves a good amount of time.

** I tried changing core affinities and counts, priority, hyperthreading, RAM speed (1066-2133), and there were no changes. I'm using a stopwatch so there might have been advantages but nothing like the effect CPU frequency had.

Wednesday, July 13, 2016

Black Friday in July??

The Friday after Thanksgiving, Black Friday, is supposedly a shopping festival with Bacchanalian levels of chaos. Stampedes, people camping in tents outside stores, fights - all of which probably get retransmitted around the world to help shape others' image of Americana. But efficient markets have largely neutered whatever actual savings people might find on Black Friday, with stores only stocking enough doorbuster items to stay out of legal trouble and online retailers finding their headline sales out of stock and on eBay in seconds.

Black Friday was never great and will only fade into greater obscurity as preference for online shopping grows. So why, in the name of the Senate and American Republic, am I getting ads for Black Friday in July? Its excess is comedic, like the Thursday Afternoon Monday Morning podcast or the Spishak Mach 20. Naturally I wonder if we'll be seeing a Black Friday in July Weekend Extravaganza! As an inveterate capitalist, I've made my bed. Ear-splitting advertising noise is the will of the Market, the Market from which all First World problems flow.

Speaking of First World problems, my quick HTC Vive review:

Cons

  • very visible screendoor / low resolution
  • grainy display with poor black levels for an OLED panel
  • visible Fresnel lines and chromatic aberration
  • low FOV
  • heavy and stuffy headset
  • sometimes tricky setup
  • huge hardware requirements if you want to use supersampling
  • glitches with lighthouse tracking
  • quality at periphery is not good
  • cables
  • desktop mode is difficult to use and has a lot of latency
  • finicky adjustment for each person's eyes
  • the shipping box was an order of magnitude larger than it needed to be
  • Room-scale requires clearing a large chunk of space and applications that use it always leave you wanting more
  • Lighthouse boxes emit a motor whine. Not a fan of moving mechanical parts.
Pros
  • tracking precision is very good when it works, which is most of the time
  • responsiveness should stave off motion sickness for most
  • colors and dynamic range are good
  • controllers are excellent
  • SteamVR integration is well done
  • overall impression of quality materials
  • Lighthouse "room-scale" makes this clearly superior to the Rift in applications that can use it although the Rift is lighter and has better optics
All in all, contemporary VR is an amazing achievement. Lighter headsets and better lenses are probably on the way along with more wireless parts but improving the big visual cons, i.e., low resolution, supersampling, and low FOV, will require an enormous increase in graphics power. Right now I'm using an overclocked 980 Ti, but supersampling will likely require next year's 1080 Ti. Possibly two if NVIDIA decide on the typical 20% improvement instead of the 50% improvement they did as a kind of one-off with the 980 Ti to deflate AMD's Fury launch. But it seems that by locking down many overclocking voltage settings with the current 1070/1080, NVIDIA is keeping plenty of performance bottled up should they need to counter a big AMD launch again.

Wednesday, May 25, 2016

The End of Moore's Road: Sensor Edition

So I've been looking for a new camera, one that can record 4K 60fps video. It doesn't exist unless you count the $6000 Canon 1DX Mark II which is huge and weighs a ton. It's hard to go back to 1080p after seeing the rich details 4K offers but it's also hard to go back to 30fps after watching the smooth motion 60fps is capable of.

Right now you can get 4K or 60fps, but not both, despite the fact that there are a number of relatively inexpensive ~$1000 cameras that can do near native 1080p @ 240fps. The bandwidth and processing requirements are similar, but companies don't seem to see the need. Oh well. It's hard to choose between 4K and 60fps even though either would be an improvement over my current 1080p30 setup. I guess that makes me Buridan's Ass.*

Anyway, one thing I discovered during my search is that image sensor quantum efficiency is around 60%.

Wow! This means that there is under a stop of high ISO performance left before nature itself places a hard limit on improvement. This milestone sort of snuck up on me even though high ISO performance is one of the most discussed aspects of digital imaging. Although this quantum efficiency level, quantitatively speaking, isn't as impressive as the technological records achieved in trying to reach absolute zero, semiconductor process sizes approaching the size of a handful of atoms, or even something like Vantablack, it's significant from a photographic standpoint.

There is probably around another stop of sensitivity available by replacing the color mosaic layer used in sensors with a 3-chip array of the sort sometimes used in video cameras. That approach carries enough extra cost and complexity that it will probably be a last resort after signal processing approaches have been exhausted.

Resolution wise, there's still plenty of room. Although diffraction limited optics exist - something which also amazes me - most lenses do not exhibit that high performance wide open. But as manufacturing improves and exotic lens shapes - such as those used in the Nokia 808 pictured below - become feasible, a diffraction limited f/2.8 lens can resolve nearly 400 megapixels on full frame; the current full frame megapixel champ maxes out at around 50. Looking at Sony's sensors, the IMX318 has the finest pixel pitch at a computationally convenient 1.0 micrometers, which implies a full frame scaling of 864 megapixels. If per-pixel full color accuracy is desired, 1600 "Bayer megapixels" will be required to approximate 400 full color megapixels. Given the state of semiconductor manufacturing, this is definitely within the realm of possibility. In that sense, the effects of the end of Moore's law probably lie beyond the diffraction wall.

NO SPHERICS ALLOWED! (Image from test-mobile.fr)
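Here's the back-of-the-envelope version of the "under a stop left" and "nearly 400 megapixels" claims, assuming green light (~550nm), Nyquist sampling at the f/2.8 diffraction cutoff, and a 36x24mm sensor. Treat it as an order-of-magnitude sketch rather than optics homework.

```python
import math

# Assumptions: green light (550nm), an f/2.8 lens at its diffraction cutoff,
# Nyquist sampling (pixel pitch = half the cutoff period), 36 x 24mm sensor.

# 1) High-ISO headroom left at ~60% quantum efficiency
qe = 0.60
print(f"Stops left to a perfect sensor: {math.log2(1 / qe):.2f}")   # ~0.74, i.e. under a stop

# 2) Diffraction-limited pixel counts at f/2.8
wavelength_mm = 550e-6
f_number = 2.8
cutoff_period_mm = wavelength_mm * f_number     # smallest resolvable period
pixel_pitch_mm = cutoff_period_mm / 2           # Nyquist
pixels = (36 / pixel_pitch_mm) * (24 / pixel_pitch_mm)
print(f"'Bayer' megapixels at the diffraction limit: {pixels / 1e6:.0f}")    # ~1450
print(f"Full-color megapixels (divide by ~4):        {pixels / 4e6:.0f}")    # ~360, i.e. 'nearly 400'
```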

Dynamic range capability is related to signal to noise ratios. Quantum efficiency has a role in improving the signal quality while improvements to read noise can boost SNR in tandem. But current technologies impose another limitation: full well capacity. Full well capacity tends to be lower with smaller photosites. However, multiple photosites with smaller full well capacities are equivalent to a single photosite of the same total area with its larger full well capacity. According to DPReview, read noise is the only reason that sensors using larger photosites marginally outperform similar sensors with smaller photosites. If photon counting technology can be developed, read noise is not only essentially eliminated, but full well capacity is no longer an issue. This implies enormously higher dynamic range capability.
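A toy example of that photosite equivalence, with made-up numbers: four small wells summed behave like one big well for both capacity and shot noise, leaving read noise as the only per-pixel penalty.

```python
import math

# Toy illustration: four small pixels summed behave like one big pixel of the
# same total area. All numbers are made up for illustration.
small_full_well = 15_000          # electrons per small pixel
big_full_well = 4 * small_full_well

signal = 40_000                   # electrons collected over the big pixel's area
read_noise = 3                    # electrons per read, per pixel

# Shot noise is sqrt(signal) either way; read noise is paid once for the big
# pixel but four times (in quadrature) for the binned small ones.
snr_big = signal / math.sqrt(signal + read_noise**2)
snr_binned = signal / math.sqrt(signal + 4 * read_noise**2)
print(f"SNR, one large photosite:    {snr_big:.1f}")
print(f"SNR, four binned photosites: {snr_binned:.1f}")   # nearly identical; read noise is the only gap
```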

All of these improvements, save 3CCD, deal with a single sensor. But huge gains in image quality can be obtained by using multiple sensors for 3D imaging or processing tricks like the multiple lens/sensor gimmick approach of the L16 or the Leica-branded Huawei P9. So while the high ISO sensor performance race is finished, there are plenty of events left.

* If Sony keeps its June release cadence, maybe the Alpha a6400 or RX100V will have 4K60. The GH5 might but I'm hoping the feature hits smartphones first.

QuickSync, a broken dream

The reason I'm writing this is my experience helping people stream and record on computers using integrated graphics. In my mind, I figured QuickSync was similar to AMD's VCE or NVIDIA's NVENC in that using hardware accelerated encoding would impose only a minor performance penalty. In practice, Intel's implementation imposes a huge penalty if you are playing games on Intel's GPUs.

QuickSync is Intel's technology for encoding video. The idea is that their CPUs would have a bit of circuitry dedicated to accelerating video encoding and free the rest of the CPU to do other things e.g. game physics. It works but there are some major caveats:

QuickSync depends on the processor's integrated GPU (iGPU), which means that if the iGPU is being used, QuickSync performance falls. This means that the feature isn't really very useful for people playing somewhat demanding games on iGPUs - the area where hardware accelerated encoding would help the most. It's fine for encoding video or doing basic screencasts, but it simply shares too many resources with the rest of the integrated GPU to work effectively while playing 3D games.

Now I'm not sure how much it would have cost to implement QuickSync as fully discrete hardware; Intel probably reasoned that streaming and recording gameplay is an upper tier feature not needed by typical iGPU users, and that users who do stream and record games would probably have a discrete GPU anyway, but it is unfortunate. QuickSync has been around for several years, and it's understandable that its earliest forms, which were targeted toward video conversion, might have left gaming uses as an afterthought. But given the huge growth in streaming and gameplay recording versus realtime movie file encoding, it's surprising that QuickSync has not adapted.

QuickSync does work on iGPU systems, but performance is inversely proportional to gameplay demands, which makes the whole recording and streaming process very inconsistent. This performance relationship also exists with a dedicated GPU if software x264 encoding is used*, though this is rarely a problem on quad core+ systems.

So my recommendation for potential streamers and people hoping to record gameplay is to always have a discrete video card, even if it is hardly faster than the iGPU. This ensures that all iGPU resources are free for QuickSync. In that case, the iGPU is being used only for encoding and so it functions as truly dedicated hardware.

But for integrated graphics users, x264 on quad core and higher systems is typically going to be better than QuickSync for streaming, while the very low compression, i.e., low processing load, codecs used by FRAPS and DXTORY are going to be the best for recording. Of course the file sizes are relatively enormous, but there ain't no such thing as a free lunch.
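To make the x264 versus QuickSync choice concrete, here are rough ffmpeg invocations for the two approaches (screen capture via gdigrab on Windows). The bitrate, framerate, and filenames are placeholders, and the h264_qsv encoder is only present if your ffmpeg build and Intel drivers support it.

```python
import subprocess

# Rough illustration of the trade-off above using ffmpeg (Windows gdigrab
# screen capture). Bitrate, framerate, and filenames are placeholders, and
# h264_qsv is only available if your ffmpeg build and Intel drivers support it.

common = ["ffmpeg", "-f", "gdigrab", "-framerate", "60", "-i", "desktop"]

# Software x264 on a fast preset: heavier CPU load, better quality per bit.
x264 = common + ["-c:v", "libx264", "-preset", "veryfast", "-b:v", "8000k", "x264_capture.mp4"]

# QuickSync hardware encoder: light CPU load, but it shares the iGPU with any
# game rendering on integrated graphics, which is the whole problem.
qsv = common + ["-c:v", "h264_qsv", "-b:v", "8000k", "qsv_capture.mp4"]

subprocess.run(x264)   # or subprocess.run(qsv); stop the capture with 'q' or Ctrl+C
```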

It would be nice to set aside iGPU resources for QuickSync and adjust the game's graphical settings after the fact.

* This is why I prefer to set processor affinities for setups without a dedicated streaming PC: you can set aside appropriate resources for a given quality level and know that no matter what happens in game, streaming quality remains consistent. It is a kind of virtual streaming PC.

Wednesday, May 18, 2016

The 2010s Aesthetic

Where did this come from? In video terms I'm talking about the slow motion, unrealistically graded, low depth of field, jump cut, steadicam look. Time lapses with sliders, drone shots. Add affected piano, guitar, and/or an indie female vocal. Is this touchy-feely stuff because of Apple? I'll bet it is. Or maybe Instagram. Is your photo uninteresting but you want others to think it's meaningful? Add some vignetting, apply a film look, and maybe make it black and white. But I can't lie, it works.

After years of this though, one wonders what the point is. There is no point, it's just noise. And this noise engulfs the only meaningful application of video and photography - to help tell a story. Not tell a story, but help. Otherwise viewers are making up meaning arbitrarily or in uselessly vague terms. Like there might be a photo of a worn out door in India. Oooh the stories it could tell! is the impression the photographer wants to give but I can't help but think "A picture is not literally worth a thousand words dummy, tell me what's going on".  No one on the planet, without the proper context, would ever hear The Great Gate at Kiev and deduce that the piece had anything to do with Kiev, a gate, or pictures at an exhibition. That's why contemporary classical music is so sterile whereas soundtracks, which help tell a story, aren't. It's why a straight up gallery of "award winning" photos is inferior to the photos in a National Geographic article.

The indie videographer and photographer crowd rarely tell stories, and when they do, it's often subsumed under fancy technique and practiced faux-earnest narration. People in the future will look back, presumably, with the same bemused eye we turn on kaleidoscope filter photos from the 70s. The photos they will be interested in, however, are the mundane slice-of-life shots, which I believe are the real draw of Bresson's or Weegee's photos.

Or maybe I've unfairly implicated the purveyors of the 2010s aesthetic in a grand Sokal-esque conspiracy and I simply don't get it.

Saturday, May 14, 2016

A closer look at flight costs

YouTube user Wendoverproductions covers the costs associated with plane tickets in good detail, although there's a lot of vocal uptalk in the presentation.


Looking at things from the passenger side, I think the earlier case I made suggesting that shipping from China is mostly subsidized might be wrong. If airlines are profitable with $80 ticket prices hauling 150lb people across the country, and fuel costs are a small fraction of actual expenses, then sub-$1/lb shipping rates for freight seem reasonable.
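The per-pound arithmetic is almost too simple to write down, but here it is:

```python
# The per-pound arithmetic behind the claim: a profitable $80 cross-country
# ticket for a ~150lb passenger works out to roughly $0.53/lb, so sub-$1/lb
# air freight rates don't require a subsidy to make sense.
ticket_price = 80       # dollars
passenger_weight = 150  # pounds (ignoring luggage, seats, and service overhead)

print(f"${ticket_price / passenger_weight:.2f} per pound")
```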

Saturday, May 7, 2016

Rockefeller problems

Just thinking about my previous entry on whether things are getting better or worse for the American middle class on an absolute scale ...

One of the themes the "things are worse" crowd brings up as evidence of the absolute decline of the middle class is the shift to dual income households. A comfortable family lifestyle with a single breadwinner was typical in the 50s. The Atlantic suggests that a comfortable middle class lifestyle today requires an income of over $130,000 - definitely not typical. That's not even typical for two income households.  So in that important sense, things have gotten worse.

Technological progress hides this decline. Without the increased productivity and technological progress, there wouldn't be any argument; a situation where one person was able to provide well for a family of four changing to one where two people are even less able to provide for that family is a disaster. But if workers are providing more goods and services than ever before, then one worker should be able to provide more for a family.

Intuitively, as material abundance grows, there is less need to work and greater financial security. In other words, if we had Star Trek levels of abundance, barely anyone needs to work whereas in times of extreme scarcity everyone is always working. Before the Industrial Age, for example, most people - children included - were working most of the day on farms.*

And yet here we are with increasing productivity but also having to work more just to maintain our living standard. It's something analogous to stagflation (what an ugly word) although it is more than an analogy since both problems share many of the same causes. Maybe I've written about them before but I'm sure I'll write about them again.

* That's one thing that history books don't really cover. There might be a small section on how peasants lived but the majority of the rest: battles, cities, kings, queens, inventors, philosophers, etc., represent only a tiny portion of the human experience. There's only so much to be written (but much more to be said) about toiling in fields and simple family life I guess. And in truth, the typical first worlder's life has more in common with city life and royalty than subsistence farm living.

Someone out there wondered whether it would be preferable to be a Rockefeller in his heyday or a regular American today. Professor Don Boudreaux and the modern camp point to the amazing technologies and conveniences that are within the reach of typical Americans that even Rockefeller simply had no access to: advanced medicine, supermarkets, cheap air travel, better cars, instantaneous communication, access to incomprehensibly greater information and entertainment than ever before, etc. The Rockefeller camp, like Peter Schiff, points out that Rockefeller lived in a modern enough era that travel, entertainment, and access to information were plentiful enough. Having Netflix is nice, but not having to worry about financial security is even nicer.

If you are materialistic, Boudreaux is absolutely right. Materialistic has a negative connotation, but I mean a strong, as opposed to a more indifferent, preference for goods and services that are more varied, higher quality, and cheaper. It's this type of materialistic thinking that we have to thank for the standard of living we enjoy today. If you are more worried about status and stability, Schiff is right. Perhaps related: Boudreaux hates Trump while Schiff is more sympathetic (though not particularly supportive).