NVIDIA Tegra K1 Preview & Architecture Analysis
by Brian Klug & Anand Lal Shimpi on January 6, 2014 6:31 AM EST
Tegra K1 ISP & Video
NVIDIA’s Tegra K1 SoC also makes some dramatic improvements on the ISP side. We saw SoCs start arriving with two ISPs sometime in 2013, which allowed OEMs to deliver a host of new imaging experiences, like shot in shot video and simultaneous use of both front and rear cameras. With Tegra K1, NVIDIA is not only moving to two ISPs, but it’s also making ISP more of a first class citizen.
For those not familiar, the ISP (Image Signal Processor) handles the imaging pipeline for still photos and video, and performs tasks like Bayer to RGB conversion (demosaicing), 3A (Autofocus, Auto Exposure, Auto White Balance), noise reduction, lens correction, and so on. Although NVIDIA has always included an ISP onboard, I couldn’t shake the feeling that still imaging performance could’ve been better, especially in the few cases that allowed direct comparison (HTC One X). With Tegra K1, there’s more die area dedicated to ISP than in the past, and there are two of them to support the kind of dual camera applications that have quickly become popular.
Tegra K1 includes the third generation of NVIDIA’s ISP, capable of processing 600 MP/s on each ISP with 14-bit input, and support for up to 100 MP cameras. There are two of them, so NVIDIA quotes the total pixel throughput as up to 1.2 GP/s. This is a dramatic increase over Tegra 4, which supported up to 400 MP/s at 10 bits per pixel. In addition, the K1’s ISP now supports up to 4096 focus points, a 64x64 array, for its autofocus routine. The ISP also has better noise reduction and local tone mapping, a feature we’ve seen become popular for combining parts of images and recovering some of the dynamic range lost with ever shrinking pixel sizes.
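The quoted ISP numbers are easy to sanity check with a bit of arithmetic. The sketch below is only that: the constants come from the figures above, while the function names and the 13 MP sensor in the usage note are hypothetical, and the frame rate it returns is a theoretical ceiling that ignores sensor readout, blanking, and memory bandwidth.

```python
# Figures quoted in the article.
PER_ISP_MP_S = 600    # megapixels per second, per ISP (Tegra K1)
ISP_COUNT = 2         # Tegra K1 has two ISPs
TEGRA4_MP_S = 400     # megapixels per second (Tegra 4)
AF_GRID = (64, 64)    # autofocus grid dimensions


def total_throughput_mp_s(per_isp=PER_ISP_MP_S, count=ISP_COUNT):
    """Aggregate pixel throughput across all ISPs, in MP/s."""
    return per_isp * count


def focus_points(grid=AF_GRID):
    """Number of autofocus points in a rows x columns grid."""
    return grid[0] * grid[1]


def max_frames_per_second(sensor_mp, throughput_mp_s):
    """Theoretical ceiling on full-resolution frames per second.

    Assumption: the ISP is the only bottleneck, which will not
    hold on real hardware.
    """
    return throughput_mp_s / sensor_mp


if __name__ == "__main__":
    print(total_throughput_mp_s())                # 1200 MP/s, i.e. 1.2 GP/s
    print(focus_points())                         # 4096
    print(total_throughput_mp_s() / TEGRA4_MP_S)  # 3.0
```

As a usage example, `max_frames_per_second(13, PER_ISP_MP_S)` puts a hypothetical 13 MP sensor on a single ISP at a ceiling of roughly 46 full-resolution frames per second.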
Tegra K1 retains compatibility with the Chimera 1.0 features that we just saw in the Tegra Note 7, like object tracking, always-on HDR, slow motion capture, and full resolution burst, and adds more. NVIDIA has kept the Chimera brand for the K1 SoC, calling it Chimera 2.0, and envisions this architecture enabling things like better temporal pixel binning (combining 8 exposures from the CMOS to drive noise down further), faster panorama, video stabilization, and even better live preview with effects applied. The high-level design of Chimera seems to be the same: kernels that run either on the CPU or on the GPU (ostensibly in CUDA this time), before or after the ISP, and in a variety of image spaces (Bayer or RGB depending).
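To see why combining 8 exposures helps, here is a minimal sketch of the underlying statistics. It is not NVIDIA's algorithm: it swaps in plain frame averaging, and it assumes a static scene, perfectly aligned frames, and independent Gaussian noise per pixel. All function names and values are made up for illustration. Under those assumptions, averaging N frames should cut the noise standard deviation by about the square root of N.

```python
import random
import statistics


def capture_exposure(true_value, noise_sigma, n_pixels, rng):
    """Simulate one noisy exposure of a flat gray patch (hypothetical sensor model)."""
    return [true_value + rng.gauss(0.0, noise_sigma) for _ in range(n_pixels)]


def combine_exposures(exposures):
    """Average co-located pixels across exposures (plain temporal averaging)."""
    n = len(exposures)
    return [sum(pixels) / n for pixels in zip(*exposures)]


def noise_reduction_factor(n_exposures=8, n_pixels=20000, seed=0):
    """Ratio of single-frame noise to combined-frame noise."""
    rng = random.Random(seed)
    frames = [capture_exposure(100.0, 10.0, n_pixels, rng)
              for _ in range(n_exposures)]
    single_sigma = statistics.pstdev(frames[0])
    combined_sigma = statistics.pstdev(combine_exposures(frames))
    return single_sigma / combined_sigma


if __name__ == "__main__":
    # Should land near sqrt(8), roughly 2.83, for 8 exposures.
    print(noise_reduction_factor())
```

Real pipelines have to deal with motion between frames and with noise that is not Gaussian, which is where the extra processing headroom would come in.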
On the video side, Tegra K1 continues to support 2160p30 (4K or UHD video at 30FPS) encode and decode. Broken down another way, that means H.264 High Profile Level 5.1 decode and 4K H.264 High Profile 4.2 encode. The fact that there’s a Kepler next door made me suspect that NVENC was used for most of these tasks, but it turns out that NVIDIA still has discrete blocks for video encode of H.264, VP8, VC1, and others. These are the same video encode and decode blocks that were used in Tegra 4, but with some further optimizations for power and efficiency. The Tegra K1 platform includes support for H.265 video decode as well, but this isn’t fully accelerated in hardware; rather, the decode is split across NVENC and CPU.
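For a sense of scale on the decode side, the sketch below checks 2160p30 against the H.264 Level 5.1 limits. The two limit values (maximum frame size in macroblocks and maximum macroblocks per second) are taken from the level table in the H.264 specification as I recall it, so treat them as assumptions to verify against the standard. The helper names are hypothetical, and the check ignores the bitrate and buffer limits that also apply.

```python
import math

# H.264 Level 5.1 limits (assumed from the spec's level table).
LEVEL_5_1 = {"max_frame_mbs": 36864, "max_mb_per_sec": 983040}

MACROBLOCK_SIZE = 16  # H.264 macroblocks are 16x16 luma samples


def macroblocks(width, height):
    """Macroblocks per frame, rounding partial blocks up."""
    return (math.ceil(width / MACROBLOCK_SIZE)
            * math.ceil(height / MACROBLOCK_SIZE))


def fits_level(width, height, fps, level):
    """True if frame size and macroblock rate are within the level's limits."""
    frame_mbs = macroblocks(width, height)
    return (frame_mbs <= level["max_frame_mbs"]
            and frame_mbs * fps <= level["max_mb_per_sec"])


if __name__ == "__main__":
    print(3840 * 2160 * 30)                       # pixels per second at 2160p30
    print(macroblocks(3840, 2160))                # macroblocks per UHD frame
    print(fits_level(3840, 2160, 30, LEVEL_5_1))  # 2160p30 against Level 5.1
    print(fits_level(3840, 2160, 60, LEVEL_5_1))  # 2160p60 against Level 5.1
```

With these limits, 2160p30 works out to 972,000 macroblocks per second, just under the Level 5.1 ceiling, while 2160p60 does not fit.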
NVIDIA showed off a K1 reference board doing 4Kp30 H.264 decode on an attached display, and I didn’t notice any dropped frames. Of course that’s a given considering we saw the same thing on Tegra 4, but it’s still worth noting that the SoC is capable of driving 4K/UHD displays over eDP 1.4, LVDS and HDMI 1.4b.
The full GPIO breakdown for Tegra K1 includes essentially all the requisite connectivity you’d expect for a mobile SoC. For USB there are three USB 2.0 ports and two USB 3.0 ports. For storage Tegra K1 supports eMMC up to version 4.5.1, and there’s PCIe x4 which can be configured
88 Comments
davidjin - Thursday, January 23, 2014
Not true. v8 introduces new MMU design and page table format and other enhanced features, which are not readily compatible to a straightforward "re-compiled" kernel.
By App-wise, you are right. Re-compilation will almost do the trick. However, without a nice kernel, how do you run the Apps?
deltatux - Saturday, January 25, 2014
This is no different than the AMD64 implementation, the Linux kernel will gain a few improvements to make it work perfectly on that specific architecture extension, the same is likely going to happen to aarch64. That means, there's no reason to make a 32-bit version of ARMv8. ARMv8 will be able to encompass and execute both 64-bit and 32-bit software.
As shown by NVIDIA's own tech demo, Android is already 64-bit capable, thus, no need to continually run in 32-bit mode.
Laxaa - Monday, January 6, 2014
Kepler K1 with Denver in Surface 3?
nafhan - Monday, January 6, 2014
I'd be more interested in a Kepler Steam Box.
B3an - Monday, January 6, 2014
... You can already get Kepler Steam boxes. They use desktop Kepler GPU's.
Surface 3 with K1 would be much more interesting, and unlike Steam OS, actually useful and worthwhile.
Alexvrb - Monday, January 6, 2014
Agreed. The Surface 2 is a well done piece of hardware. It just needs some real muscle now. As the article says, it would be a good opportunity to showcase X360-level games like GTA V.
name99 - Monday, January 6, 2014
Right. That's the reason Surface 2 isn't selling, it doesn't have "real muscle". Hell, put a POWER8 CPU in there with 144GB and some 10GbE connectors and you'll have a product that takes over the world...
stingerman - Tuesday, January 7, 2014
I doubt Microsoft will sell a device that competes against the 360... It would signal to investors what they already suspect: Gaming Consoles are end of line. Problem is that Apple is now in a better position...
klmx - Monday, January 6, 2014
I love how Nvidia's marketing department is selling the dual Denver cores as "supercores"
MonkeyPaw - Monday, January 6, 2014
They are at least extra big compared to an A15, so it's at least accurate by a size comparison. ;)