CyberLink Community Forum
where the experts meet
optodata
Senior Contributor Location: California, USA Joined: Sep 16, 2011 16:04 Messages: 8630 Offline
You can see that on the GPU-Z screen at about 8 seconds in. I just noticed that my screen recordings are suffering from the "PD thinks these source clips are interlaced" glitch.

I don't think there's enough info in them, but I can produce and upload them again in the morning if that would be helpful. It's almost 01:00 here now.

YouTube/optodata


DS365 | Win11 Pro | Ryzen 9 3950X | RTX 4070 Ti | 32GB RAM | 10TB SSDs | 5K+4K HDR monitors

Canon Vixia GX10 (4K 60p) | HF G30 (HD 60p) | Yi Action+ 4K | 360Fly 4K 360°
Julien Pierre
Contributor Joined: Apr 14, 2011 01:34 Messages: 476 Offline
Quote: OK, back to the testing.

I tried the Google docs link again from my apt and it worked normally. It took about 15 seconds to download the zipped project and I was able to produce to "H.264 M2TS, AVC 4K 3840x2160/30p 50 Mbps, with the hardware encoder" in 4:54 on my system using PD14 v2109 and the newest nVidia driver 358.50.

My 780Ti card has 3072MB on board and GPU-Z showed roughly 2800MB in use. The GPU load was typically only at 30% while my CPU was pegged at 100% the entire time. The produced video looks and plays normally as far as I can tell, although I only have 2 HD monitors and can't watch it in 4k.

I ran the same test using the nVidia 337.88 driver, which usually gives me the fastest results - but it actually took 5:11 to produce, which is 17 seconds longer! The GPU memory was pretty much fixed at 2003MB the entire time, and that was noticeably lower than with the 358.50 driver. In this case, it seems the higher the GPU memory usage, the quicker the video is produced.





Thanks for trying. Did you look at anything else in GPU-Z (GPU load, memory controller, video engine, bus load), by any chance?
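For capturing those over a whole render: GPU-Z can log all of its sensors to a file from the Sensors tab, or a small script polling nvidia-smi gives similar numbers. A rough Python sketch of what I mean (just a sketch - it assumes nvidia-smi is on the PATH, and the 1-second interval and field list are arbitrary choices):

import csv, subprocess, time

# Poll nvidia-smi once per second and append the readings to a CSV.
# Note: utilization.memory is the memory-controller load, not VRAM in use;
# memory.used is the VRAM figure. Stop logging with Ctrl+C.
FIELDS = "utilization.gpu,utilization.memory,memory.used,memory.total"

with open("gpu_log.csv", "w", newline="") as f:
    log = csv.writer(f)
    log.writerow(["time"] + FIELDS.split(","))
    while True:
        out = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=" + FIELDS,
             "--format=csv,noheader,nounits"]).decode()
        log.writerow([time.strftime("%H:%M:%S")] + out.strip().split(", "))
        f.flush()
        time.sleep(1)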

It's become quite clear to me that the NewBlue titler is the main culprit here, not so much the various versions of the nVidia drivers.

Jeff (JL_JL) was very helpful in creating a version of the project that uses the default title object instead of the NewBlue title.

I just did a whole bunch of tests and recorded tons of info during the rendering. Another sleepless night for me.

I will post the results in the thread about NewBlue at http://forum.cyberlink.com/forum/posts/list/45843.page

What I did note was that the baseline GPU RAM usage after starting Windows with just PD running was fairly different between the different versions of the nVidia drivers, sometimes by more than 200MB. The 337.88 driver seems to have a lower baseline memory usage, but not better rendering performance in most cases, except in the cases where PD happens to run out of memory when using NewBlue.

Also, merely having a few browser windows open increases the GPU RAM usage significantly, especially since I'm running Aero. While re-running the tests without any browser running, I didn't see a "GPU out of memory" error with any of the drivers, even with NewBlue.

I also tested which AVC H.264 formats PD14 would let me use the HE on, and I was able to select any progressive M2TS format and use HE. All HD, 2K and 4K progressive profiles could use HE.

Quote: After doing one more DDU driver uninstall and reinstall, I'm seeing the exact same choices you're being offered - all progressive choices OK for hardware encode in M2TS, and in MP4.

Not quite sure why it was glitchy yesterday when I took the capture. Go figure.

So at least in my case, the newer nVidia driver is better (faster), although neither showed the "no video" issue with the RTM version of PD14.


FYI, that "audio-only" bug was only occurring with later versions of the drivers, like the 350-355 series. If you weren't on one of these, you wouldn't have seen that bug. It might also have been specific to the 750Ti, while you have a 780Ti.

AFAIK, 347.88 was OK, producing both audio and video in the M2TS. The latest PD14 2130 patch seems to have fixed it for all the driver versions I tried, but I didn't retest all of them.

BTW, according to my data, with driver 337.88 doing a hardware encode, my rendering time was 276 seconds, i.e. 4:36, somewhat better than your 5:11. This is test #7 in the table I will post in the other thread.

With driver 347.88, the rendering time was 250 seconds, or 4:10, somewhat better still. This is test #14 in the same table.

With driver 358.50, the rendering time was 246 seconds, i.e. 4:06, compared with your 4:54. This will be test #19 in the same table.
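(For anyone double-checking the mm:ss conversions and the relative speeds above, here's a throwaway Python snippet with those three times hard-coded:)

# Convert my rendering times to mm:ss and compare against the 337.88 baseline.
def mmss(seconds):
    return "%d:%02d" % divmod(seconds, 60)

runs = {"337.88": 276, "347.88": 250, "358.50": 246}  # seconds, from the tests above
base = runs["337.88"]
for driver, secs in sorted(runs.items()):
    print("%s: %s (%.1f%% faster than 337.88)"
          % (driver, mmss(secs), 100.0 * (base - secs) / base))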

So, at least for me, the newest drivers seem to have performed better for this project, when nothing else was open on the system.

My CPU was also not pegged during any of my rendering tests for the project that included NewBlue. It never went beyond 90%.

And in some (anomalous) cases, it dropped to the 30% range.

What I'm wondering about here is why you are reporting slower times than I am, given your 780Ti with 3GB vs. my 750Ti with 2GB.

Also, your CPU is an Intel Haswell i7-4770K, which cost much more (up to twice as much on Newegg, when that Intel part was still available), was released in June 2013, and which I thought would perform better than my FX-8350, released in October 2012, which I acquired right upon release. All the independent benchmark reviews I read pointed to the Intel chip being significantly faster, and the FX-8350 being closer to the i5 performance of the time.

However, I did spend a lot of time stabilizing the 4.6 GHz OC on my FX CPU, so perhaps I really did manage to get a chip that outperforms your Intel chip significantly. But it does surprise me to see such a difference, given that your chip includes QuickSync, while mine does not, and CyberLink touts its multi-GPGPU technology. Perhaps that's what explains the higher CPU utilization on your machine, and maybe the technology isn't all it's touted to be. I was planning on perhaps getting an Intel chip in the future for QuickSync, but maybe it doesn't make sense to do it just for that reason, and I should wait until an upgrade would at least double my number of cores.

Or maybe CyberLink improved the performance between the RTM build of PD14 and patch 2130, and forgot to tell anyone.

Julien

This message was edited 2 times. Last update was at Oct 10. 2015 11:34

MSI X99A Raider
Intel i7-5820k @ 4.4 GHz
32GB DDR4 RAM
Gigabyte nVidia GTX 960 4GB
480 GB Patriot Ignite SSD (boot)
2 x 480 GB Sandisk Ultra II SSD (striped)
6 x 1 TB Samsung 860 SSD (striped)

2 x LG 32UD59-B 32" 4K
Asus PB238 23" HD (portrait)
optodata
Senior Contributor Location: California, USA Joined: Sep 16, 2011 16:04 Messages: 8630 Offline
I was surprised to see slower numbers with my CPU as well, and even though I have both GPUs enabled, the usage of the HD4600 was 0 when I was producing these projects so PD wasn't using MGPGPU on this project.

I don't remember ever seeing CPU usage at 100% when producing with HE, but then I rarely look at those details unless I'm troubleshooting something.

Anyway, the funny thing with Win10 is that it wants to keep everything up-to-date, and during one of the 2 reboots I did after testing your project it installed the nVidia 353.54 driver. Not the newest or second-newest, but the pre-RTM Win10 version from 7/15.

Here's the thing - when I tested your project again with the exact same settings, the production finished in 2:21! Still no activity on the HD4600, and the CPU usage was down around 70%. That's much more like the numbers I'd expect.

This message was edited 1 time. Last update was at Oct 10. 2015 16:16



Julien Pierre
Contributor Joined: Apr 14, 2011 01:34 Messages: 476 Offline
Quote:
Here's the thing - when I tested your project again with the exact same settings, the production finished in 2:21! Still no activity on the HD4600, and the CPU usage was down around 70%. That's much more like the numbers I'd expect.



Dammit! Your previous post almost cured my upgraditis for a while!

What exact version of the drivers were you using when this finished in 2:21?

That is a good bit faster than my rendering, almost twice as fast as my 4:36. A speed doubling is generally a convincing point for me to consider upgrading my system.
optodata
Senior Contributor Location: California, USA Joined: Sep 16, 2011 16:04 Messages: 8630 Offline
That one single test was the only one that had such a short production time. I made almost two dozen other tests, with the 337.88, 353.62, 355.60, 355.82, 355.98 and 358.50 drivers, and with one exception they all fell within a narrow range of 2:29 - 2:41. All were far better than my original two tests, but none were significantly faster than any other. (If you're interested, the fastest times were with 355.60 and 358.50.)

The one exception was the 337.88 driver, which took 3:10, so it was bumped off of my go-to list.
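With nearly two dozen runs it was easy to lose track, so a quick tally like this helps. Just a sketch - the lists only hold the figures quoted above as placeholders, not my full log:

from statistics import mean

# Production times in seconds, grouped by driver (2:29 = 149 s, 3:10 = 190 s).
runs = {
    "337.88": [190],   # the lone outlier
    "355.60": [149],
    "358.50": [149],
}
for driver, times in sorted(runs.items()):
    print("%s: best %d s, mean %.0f s over %d run(s)"
          % (driver, min(times), mean(times), len(times)))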

I also specifically tested PD14 v2109 and 2130 with the 358.50 driver and found only a 1 second difference (2:28 vs. 2:29, respectively) so whatever happened when you saw the big increase was probably unrelated to the update.

I then tried connecting one monitor to my MB's Thunderbolt 2 connector to activate the HD 4600 and use the MGPGPU mode, and even with the 780Ti sharing half the GPU load it still took 10 extra seconds to produce, and the same one-second difference persisted between the 2109 and 2130 versions.

Finally, I kept the system in MGPGPU mode and left all the HA checkmarks enabled, but I produced without the Fast video render box checked (so mostly CPU) and got the finished video in 3:09. I then switched my monitor back to the 780Ti and used CPU encoding again, and this time it took only 2:49, or just 20 seconds longer than using HE.

As for upgraditis on my end, I'm waiting for the 8-core Skylake and supporting hardware to come out, then I'll pull the trigger!

This message was edited 2 times. Last update was at Oct 10. 2015 22:49



Julien Pierre
Contributor Joined: Apr 14, 2011 01:34 Messages: 476 Offline
Quote:
I then tried connecting one monitor to my MB's Thunderbolt 2 connector to activate the HD 4600 and use the MGPGPU mode, and even with the 780Ti sharing half the GPU load it still took 10 extra seconds to produce, and the same one-second difference persisted between the 2109 and 2130 versions.



Why does any monitor need to be connected to activate MGPGPU mode?


Quote: As for upgraditis on my end, I'm waiting for the 8-core Skylake and supporting hardware to come out, then I'll pull the trigger!


I really haven't followed the Intel stuff so all these names don't mean much to me.

It looks like the 8-core Haswell is a $1000+ chip, and there is no 8-core Broadwell.

If the Skylake is also a $1000+ chip, I will never go for it.

Intel seems to want to change sockets every other generation, or more often; I don't understand it. Even Cannonlake might not use the same LGA 1151 socket as Skylake. And of course both Skylake and Cannonlake require DDR4.

I suppose some change is better than no change at all, with AMD having produced practically no new useful FX chip in the last 3 years, no new corresponding chipset, etc. No USB 3.1, no PCIe 3.0, no SATA Express on AMD. It's getting a bit long in the tooth, so I know my next upgrade won't be an AMD. I just don't know when it will be.
optodata
Senior Contributor Location: California, USA Joined: Sep 16, 2011 16:04 Messages: 8630 Offline
Quote: Why does any monitor need to be connected to activate MGPGPU mode?
That's the only way I've found to bring the built-in graphics into action alongside the nVidia card. I normally have two monitors plugged into my 780Ti, and even with the HD 4600 enabled in the UEFI and Device Manager it sat there idly during my tests with your project.
That's the only way I've found to bring the built-in graphics into action alongside the nVidia card. I normally have two monitors plugged into my 780Ti, and even with the HD 4600 enabled in the UEFI and Device Manager it sat there idly during my tests with your project.

By moving one monitor to the MB connector, I'm requiring the HD4600 to drive it and I then get the Intel QuickSync option on PD's Produce tab rather than Hardware video encoder.
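If you want to verify which adapters Windows considers active, here's one quick check (a sketch using the built-in wmic; an adapter that isn't driving a monitor should report a blank resolution):

import subprocess

# List the video adapters Windows sees; CurrentHorizontalResolution is
# typically blank for an adapter with no monitor attached.
print(subprocess.check_output(
    ["wmic", "path", "win32_VideoController",
     "get", "Name,Status,CurrentHorizontalResolution"]).decode(errors="ignore"))
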
Quote: ...If the Skylake is also a $1000+ chip, I will never go for it.

Intel seems to want to change sockets every other generation, or more often; I don't understand it. Even Cannonlake might not use the same LGA 1151 socket as Skylake. And of course both Skylake and Cannonlake require DDR4.


The changes these days are too big to just plug in a new CPU and have everything run faster. Even if it were pin-compatible, the supporting chips and motherboard wouldn't be able to keep up, and could never work with all the fast new peripherals that are coming with USB 3.1/Type-C and Thunderbolt 3 - that's 40Gb/sec!

New CPU = new MB + new RAM and maybe a new GPU = $$$. That's pretty much the path we're on if you're looking for top performance.

Julien Pierre
Contributor Joined: Apr 14, 2011 01:34 Messages: 476 Offline
Quote:
The changes these days are too big to just plug in a new CPU and have everything run faster. Even if it were pin-compatible, the supporting chips and motherboard wouldn't be able to keep up, and could never work with all the fast new peripherals that are coming with USB 3.1/Type-C and Thunderbolt 3 - that's 40Gb/sec!



USB 3.1 is just 10 Gbps, with a small b - something that even a PCIe 2.0 x2 add-on card can support. I could add one to my computer if I wanted, without changing the motherboard.

Thunderbolt 3 would require PCIe 2.0 x8 or PCIe 3.0 x4. Still not something compelling enough to change the motherboard for, IMO, if you have enough PCIe lanes.
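(To show the lane math behind that: usable per-lane throughput is about 4 Gbps on PCIe 2.0 after 8b/10b encoding and about 7.9 Gbps on PCIe 3.0 with 128b/130b. A quick Python check:)

# Usable PCIe bandwidth per lane, after line-code overhead.
LANE_GBPS = {"2.0": 5.0 * 8 / 10,      # 5 GT/s, 8b/10b    -> 4.0 Gbps/lane
             "3.0": 8.0 * 128 / 130}   # 8 GT/s, 128b/130b -> ~7.9 Gbps/lane

def slot_gbps(gen, lanes):
    return LANE_GBPS[gen] * lanes

for gen, lanes in [("2.0", 2), ("2.0", 8), ("3.0", 4)]:
    print("PCIe %s x%d: %.1f Gbps" % (gen, lanes, slot_gbps(gen, lanes)))
# PCIe 2.0 x2: 8.0 Gbps  - in the ballpark of USB 3.1's 10 Gbps signaling
# PCIe 2.0 x8: 32.0 Gbps - enough PCIe payload for a Thunderbolt 3 port
# PCIe 3.0 x4: 31.5 Gbps - about the same, with half the lanes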

I just googled it, and it looks like Asus is coming up with a PCIe 3.0 x4 Thunderbolt card:

http://www.kitguru.net/peripherals/anton-shilov/asustek-preps-add-in-card-with-thunderbolt-3usb-type-c-ports/

I can't use that one, though, since I don't have PCIe 3.0 on the motherboard; I would need a PCIe 2.0 x8 card.

I don't really see a compelling reason for Thunderbolt, though. What actual peripherals currently benefit from this much throughput? The only thing I could see wanting is faster home networking - I have been running gigabit Ethernet for a decade, and it's just not fast enough, IMO, especially not for multi-terabyte backups over the network. I could certainly use 10-gigabit wired networking, but 10GigE doesn't seem to be featured on any consumer motherboard that I have seen. Now, that would be a compelling reason for me to upgrade.
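(The backup arithmetic is what sells it for me. A rough estimate, assuming you sustain about 70% of line rate:)

# Rough network transfer time: size in TB, link speed in Gbps.
def transfer_hours(tb, link_gbps, efficiency=0.7):
    bits = tb * 8e12                              # terabytes -> bits
    return bits / (link_gbps * 1e9 * efficiency) / 3600.0

print("%.1f h" % transfer_hours(4, 1))    # 4 TB over GigE:   ~12.7 h
print("%.1f h" % transfer_hours(4, 10))   # 4 TB over 10GigE: ~1.3 h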

Of course I would need a handful of 10GigE switches, too.

Still, again, I can buy PCIe 10GigE add-on cards today, without requiring a new motherboard or CPU.

Quote: New CPU = new MB + new RAM and maybe a new GPU = $$$. That's pretty much the path we're on if you're looking for top performance.


I'm not looking for the very top performance - I don't usually buy the very latest technology, because it tends to be too pricey and not debugged yet. The generation before that tends to be a better value. That would point to buying a Broadwell now that Skylake is out, I guess? But it seems there is only a 4-core Broadwell, and no upgrade path to future chips at all, so that makes no sense.

I'm just looking for some sort of upgrade path at a reasonable price that isn't a complete dead end in terms of future upgradability, and there is no obvious one. It seems to me Intel could have made some of their chips pin-compatible if they had chosen to, at least for all those that still supported DDR3. All those sockets differ by just 5 pins, so it certainly seems like they were intentionally made incompatible. All I'm saying is that it's possible to design things with some level of upgradeability in mind, even if the newest chips might not perform at their best in the old motherboards (not the fastest possible bus throughput, fewer PCIe lanes, etc.). But Intel clearly didn't care to try. I guess it's not a big enough market for them.

AMD, unfortunately, hasn't come up with anything new at all on the desktop in the past 3 years, other than lower-end APUs, which really don't matter for PowerDirector. Very disappointing. But the Intel route is very unappealing.

Perhaps I should start looking at server motherboards with a bunch of CPU sockets and relatively cheap CPUs in them, to ramp up the total number of cores. Not the most power-efficient in terms of performance per watt, but possibly the best performance per dollar.