UPDATE (and a bit of demystifying, as well):

Finally got my Nvidia Quadro K4000 (Kepler) locked and loaded. Installed, booted and configured in a breeze. Fired it up, tuned it a bit in Nvidia's Control Panel, and this is what I got:


1. CPU-based encoding of [DV-AVI (25Mbps)] => [H.264 (AVCHD) pcm audio] 2mins-clip coded in ~10secs. (same)
2. GPU-based encoding of [DV-AVI (25Mbps)] => [H.264 (AVCHD) pcm audio] 2mins-clip coded in ~9secs. (down from 13!)
3. CPU-based encoding of [1080i QAM capture] => [H.264 (AVCHD) Dolby5.1] 2mins-clip coded in ~40secs. (down from 60secs!)
4. GPU-based encoding of [1080i QAM capture] => [H.264 (AVCHD) Dolby5.1] 2mins-clip coded in ~39secs. (down from 75secs!)
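To put the timings above in perspective, here is a rough Python sketch (numbers copied straight from the list; purely illustrative, it does not measure anything) that converts each clip time into a speedup, a real-time factor and a projected render time per hour of footage:

# Illustrative arithmetic only: the timings below are the ones listed above.
CLIP_SECONDS = 120  # length of the 2-minute test clip

def summarize(label, old_s, new_s):
    speedup = old_s / new_s                 # how much faster the K4000 run is
    realtime_factor = CLIP_SECONDS / new_s  # e.g. 3.1x = 3.1 min of footage per min of rendering
    mins_per_hour = 60 / realtime_factor    # projected render minutes per 60 min of footage
    print(f"{label}: {speedup:.2f}x faster, {realtime_factor:.1f}x real time, "
          f"~{mins_per_hour:.0f} min per hour of footage")

summarize("GPU SD (DV-AVI -> AVCHD)", old_s=13, new_s=9)
summarize("CPU HD (1080i -> AVCHD)",  old_s=60, new_s=40)
summarize("GPU HD (1080i -> AVCHD)",  old_s=75, new_s=39)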

NOTES:

1. GPU-based encoding fires up TWELVE (12) threads, with mid-to-low total CPU usage (~35%).
2. CPU-based encoding fires up TWENTY-FOUR (24) threads, with mid-to-high total CPU usage (~65%). (A quick way to sample these numbers yourself is sketched right after these notes.)
3. Cinebench GPU (OpenGL) scores virtually DOUBLED with the K4000.
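For anyone who wants to reproduce notes 1 and 2 on their own box, here is a minimal Python sketch using psutil (pip install psutil). The process name "PDR12.exe" is an assumption; substitute whatever Task Manager shows for PowerDirector while it is producing:

import time
import psutil

TARGET = "PDR12.exe"  # hypothetical process name; adjust to your install

candidates = [p for p in psutil.process_iter(["name"]) if p.info["name"] == TARGET]
if not candidates:
    raise SystemExit(f"{TARGET} is not running")

proc = candidates[0]
proc.cpu_percent(interval=None)  # prime the per-process counter
for _ in range(30):              # roughly one minute of sampling
    time.sleep(2)
    # Per-process cpu_percent() is relative to one core, so divide by the logical core count.
    total_cpu = proc.cpu_percent(interval=None) / psutil.cpu_count()
    print(f"threads={proc.num_threads():2d}  total CPU ~{total_cpu:.0f}%")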

BOTTOM-LINE:

1. The Nvidia Quadro K4000 delivers quite a boost in performance.
2. The K4000 substantially *reduced* CUDA encoding times, for both SD and HD clips.
3. The K4000 also helped improve CPU-based encoding times (?). This was not expected, and it suggests some work may still be handed to the Nvidia card even when "HW-accel" is unchecked (e.g. the CPU may still be waiting on the GPU somewhere along the pipeline).
4. Based on the thread usage and CPU load observed, it seems there is STILL room for improvement if the card were even faster (the CPU is not yet fully used).
5. PowerDirector DOES NOT yet seem to be using the K4000's NVENC encoder. I do wonder how much extra improvement that could bring.

Just sharing some interesting findings; there is still room for improvement!
Thanks!

In my case, only TWELVE (12) THREADS fire up, instead of the TWENTY-FOUR (24) available. This equates to 28%-30% CPU usage at best, because the twelve threads that do fire are not saturated; they are not exhausting the available processing bandwidth.

In contrast, CineBench fires up all 24 available "cylinders," and it is quite a scene to watch those twin Xeons chewing through the benchmark, with both NUMA nodes reporting FULL steam (95% to 100% usage).

Therefore, in my setup (with twin X5660s), PD12 is not using all the available firepower. Now, this does not necessarily mean that it should (some types of task require specific multi-core optimization, and sometimes using fewer cores or threads yields better performance, as I have seen in other similar apps).
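If anyone wants to check whether both NUMA nodes light up during a render on their own machine, here is a small Python/psutil sketch (assumes psutil is installed; run it while PD12 is producing a clip):

import psutil

# Sample per-logical-processor utilization over a 5-second window.
per_cpu = psutil.cpu_percent(interval=5.0, percpu=True)
busy = sum(1 for pct in per_cpu if pct > 50)
print(f"logical processors: {len(per_cpu)}, busy (>50%): {busy}")
for i, pct in enumerate(per_cpu):
    print(f"  CPU {i:2d}: {pct:5.1f}%")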

This is a question for PD12's development team to answer (maybe my processors are not actually supported?)
...Well, this issue may turn out to be a bit more elusive, though.

It could be the combined result of PD12 + GFX drivers + Windows components (hard to tell, at this point).

Attached are snapshots of what the file properties (right-click) look like in Windows, so anyone can see how they come out.

The fact that my twin PS3s fail to play the CPU-based version (the one showing incomplete info in the Windows properties) makes me think that something is actually missing from that output clip (the GPU-based version plays just fine).

FORGOT: besides the video-profile shown above, the AUDIO is LinearPCM, 1500+ kbps, at full resolution (as available in PD12).

A quick update here:

Simply cannot find the info I need, so I ordered a Quadro K4000, which is expected to arrive pretty soon.

Will be installing, tuning and testing, and will come back with some results.

I CANNOT, however, yet determine why/how PD12 is only using 12 threads instead of the 24 available (it may be a purposeful multi-core optimization driven by the law of diminishing returns and other testing, for instance, but I have to say I do not see all cylinders firing up).

Stay tuned!
Folks:

THANKS for the replies and ideas.

I have opened a case, and provided requested info. to PD12's support folks (they seem pretty responsive, so far).

To replicate this issue (at least on my workstation / setup), you will not need any special info, tools, tricks, etc. Just take a 2-min .AVI (SD-DV) clip, create a custom profile (AVCHD, 4 Mbps, Quality=6, Deblocking, High Profile, CABAC, 720x480i, 29.97 fps), and then produce it WITH HW-assisted acceleration and WITHOUT it.

Then open the Windows properties of each version and notice how the frames/sec info is missing from the CPU-based version (that one WILL NOT play on the PS3).
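For a cross-check outside of Windows Explorer, here is a small Python sketch that dumps the video-stream info of both outputs with ffprobe (part of FFmpeg, assumed to be on the PATH); the file names are placeholders:

import json
import subprocess

FILES = ["clip_hw_accel.m2ts", "clip_cpu_only.m2ts"]  # placeholder names

for path in FILES:
    # Ask ffprobe for the first video stream's codec, frame rates and field order.
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=codec_name,r_frame_rate,avg_frame_rate,field_order",
         "-of", "json", path],
        capture_output=True, text=True, check=True)
    stream = json.loads(result.stdout)["streams"][0]
    print(path, stream)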

Seems like a minor problem (in the grand scheme of things), but... it renders the files unusable on my home theater's jack-of-all-trades PS3 media player / scaler.

Cheers!
Howdy!

It seems PD12's .m2ts files produced WITHOUT HW-acceleration enabled have MISSING meta-data (e.g. frame rate), whereas PD12's .m2ts files produced WITH HW-acceleration DO carry such information (!).

The sample clip is a 2-min NTSC DV-AVI (720x480i) captured from a Sony DSR-11 pro mini-deck. In PD12, I used a custom H.264 AVCHD profile (4 Mbps, High Profile, max quality, de-blocking, Linear PCM audio).

I noticed the problem when trying to play the files on the PS3: the .m2ts files with complete meta-data played PERFECTLY, while the ones with missing data DID NOT. The missing fields can be verified right in Windows (though I am not sure what else is missing besides the frame rate).

Hope this can be fixed (should be easy).

Howdy!

I've been with PD10 for some time and have now upgraded to PD12 (so far, it seems pretty good). This is my first post, so (please) be gentle...

Currently running an HP Z800 workstation with twin Xeon X5660s (2 NUMA nodes), 24 GB of ECC/buffered RAM and an Nvidia FX3800, with the freshest drivers I could find in the last 48 hours (to put it simply).

Some preliminary tests:

1. CPU-based encoding of [DV-AVI (25Mbps)] => [H.264 (AVCHD) pcm audio] takes 4.5-6.0 minutes per 60 minutes of footage.
2. GPU-based encoding of [DV-AVI (25Mbps)] => [H.264 (AVCHD) pcm audio] takes 6.0-7.5 minutes per 60 minutes of footage.
3. CPU-based encoding of [1080i QAM capture] => [H.264 (AVCHD) Dolby5.1] takes ~30 minutes per 60 minutes of capture.
4. GPU-based encoding of [1080i QAM capture] => [H.264 (AVCHD) Dolby5.1] takes ~35 minutes per 60 minutes of capture.
(Still working on the 1080i testing, so take #3 and #4 with a grain of salt.)

Now, some questions:

1. PD12's CPU-based encoding above DOES NOT use all 24 logical cores. Only 12 cores seem active. Why? Is this a purposeful multi-core optimization? Is the FX3800 still being used for something, even when HW encoding is unchecked in the Production mode / tab?

2. Will upgrading to a Quadro K4000 (Kepler) yield better H.264 performance (*CUDA*-based)? In other words, would going from 192 to 768 CUDA cores end up improving H.264 ENCODING performance? Or is there no gain to be expected for this task?

3. When will the K4000's NVENC hardware encoder be supported in PD12?

MANY thanks, in advance, for your responses!
