CyberLink Community Forum
where the experts meet
Hardware Accelerated GPU Scheduling
[Post New]
In the past I was baffled by a bottleneck in the PD encoding process that didn't seem to be related to the CPU, GPU, memory or drive. All of them were running below 100% (so none was maxed out), as if some hidden latency was holding the system back.

Well... It seems that with update version 2004, the WDDM GPU Scheduler has finally got what was promised when Windows 10 was introduced:

Hardware-accelerated GPU scheduling
With Windows 10 May 2020 update, we are introducing a new GPU scheduler as a user opt-in, but off by default option. With the right hardware and drivers, Windows can now offload most of GPU scheduling to a dedicated GPU-based scheduling processor.

The new GPU scheduler will be supported on recent GPUs that have the necessary hardware, combined with a WDDMv2.7 driver that exposes this support to Windows. Please watch for announcements from our hardware vendor partners on specific GPU generations and driver versions this support will be enabled for.

https://devblogs.microsoft.com/directx/hardware-accelerated-gpu-scheduling





NVIDIA already has this implemented in their latest Studio drivers:



I am curious to see whether CyberLink PowerDirector will take advantage of this new feature.
[Attachment: nvidia driver.JPG (130 Kbytes)]

This message was edited 9 times. Last update was at Jul 06. 2020 14:47

Maliek [Avatar]
Senior Contributor Location: San Antonio, Texas USA Joined: Nov 10, 2012 12:01 Messages: 851 Offline
[Post New]
Thank you for sharing. Have you tried it yet? I'll be giving it a go later today.

Subscribe to PowerDirector University on YouTube.

Subscribe to PDU Mobile on YouTube.
[Post New]
I have enabled it, but I haven't had time to run any tests yet.
I am planning to put 10 instances of the skateboard video on the PD timeline and encode them to 4K H.265 with hardware acceleration (HA), once with the hardware-accelerated scheduler setting enabled and once without.

I will monitor CPU and GPU usage in Windows Task Manager > Resource Monitor.
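
For an on/off comparison like this, it can help to confirm which state the setting is really in before each run. The following is only a sketch: the registry path, the value name HwSchMode and the meaning of its values (2 = on, 1 = off) are assumptions based on commonly reported behaviour, not something stated in this thread or confirmed here, and the helper names are hypothetical.

```python
# Sketch only: the registry location and the meaning of its values are
# assumptions (commonly reported, not confirmed in this thread).
HAGS_KEY = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"  # assumed path
HAGS_VALUE = "HwSchMode"                                        # assumed value name


def describe_hwschmode(value):
    """Translate the raw DWORD into a label (assumed: 2 = on, 1 = off)."""
    if value is None:
        return "not set (setting unavailable or never changed)"
    return {1: "off", 2: "on"}.get(value, f"unknown ({value})")


def read_hwschmode():
    """Read the DWORD from the registry; return None if it is absent."""
    import winreg  # Windows-only standard-library module, imported lazily

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, HAGS_KEY) as key:
            value, _value_type = winreg.QueryValueEx(key, HAGS_VALUE)
            return value
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    import sys

    if sys.platform == "win32":
        state = describe_hwschmode(read_hwschmode())
        print("Hardware-accelerated GPU scheduling:", state)
    else:
        print("This check only applies to Windows.")
```

Noting the reported state next to each encode time makes it easier to trust a "no difference" result, since a change to this setting may not take effect until after a restart.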
tomasc [Avatar]
Senior Contributor Joined: Aug 25, 2011 12:33 Messages: 6464 Offline
[Post New]
It looks like I won’t be getting version 2004 (the May 2020 Update) to try out. I am still on version 1903. Version 1909, which I don’t have, is essentially the cumulative updates already in version 1903, according to this article: https://docs.microsoft.com/en-us/windows/whats-new/whats-new-windows-10-version-2004 . See the attached screenshot.
[Attachment: 200706 Windows Update.jpg (119 Kbytes). Description: Only some win 10 pc qualify for the major updates.]
[Post New]
I had to force my update from 1909 to 2004:

https://betanews.com/2020/05/29/how-to-force-download-windows-10-may-2020-update/

The update was being held back because of my older Creative SB ZxR sound card, and I had to fix that after the 2004 update with a small "hack".

This message was edited 1 time. Last update was at Jul 06. 2020 14:44

[Post New]
With the hardware-accelerated GPU scheduler turned "ON", the job finished in 1:35 min (H.265, 4K 3840x2160/30p, 37 Mbps).
The result was the same with the setting "OFF", so this is not used in PD. Note that this feature requires that apps use DX12.

The NVENC on my 1080 ran at 99% (GPU cores at 6-10%, CPU at 16%), so the ASIC encoder core was the bottleneck. That wasn't my experience before: NVENC was usually at around 65%.
So I don't know what to credit for that. Is Windows version 2004 better optimized? Is it the NVIDIA drivers?

[Attachment: Capture.JPG (558 Kbytes)]

This message was edited 5 times. Last update was at Jul 06. 2020 15:37

[Post New]
And a video testing this: https://youtu.be/wlrWDb1pKXg
optodata
Senior Contributor Location: California, USA Joined: Sep 16, 2011 16:04 Messages: 8630 Offline
[Post New]
I just ran some tests based on your HEVC 4K 10x skateboard project, but I also tested my UHD Graphics 630 and also produced to HD. I ran PD's GPU Optimization tool twice after enabling GPU Scheduling - once with the iGPU available and again after selecting the RTX 2070.

I saw no difference in either of the HEVC 4K profiles, but there was a slight improvement when producing to HEVC HD with both kinds of hardware:

Output  GPU       Scheduling  Time (min:sec)
4K      RTX 2070  off         2:01
4K      RTX 2070  on          2:00
4K      UHD 630   off         3:44
4K      UHD 630   on          3:43
HD      RTX 2070  off         0:32
HD      RTX 2070  on          0:27
HD      UHD 630   off         1:14
HD      UHD 630   on          1:11

I've also run the same tests with AVC, but there really isn't any difference other than a slight penalty for enabling GPU Scheduling:

Output  GPU       Scheduling  Time (min:sec)
4K      RTX 2070  off         1:03
4K      RTX 2070  on          1:04
4K      UHD 630   off         1:51
4K      UHD 630   on          1:53
HD      RTX 2070  off         0:27
HD      RTX 2070  on          0:28
HD      UHD 630   off         0:36
HD      UHD 630   on          0:37

I have screenshots and the project I used in this OneDrive folder if anyone's interested. All HEVC runs were GPU-bound (98%-100% usage) with the CPU around 30-40%. None of the AVC tests got above 85% GPU usage while the CPU was between 35-70%.
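
To put the two tables above on a common scale, the times can be converted to seconds and expressed as a percentage change. This is a small sketch that assumes the times are in min:sec; the function names are made up for illustration, and the numbers are copied straight from the tables.

```python
# Timings copied from the tables above: (codec, output, GPU) -> (off, on).
# Assumption: every time is written as min:sec.
TIMES = {
    ("HEVC", "4K", "RTX 2070"): ("2:01", "2:00"),
    ("HEVC", "4K", "UHD 630"): ("3:44", "3:43"),
    ("HEVC", "HD", "RTX 2070"): ("0:32", "0:27"),
    ("HEVC", "HD", "UHD 630"): ("1:14", "1:11"),
    ("AVC", "4K", "RTX 2070"): ("1:03", "1:04"),
    ("AVC", "4K", "UHD 630"): ("1:51", "1:53"),
    ("AVC", "HD", "RTX 2070"): ("0:27", "0:28"),
    ("AVC", "HD", "UHD 630"): ("0:36", "0:37"),
}


def to_seconds(text):
    """Convert a 'min:sec' string into whole seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


def percent_change(off, on):
    """Change in produce time with GPU Scheduling on; negative means faster."""
    off_s, on_s = to_seconds(off), to_seconds(on)
    return (on_s - off_s) / off_s * 100.0


for (codec, output, gpu), (off, on) in TIMES.items():
    change = percent_change(off, on)
    print(f"{codec:4} {output:2} {gpu:8} {off} -> {on}  {change:+5.1f}%")
```

On these figures, HEVC HD on the RTX 2070 takes roughly 15.6% less time with the setting on, while every other row moves by only a few percent. The times are given to the nearest second, so a one-second difference on a 27-second run is already close to 4%.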

Maybe GPU Scheduling is more tightly bound to CUDA cores and other game-centric features rather than to NVENC and QuickSync, and maybe it will help speed up some other edits that use GPU-enhanced FX or the AI impressions.

This message was edited 2 times. Last update was at Jul 07. 2020 15:56



YouTube/optodata


DS365 | Win11 Pro | Ryzen 9 3950X | RTX 4070 Ti | 32GB RAM | 10TB SSDs | 5K+4K HDR monitors

Canon Vixia GX10 (4K 60p) | HF G30 (HD 60p) | Yi Action+ 4K | 360Fly 4K 360°
[Post New]
Thanks for the testing.
Yep, for now the improvements are minimal to none, and even MS says not to expect miracles.
This is like a building block for future improvements.
pmikep [Avatar]
Senior Member Joined: Nov 26, 2016 22:51 Messages: 285 Offline
[Post New]
Thanks, guys, for testing this and for your results. Saved me the hassle of updating Windows ... for now.
optodata
Senior Contributor Location: California, USA Joined: Sep 16, 2011 16:04 Messages: 8630 Offline
[Post New]
I added AVC testing to my previous post (there's a slight penalty for using GPU Scheduling in all profiles), and for fun I tried out the AI Impressionist effect with CUDA as described in this post.

OpenVINO isn't supported so there was no change in transformation times or iGPU usage, but the RTX usage jumped from 24% to 99% with GPU scheduling on, and the transform time dropped from 0:26 to 0:20 for the 15 sec skateboard clip.


As far as my limited testing is concerned, it seems that systems with supported NVIDIA hardware will see the most benefit from GPU Scheduling when using the AI packs (and CUDA processing), followed by a small improvement when producing with NVENC to HEVC HD, and very little change when producing to HEVC 4K or to any AVC profile.

This message was edited 1 time. Last update was at Jul 07. 2020 17:09



[Post New]
Quote
OpenVINO isn't supported so there was no change in transformation times or iGPU usage, but the RTX usage jumped from 24% to 99% with GPU scheduling on, and the transform time dropped from 0:26 to 0:20 for the 15 sec skateboard clip.




That's great! I know that this is a niche usage (CUDA cores used to do AI) but in the future this can be used for many things that would be prohibitive time-wise today...

This message was edited 1 time. Last update was at Jul 08. 2020 10:20

Maliek [Avatar]
Senior Contributor Location: San Antonio, Texas USA Joined: Nov 10, 2012 12:01 Messages: 851 Offline
[Post New]
Thank you for conducting these tests. Great to see the (minor and major) improvements. Can't wait to see future enhancements from Windows and graphics card manufacturers to take full advantage of this feature.

This message was edited 1 time. Last update was at Jul 11. 2020 11:33
